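What does this have to do with paid software at all? I've used pymupdf before. It can pull the images straight out of the PDF and the logic is simple; the few lines below are enough to extract them.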
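    import os
    import re

    import fitz  # PyMuPDF


    def pdf2Images(src_path: str, dst_path: str) -> bool:
        """pdf2pic: extract the images embedded in a PDF and save them as PNG files.

        :param src_path: path of the PDF to convert
        :param dst_path: directory where the extracted images are saved
        :return bool: whether the file at the given path was read and processed successfully
        """
        try:
            with fitz.open(src_path) as doc:
                imgcount = 0
                lenXREF = doc.xref_length()

                # Walk every object in the xref table (entry 0 is reserved)
                for i in range(1, lenXREF):
                    text = doc.xref_object(i)
                    # Keep only objects declared as /Type /XObject with /Subtype /Image
                    isXObject = re.search(r"/Type(?= */XObject)", text)
                    isImage = re.search(r"/Subtype(?= */Image)", text)
                    if not isXObject or not isImage:
                        continue
                    imgcount += 1
                    pix = fitz.Pixmap(doc, i)
                    new_name = f"{i}.png"
                    # PNG cannot store CMYK, so convert such pixmaps to RGB first
                    if pix.n - pix.alpha > 3:
                        pix = fitz.Pixmap(fitz.csRGB, pix)
                    pix.save(os.path.join(dst_path, new_name))
                    pix = None  # release the pixmap
            return True
        except Exception:
            # Any failure (bad path, unreadable image, missing output directory) reports False
            return False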
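If you want to sanity-check the filtering step on its own without a PDF at hand, here is a rough sketch using only the standard library. It reuses the two regexes from the function above; the helper name `is_image_xobject` and the three object strings are made up for illustration and are not taken from a real file.

```python
import re

# Same lookahead patterns as in the function above: the key has to be
# followed by the expected name, with optional spaces in between.
XOBJECT_RE = re.compile(r"/Type(?= */XObject)")
IMAGE_RE = re.compile(r"/Subtype(?= */Image)")


def is_image_xobject(obj_text: str) -> bool:
    """Return True if an object definition looks like an image XObject.

    Hypothetical helper: it only wraps the two regex checks from the post.
    """
    return bool(XOBJECT_RE.search(obj_text)) and bool(IMAGE_RE.search(obj_text))


# Hypothetical object definitions, written by hand for this example
image_obj = "<< /Type /XObject /Subtype /Image /Width 640 /Height 480 >>"
form_obj = "<< /Type /XObject /Subtype /Form /BBox [0 0 100 100] >>"
font_obj = "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>"

for name, obj in [("image", image_obj), ("form", form_obj), ("font", font_obj)]:
    print(name, is_image_xobject(obj))
```

Only the first string passes both checks, so only that kind of object would be handed to `fitz.Pixmap`. One caveat: the filter requires an explicit `/Type /XObject` entry, so an image dictionary that omits `/Type` would be skipped.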