How to create high res JPEG with Wand from binary string
I'm trying to convert some PDFs to high-resolution JPEGs using ImageMagick. I'm working on Windows 10 (64-bit) with Python 3.6.2 (64-bit) and Wand 0.4.4. On the command line I have:
$ /e/ImageMagick-6.9.9-Q16-HDRI/convert.exe -density 400 myfile.pdf -scale 2000x1000 test3.jpg
This works well for me.
In Python:
import os
from wand.image import Image
file_path = os.path.dirname(os.path.abspath(__file__))+os.sep+"myfile.pdf"
with Image(filename=file_path, resolution=400) as image:
    image.save()
    image_jpeg = image.convert('jpeg')
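This gives me a low-resolution JPEG. How do I translate the command above into Wand code that does the same thing?
Edit:
I realized the problem is that the input PDF has to be read into the Image object as a binary string, so based on http://docs.wand-py.org/en/0.4.4/guide/read.html#read-blob I tried: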
with open(file_path,'rb') as f:
    image_binary = f.read()
    f.close()
with Image(blob=image_binary,resolution=400) as img:
    img.transform('2000x1000', '100%')
    img.make_blob('jpeg')
    img.save(filename='out.jpg')
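This reads the file fine, but the output is split into 10 files. Why? I need to get this into 1 high-res JPEG.
Edit:
I need to send the JPEG to an OCR API, so I was wondering whether I could write the output to a file-like object. Looking at https://www.imagemagick.org/api/magick-image.php#MagickWriteImageFile, I tried: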
emptyFile = Image(width=1500, height=2000)
with Image(filename=file_path, resolution=400) as image:
    library.MagickResetIterator(image.wand)
    # Call C-API Append method.
    resource_pointer = library.MagickAppendImages(image.wand,
                                                  True)
    library.MagickWriteImagesFile(resource_pointer,emptyFile)
This gives:
  File "E:/ENVS/r3/pdfminer.six/ocr_space.py", line 113, in <module>
    test_file = ocr_stream(filename='test4.jpg')
  File "E:/ENVS/r3/pdfminer.six/ocr_space.py", line 96, in ocr_stream
    library.MagickWriteImagesFile(resource_pointer,emptyFile)
ctypes.ArgumentError: argument 2: <class 'TypeError'>: wrong type
How can I get this to work?
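How about:
with Image(filename=file_path, resolution=400) as image:
    # transform() and resize() change the image in place and return None,
    # so call them on the image instead of using them in a with statement.
    image.transform('2000x1000', '100%')
    image.compression_quality = 100
    image.save(filename='out.jpg')
Or, in place of the transform() call:
    image.resize(2000, 1000)
Related: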
- https://github.com/dahlia/wand/blob/13c4f544bd271fe298ac8dde44fbf178b349361a/docs/guide/resizecrop.rst
- Python 3 Wand How to make an unanimated gif from multiple PDF pages
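One thing worth checking when comparing the two calls: on the command line, a geometry such as 2000x1000 fits the image inside that box while keeping its aspect ratio, whereas resize(2000, 1000) in Wand forces exactly those dimensions. The sketch below only does the arithmetic. The function names are made up for illustration, and ImageMagick's own rounding may differ by a pixel.

```python
def rendered_size(width_pt, height_pt, dpi):
    """Pixel size of a page given in PDF points (1/72 inch) at a given density."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside the box, preserving aspect ratio."""
    scale = min(max_width / width, max_height / height)
    # Round to the nearest pixel and never return a zero-sized dimension.
    return max(1, int(width * scale + 0.5)), max(1, int(height * scale + 0.5))


if __name__ == "__main__":
    # A US Letter page is 612 x 792 points.
    page = rendered_size(612, 792, 400)
    print(page, fit_within(*page, 2000, 1000))
```

So a Letter-sized page rendered at density 400 is 3400 x 4400 pixels, and fitting it inside 2000x1000 is limited by the height. If the aspect ratio matters, the two numbers from fit_within() are what you would pass to resize().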
Why? I need to get this into 1 high res jpeg.
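A PDF contains pages, which ImageMagick treats as individual images in a "stack". The wand library provides wand.image.Image.sequence to work with each page.
However, to append all the images into a single JPEG, you can either iterate over each page and stitch them together, or call the C-API's MagickAppendImages method.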
from wand.image import Image
from wand.api import library
import ctypes

# Map C-API not provided by wand library.
library.MagickAppendImages.argtypes = [ctypes.c_void_p, ctypes.c_int]
library.MagickAppendImages.restype = ctypes.c_void_p

with Image(filename="path_to_document.pdf", resolution=400) as image:
    # Do all your preprocessing first
    # Either work directly on the wand instance, or iterate over each page.
    # ...
    # To write all "pages" into a single image.
    # Reset the stack iterator.
    library.MagickResetIterator(image.wand)
    # Call C-API Append method.
    resource_pointer = library.MagickAppendImages(image.wand,
                                                  True)
    # Write C resource directly to disk.
    library.MagickWriteImages(resource_pointer,
                              "output.jpeg".encode("ASCII"),
                              False)
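The other option mentioned above, iterating over each page and stitching them together, could be sketched roughly like this without touching the C-API. This is an illustration rather than tested production code: stack_offsets() and append_pages() are hypothetical helper names, the white background is an assumption, and the pages are stacked top to bottom.

```python
def stack_offsets(page_sizes):
    """Return (canvas_width, canvas_height, tops) for pages stacked vertically."""
    width = max(w for w, _ in page_sizes)
    tops, y = [], 0
    for _, h in page_sizes:
        tops.append(y)
        y += h
    return width, y, tops


def append_pages(pdf_path, out_path, resolution=400):
    """Render every PDF page and stitch them into a single JPEG."""
    # Imported here so stack_offsets() stays usable without Wand installed.
    from wand.color import Color
    from wand.image import Image

    with Image(filename=pdf_path, resolution=resolution) as pdf:
        sizes = [(page.width, page.height) for page in pdf.sequence]
        width, height, tops = stack_offsets(sizes)
        with Image(width=width, height=height,
                   background=Color("white")) as canvas:
            for page, top in zip(pdf.sequence, tops):
                # Copy the page into a standalone Image before compositing.
                with Image(image=page) as page_image:
                    canvas.composite(page_image, left=0, top=top)
            canvas.format = "jpeg"
            canvas.save(filename=out_path)
```

Calling append_pages("path_to_document.pdf", "output.jpeg") would then write one tall image. Keep in mind that 10 pages at resolution 400 make a very large canvas, so memory use can be significant.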
Update:
I need to send the jpeg to an OCR api ...
Assuming you're using OpenCV's Python API, you only need to iterate over each page and pass the image file data to the OCR through a numpy buffer.
from wand.image import Image
import numpy
import cv2

def ocr_process(file_data_buffer):
    """ Replace with whatever your OCR-API calls for """
    # imdecode() needs a read flag as its second argument.
    mat_instance = cv2.imdecode(file_data_buffer, cv2.IMREAD_COLOR)
    # ... work ...

source_image = "path_to_document.pdf"
with Image(filename=source_image, resolution=400) as img:
    for page in img.sequence:
        # Copy the page into a standalone Image so it can be encoded as JPEG.
        with Image(image=page) as page_image:
            file_buffer = numpy.asarray(bytearray(page_image.make_blob("JPEG")),
                                        dtype=numpy.uint8)
        ocr_process(file_buffer)
so I was wondering if I could write the output to a file like object
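Don't assume that Python "image" objects (or the underlying C structures) from different libraries are compatible with each other.
Without knowing the OCR API, I can't help you beyond the wand part, but I can suggest one of the following...
Use a temporary intermediate file. (Slower I/O, but easier to learn/develop/debug.)
with Image(filename=INPUT_PATH) as img:
    # work
    img.save(filename=OUTPUT_PATH)
# OCR work on OUTPUT_PATH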
Use a file descriptor, if the OCR API supports it. (Same trade-offs as above.)
with open(INPUT_PATH, 'rb') as fd:
    with Image(file=fd) as img:
        # work
        # OCR work ???
        pass
Use a blob. (Faster I/O, but needs a lot of memory.)
buffer = None
with Image(filename=INPUT_PATH) as img:
    # work
    buffer = img.make_blob(FORMAT)
if buffer:
    # OCR work ???
    pass
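If the OCR client wants a file-like object rather than raw bytes, one possibility is to wrap the blob in io.BytesIO. This is only a sketch: as_file_like() and first_page_as_file() are hypothetical names, only the first page is rendered, and whether your OCR client accepts such an object is an assumption you would need to check.

```python
import io


def as_file_like(blob, name="page.jpg"):
    """Wrap encoded image bytes in an in-memory, file-like object."""
    buffer = io.BytesIO(blob)
    # Some HTTP clients use a .name attribute to pick the upload filename.
    buffer.name = name
    return buffer


def first_page_as_file(pdf_path, resolution=400):
    """Render the first PDF page to JPEG and return it as a file-like object."""
    # Imported here so as_file_like() stays usable without Wand installed.
    from wand.image import Image

    with Image(filename=pdf_path, resolution=resolution) as pdf:
        with Image(image=pdf.sequence[0]) as page:
            return as_file_like(page.make_blob("jpeg"))
```

The returned object supports read() and seek() like a file opened in binary mode, so nothing has to be written to disk.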
More updates
Pulling all the comments together, a solution might be...
from wand.image import Image
from wand.api import library
import ctypes
import requests

# Map C-API not provided by wand library.
library.MagickAppendImages.argtypes = [ctypes.c_void_p, ctypes.c_int]
library.MagickAppendImages.restype = ctypes.c_void_p

with Image(filename='path_to_document.pdf', resolution=400) as image:
    # ... Do pre-processing ...
    # Reset the stack iterator.
    library.MagickResetIterator(image.wand)
    # Call C-API Append method.
    resource_pointer = library.MagickAppendImages(image.wand, True)
    # Convert to JPEG.
    library.MagickSetImageFormat(resource_pointer, b'JPEG')
    # Create size sentinel.
    length = ctypes.c_size_t()
    # Write image blob to memory.
    image_data_pointer = library.MagickGetImagesBlob(resource_pointer,
                                                     ctypes.byref(length))
    # Ensure success
    if image_data_pointer and length.value:
        # Create buffer from memory address
        payload = ctypes.string_at(image_data_pointer, length.value)
        # Define local filename.
        payload_filename = 'my_hires_image.jpg'
        # Post payload as multipart encoded image file with filename.
        requests.post(THE_URL, files={'file': (payload_filename, payload)})