Is it possible to pass PDF bytes straight into PyPDF2 instead of writing a PDF file first?
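I'm on Linux; raw printing to port 9100 returns a "bytes" type. I was wondering whether it is possible to go straight from this into PyPDF2, rather than making a PDF file first and using the PdfFileReader method?
Thank you for your time.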
PyPDF2.PdfFileReader() defines its first parameter as:
stream – A File object or an object that supports the standard read and seek methods similar to a File object. Could also be a string representing a path to a PDF file.
So you can pass any data to it, as long as it can be accessed as a file-like stream. A perfect candidate is io.BytesIO(). Write the raw bytes you received into it, seek back to 0, pass the object to PyPDF2.PdfFileReader(), and you're done.
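The write-then-seek step can be sketched with the standard library alone. This is a minimal sketch, not a definitive implementation: `bytes_to_stream` is a hypothetical helper name, and the `raw` bytes are a placeholder header fragment standing in for whatever arrives on the socket, not a valid PDF.

```python
import io


def bytes_to_stream(raw: bytes) -> io.BytesIO:
    """Wrap raw bytes in a seekable, file-like stream (hypothetical helper)."""
    stream = io.BytesIO()
    stream.write(raw)  # after writing, the position is at the end of the data
    stream.seek(0)     # rewind so a reader starts at the first byte
    return stream


# Placeholder bytes standing in for the data received from the printer port;
# this is only a header fragment, not a complete PDF.
raw = b"%PDF-1.4\n"
stream = bytes_to_stream(raw)

header = stream.read(5)  # peek at the start of the data
stream.seek(0)           # rewind again before handing the stream on

# With real PDF data, the stream could then be passed on, for example:
# reader = PyPDF2.PdfFileReader(stream)
```

Passing the bytes to the constructor, as in io.BytesIO(raw), gives a stream that already starts at position 0, so the explicit seek is only needed after write().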
Yes, as the first comment says. Here is a code example that generates PDF bytes without creating a PDF file:
import io
from typing import List

from PyPDF2 import PdfFileReader, PdfFileWriter


def join_pdf(pdf_chunks: List[bytes]) -> bytes:
    # Create an empty pdf-writer object that collects all pages
    result_pdf = PdfFileWriter()

    # Iterate over all pdf-bytes
    for chunk in pdf_chunks:
        # Read bytes
        chunk_pdf = PdfFileReader(
            stream=io.BytesIO(  # Create stream object
                initial_bytes=chunk
            )
        )
        # Add all pages to our result
        for page in range(chunk_pdf.getNumPages()):
            result_pdf.addPage(chunk_pdf.getPage(page))

    # Write all bytes to a bytes-stream
    response_bytes_stream = io.BytesIO()
    result_pdf.write(response_bytes_stream)
    return response_bytes_stream.getvalue()
A few years later, I added this to the PyPDF2 docs:
from io import BytesIO

from PyPDF2 import PdfFileReader, PdfFileWriter

# Prepare example
with open("example.pdf", "rb") as fh:
    bytes_stream = BytesIO(fh.read())

# Read from bytes_stream
reader = PdfFileReader(bytes_stream)

# Write to bytes_stream
writer = PdfFileWriter()
with BytesIO() as bytes_stream:
    writer.write(bytes_stream)