How can I set the PDF version with PyPDF2?

Question

I'm using PyPDF2 1.4 and Python 2.7:

How can I change the PDF version of a file?

What I tried

my_input_filename.pdf is PDF version 1.5, but _my_output_filename.pdf comes out as a PDF 1.3 file. I want to keep 1.5 in the output:

from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import NameObject, createStringObject

input_filename = 'my_input_filename.pdf'
output_filename = '_my_output_filename.pdf'

# Read input PDF file
inputPDF = PdfFileReader(open(input_filename, 'rb'))
info = inputPDF.documentInfo

# Create output PDF once, so every page ends up in the same file
outputPDF = PdfFileWriter()
# Get the info dictionary of the output PDF
infoDict = outputPDF._info.getObject()
# Update output PDF metadata with input PDF metadata
for key in info:
    infoDict.update({NameObject(key): createStringObject(info[key])})

# Copy all pages from the input PDF to the output PDF
for i in xrange(inputPDF.numPages):
    outputPDF.addPage(inputPDF.getPage(i))

with open(output_filename, 'wb') as outputStream:
    outputPDF.write(outputStream)

Answer 1

The current version of PyPDF2 cannot write a file header other than PDF 1.3; from the official source code:

class PdfFileWriter(object):
    """
    This class supports writing PDF files out, given pages produced by another
    class (typically :class:`PdfFileReader<PdfFileReader>`).
    """
    def __init__(self):
        self._header = b_("%PDF-1.3")
        ...

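Whether that is legitimate is questionable, given that it lets you feed in content from PDF versions newer than 1.3.

If you just want to fix the version string in the header (I don't know what consequences that has, so I'm assuming you know the PDF standard better than I do!):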

from PyPDF2.utils import b_
...
# replace() returns a new object, so the result has to be assigned back
outputPDF._header = outputPDF._header.replace(b_("PDF-1.3"), b_("PDF-1.5"))

or something similar.
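The same idea can also be applied to a file that has already been written. Because "%PDF-1.3" and "%PDF-1.5" have the same byte length, swapping one for the other should leave the byte offsets stored in the cross-reference table unchanged. The sketch below is only an illustration under that assumption: patch_pdf_header is a hypothetical helper, not part of PyPDF2, and it only changes the header line without checking whether the content really conforms to the declared version.

```python
def patch_pdf_header(data, old=b"%PDF-1.3", new=b"%PDF-1.5"):
    """Return a copy of the PDF bytes `data` with its leading header swapped.

    Hypothetical helper for illustration; it is not part of PyPDF2.
    """
    # Equal lengths keep every byte offset in the file where it was.
    if len(old) != len(new):
        raise ValueError("old and new headers must have the same length")
    # Refuse to touch files that do not start with the expected header.
    if not data.startswith(old):
        raise ValueError("unexpected header: %r" % data[:len(old)])
    return new + data[len(old):]
```

To use it, read the output file in 'rb' mode, pass its contents to patch_pdf_header, and write the returned bytes to a new file in 'wb' mode, keeping the original as a backup.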

Answer 2

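Adding to Marcus's answer above:

(At present; I can't say when Marcus wrote his post) nothing stops you from using the standard PyPDF2 addMetadata function to specify the version in the metadata. The example below uses PdfFileMerger (since I've recently been cleaning up PDF metadata on existing files), but PdfFileWriter has the same function: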

from PyPDF2 import PdfFileMerger

# Define file input/output, and metadata containing version string.
# Using separate input/output files, since it's worth keeping a copy of the originals!
fileIn = 'foo.pdf'
fileOut = 'bar.pdf'
metadata = {
    u'/Version': 'PDF-1.5'
}

# Set up PDF file merger, copy existing file contents into merger object.
merger = PdfFileMerger()

with open(fileIn, 'rb') as fh_in:
    merger.append(fh_in)

# Append metadata to PDF content in merger.
merger.addMetadata(metadata)

# Write new PDF file with appended metadata to output
# CAUTION: This will overwrite any existing files without prompt!
with open(fileOut, 'wb') as fh_out:
    merger.write(fh_out)
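
Since addMetadata writes into the document information dictionary while the header line comes from the writer's _header attribute, it may be worth checking which version the output file actually declares in its header. A minimal sketch using only the standard library follows; read_pdf_header_version is a hypothetical helper name, and it assumes the "%PDF-x.y" marker appears within the first 1024 bytes of the file.

```python
import re

# Matches the header comment at the start of a PDF, e.g. b"%PDF-1.5"
_HEADER_RE = re.compile(br"%PDF-(\d+\.\d+)")


def read_pdf_header_version(path):
    """Return the version from the %PDF-x.y header as text, or None if absent.

    Hypothetical helper for illustration; it is not part of PyPDF2.
    """
    with open(path, 'rb') as fh:
        head = fh.read(1024)  # assumption: the header sits in the first 1024 bytes
    match = _HEADER_RE.search(head)
    return match.group(1).decode('ascii') if match else None
```

Calling read_pdf_header_version on both 'foo.pdf' and 'bar.pdf' shows whether the header version survived the round trip.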
