Update a fillable PDF using PyPDF2

I'm having trouble updating the named fields in a fillable PDF. My code is as follows:

from PyPDF2 import PdfFileReader, PdfFileWriter

reader = PdfFileReader("invoice_template.pdf")
page = reader.getPage(0)

data_dict = {
    "business_name_1": "Consulting",
    "customer_name": "company.io",
    "customer_email": "example@icloud.com",
}

writer = PdfFileWriter()
writer.updatePageFormFieldValues(page, fields=data_dict)
writer.addPage(page)

with open("newfile.pdf", "wb") as fh:
    writer.write(fh)

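I checked the field dictionary with reader.getFormTextFields() before and after calling updatePageFormFieldValues(), and the values do get updated. However, the generated PDF shows none of the field values. I'm not sure what I'm doing wrong. The PDF I'm using can be found here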
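The before/after check described above can be made explicit with a small helper. This is only a sketch: changed_fields is a hypothetical name, not part of PyPDF2, and it assumes getFormTextFields() returns a plain mapping of field name to value. A non-empty result would suggest the values are being stored, so the problem is more likely in how they are displayed.

```python
from typing import Any, Dict, Tuple


def changed_fields(
    before: Dict[str, Any], after: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field_name: (old_value, new_value)} for every field whose value differs."""
    names = set(before) | set(after)
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(names)
        if before.get(name) != after.get(name)
    }


# Intended use with the code from the question (not executed here):
#   before = dict(reader.getFormTextFields())
#   writer.updatePageFormFieldValues(page, fields=data_dict)
#   after = dict(reader.getFormTextFields())
#   print(changed_fields(before, after))
```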

The problem was solved by setting the PDF's NeedAppearances value to True. This can be done with a function:

from PyPDF2 import PdfFileWriter
from PyPDF2.generic import BooleanObject, DictionaryObject, NameObject


def set_need_appearances_writer(writer: PdfFileWriter):
    # See 12.7.2 and 7.7.2 for more information: http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
    try:
        catalog = writer._root_object
        # Create an empty AcroForm dictionary if the catalog does not have one yet
        if "/AcroForm" not in catalog:
            catalog[NameObject("/AcroForm")] = writer._addObject(DictionaryObject())

        # Ask the viewer to regenerate field appearances from the stored values
        need_appearances = NameObject("/NeedAppearances")
        catalog["/AcroForm"][need_appearances] = BooleanObject(True)
        # del writer._root_object["/AcroForm"]['NeedAppearances']
        return writer

    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer

Then you just need to add the line set_need_appearances_writer(writer) after the line writer = PdfFileWriter(), and the form will be updated!

You can find more information here: https://github.com/mstamy2/PyPDF2/issues/355

Fixed code

from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import BooleanObject, DictionaryObject, NameObject


def set_need_appearances_writer(writer: PdfFileWriter):
    # See 12.7.2 and 7.7.2 for more information: http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
    try:
        catalog = writer._root_object
        # Create an empty AcroForm dictionary if the catalog does not have one yet
        if "/AcroForm" not in catalog:
            catalog[NameObject("/AcroForm")] = writer._addObject(DictionaryObject())

        # Ask the viewer to regenerate field appearances from the stored values
        need_appearances = NameObject("/NeedAppearances")
        catalog["/AcroForm"][need_appearances] = BooleanObject(True)
        # del writer._root_object["/AcroForm"]['NeedAppearances']
        return writer

    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer


myfile = PdfFileReader("invoice_template.pdf")
first_page = myfile.getPage(0)

writer = PdfFileWriter()
set_need_appearances_writer(writer)

data_dict = {
    "business_name_1": "Consulting",
    "customer_name": "company.io",
    "customer_email": "example@icloud.com",
}

writer.updatePageFormFieldValues(first_page, fields=data_dict)
writer.addPage(first_page)

with open("newfile.pdf", "wb") as fh:
    writer.write(fh)
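
If the fields still look empty, one rough way to confirm that the flag actually reached the output file is to scan its raw bytes. This is a heuristic sketch, not a PyPDF2 feature: has_need_appearances is a hypothetical helper, and it assumes the file stores its dictionaries uncompressed, so it can miss the flag in files that pack objects into compressed object streams.

```python
import re
from pathlib import Path

# Matches the key followed by the PDF boolean literal "true".
_NEED_APPEARANCES = re.compile(rb"/NeedAppearances\s+true")


def has_need_appearances(pdf_path: str) -> bool:
    """Heuristic: True if the raw file bytes contain '/NeedAppearances true'."""
    data = Path(pdf_path).read_bytes()
    return _NEED_APPEARANCES.search(data) is not None


# Intended use after running the fixed code above (not executed here):
#   print(has_need_appearances("newfile.pdf"))
```

A False result on a file that opens fine usually just means the flag was never written, which points back to whether set_need_appearances_writer(writer) was called before writer.write().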