Error when using PyPDF2 PdfFileMerger in Python

I have been using PyPDF2 to write a Python program that merges multiple PDF files.

Here is the code:

import os
from PyPDF2 import PdfFileMerger

source_dir = os.getcwd()

merger = PdfFileMerger()

for item in os.listdir(source_dir):
    if item.endswith('pdf'):
        merger.append(item)

merger.write('completed_file.pdf')
merger.close()

When I run the code, I get the following error:

Traceback (most recent call last):
  File "F:\Python folder\Pdf_Merger\main.py", line 10, in <module>
    merger.append(item)
  File "F:\Python folder\Pdf_Merger\venv\lib\site-packages\PyPDF2\merger.py", line 203, in append
    self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
  File "F:\Python folder\Pdf_Merger\venv\lib\site-packages\PyPDF2\merger.py", line 151, in merge
    outline = pdfr.getOutlines()
  File "F:\Python folder\Pdf_Merger\venv\lib\site-packages\PyPDF2\pdf.py", line 1362, in getOutlines
    outline = self._buildOutline(node)
  File "F:\Python folder\Pdf_Merger\venv\lib\site-packages\PyPDF2\pdf.py", line 1444, in _buildOutline
    outline = self._buildDestination(title, dest)
  File "F:\Python folder\Pdf_Merger\venv\lib\site-packages\PyPDF2\pdf.py", line 1425, in _buildDestination
    return Destination(title, page, typ, *array)
  File "F:\Python folder\Pdf_Merger\venv\lib\site-packages\PyPDF2\generic.py", line 1065, in __init__
    raise utils.PdfReadError("Unknown Destination Type: %r" % typ)
PyPDF2.utils.PdfReadError: Unknown Destination Type: 0

Process finished with exit code 1

Note: I made sure that none of the PDF files are password-protected.

This appears to be caused by incorrect destination syntax in the outline of one of the PDFs you are trying to merge.
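If it helps to know which file is the culprit, each PDF can be probed on its own. The following is only a rough sketch: `find_failing_files` and its parameters are hypothetical names, not part of PyPDF2, and the `read_outline` callable is supplied by the caller so the helper itself has no dependencies.

```python
def find_failing_files(paths, read_outline, error_type=Exception):
    """Return (path, message) pairs for files whose outline cannot be read.

    Hypothetical helper, not part of PyPDF2. `read_outline` is any callable
    that takes a path and raises `error_type` when the outline is broken.
    """
    failing = []
    for path in paths:
        try:
            read_outline(path)
        except error_type as exc:
            # Record the file and the error text, then keep checking the rest.
            failing.append((path, str(exc)))
    return failing
```

With the PyPDF2 version shown in the traceback, `read_outline` could plausibly be `lambda p: PdfFileReader(p).getOutlines()` and `error_type` could be `PyPDF2.utils.PdfReadError`, since `getOutlines` is the call that fails in the merge.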

If you don't care about the outlines, you should be able to work around this by setting the import_bookmarks kwarg to False in the call to PdfFileMerger.append, like this:

import os
from PyPDF2 import PdfFileMerger

source_dir = os.getcwd()

merger = PdfFileMerger()

for item in os.listdir(source_dir):
    if item.endswith('pdf'):
        merger.append(item, import_bookmarks=False)

merger.write('completed_file.pdf')
merger.close()
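If you would rather keep the outlines of the files that do parse correctly, one possible variation is to try each file with bookmarks first and retry without them only when that fails. This is a sketch under assumptions: `append_with_fallback` is a hypothetical helper, and it assumes a failed `append` adds no pages to the merger, which I have not verified.

```python
def append_with_fallback(merger, path, error_type=Exception):
    """Append `path` to `merger`, keeping bookmarks when possible.

    Hypothetical helper, not part of PyPDF2. Returns True if the outline
    was imported, False if the append had to be retried without it.
    """
    try:
        merger.append(path, import_bookmarks=True)
        return True
    except error_type:
        # Assumption: the failed call left no pages in the merger, so
        # retrying the same file does not duplicate content.
        merger.append(path, import_bookmarks=False)
        return False
```

Inside the loop above, the call would become `append_with_fallback(merger, item, PyPDF2.utils.PdfReadError)` (with `import PyPDF2.utils` at the top), using the exception type shown in the traceback.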

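More details

PdfFileMerger.append calls PdfFileMerger.merge and passes the import_bookmarks kwarg through to it. It defaults to True.

In PyPDF2.generic, the Destination class raises this error during initialization. The merge is trying to build destinations into the new outline by reading them from the original outline.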

def __init__(self, title, page, typ, *args):
    DictionaryObject.__init__(self)
    self[NameObject("/Title")] = title
    self[NameObject("/Page")] = page
    self[NameObject("/Type")] = typ

    # from table 8.2 of the PDF 1.7 reference.
    if typ == "/XYZ":
        (self[NameObject("/Left")], self[NameObject("/Top")],
            self[NameObject("/Zoom")]) = args
    elif typ == "/FitR":
        (self[NameObject("/Left")], self[NameObject("/Bottom")],
            self[NameObject("/Right")], self[NameObject("/Top")]) = args
    elif typ in ["/FitH", "/FitBH"]:
        self[NameObject("/Top")], = args
    elif typ in ["/FitV", "/FitBV"]:
        self[NameObject("/Left")], = args
    elif typ in ["/Fit", "/FitB"]:
        pass
    else:
        raise utils.PdfReadError("Unknown Destination Type: %r" % typ)

The error is raised because "0" is not one of the destination types that PDF Reference 1.7 defines, so none of the branches above match it.
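To make the check concrete, here is a small stand-alone sketch that mirrors the branches of Destination.__init__ shown above without importing PyPDF2. The names `DEST_ARG_COUNTS` and `check_destination` are made up for illustration, and the argument-count check is an addition of mine; the real class simply fails on tuple unpacking if the count is wrong.

```python
# Destination types taken from the branches of Destination.__init__ above,
# mapped to the number of extra arguments each branch unpacks.
DEST_ARG_COUNTS = {
    "/XYZ": 3,    # left, top, zoom
    "/FitR": 4,   # left, bottom, right, top
    "/FitH": 1,   # top
    "/FitBH": 1,  # top
    "/FitV": 1,   # left
    "/FitBV": 1,  # left
    "/Fit": 0,
    "/FitB": 0,
}


def check_destination(typ, args):
    """Return None if the destination looks valid, else an error message.

    Hypothetical helper for illustration only, not part of PyPDF2.
    """
    if typ not in DEST_ARG_COUNTS:
        return "Unknown Destination Type: %r" % (typ,)
    expected = DEST_ARG_COUNTS[typ]
    if len(args) != expected:
        return "%s expects %d argument(s), got %d" % (typ, expected, len(args))
    return None


print(check_destination("/XYZ", [0, 792, None]))  # -> None
print(check_destination(0, []))  # -> Unknown Destination Type: 0
```

The second call reproduces the message from the traceback: the value read as the destination type is the number 0 rather than a name such as /XYZ or /Fit.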