PyPDF2 IOError: [Errno 22] Invalid argument on PyPdfFileReader Python 2.7

Question

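Goal: open a file, encrypt it, and write the encrypted file back.
I'm trying to do this with the PyPDF2 module. I have verified that "input" is a file-type object. I researched the error and it translates to "file not found". I believe it is somehow linked to the file or the file path, but I'm not sure how to debug or troubleshoot it. I get the following error: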

Traceback (most recent call last):
  File "CommissionSecurity.py", line 52, in <module>
    inputStream = PyPDF2.PdfFileReader(input)
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1065, in __init__
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1660, in read
IOError: [Errno 22] Invalid argument

The relevant code is below. I'm not sure how to fix this because I don't know where the problem is. Any guidance is appreciated.

for ID in FileDict:
        if ID in EmailDict : 
            path = "C:\Apps\CorVu\DATA\Reports\AlliD\Monthly Commission Reports\Output\pdcom1\"
            #print os.listdir(path)
            file = os.path.join(path + FileDict[ID])

            with open(file, 'rb') as input:
                print type(input)
                inputStream = PyPDF2.PdfFileReader(input)
                output = PyPDF2.PdfFileWriter()
                output = inputStream.encrypt(EmailDict[ID][1])
            with open(file, 'wb') as outputStream:
                output.write(outputStream)  
        else : continue

Answer 1

I think your problem may be that you use the same file name for both reading and writing, so the file is opened twice:

with open(file, 'rb') as input :
    with open(file, 'wb') as outputStream :

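The w mode truncates the file, so the second line truncates the input.
I'm not sure what you intend, because you can't really read from the (beginning of the) file while overwriting it at the same time. Even if you tried to write to the end of the file, you would have to position the file pointer somewhere. So create an extra output file with a different name; once both files are closed, you can always rename that output file to the name of your input file, which overwrites the input.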
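The write-to-another-file-then-rename advice above could be sketched roughly as follows. This is only an illustration using the standard library: `rewrite_via_temp_file` and its `transform(src, dst)` callback are hypothetical names, and the callback stands in for whatever reads the original and produces the new contents.

```python
import os
import shutil
import tempfile


def rewrite_via_temp_file(path, transform):
    """Rewrite `path` without reading and truncating it at the same time.

    `transform(src, dst)` is a hypothetical callback: it reads from the
    binary stream `src` and writes the new contents to `dst`.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original file.
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            transform(src, dst)
        # Both files are closed here, so the original can be replaced.
        # shutil.move falls back to copy-and-delete if a plain rename fails.
        shutil.move(tmp_path, path)
    except Exception:
        # Leave the original untouched and clean up the partial output.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def to_upper(src, dst):
    """Example transform: copy the input to the output in upper case."""
    dst.write(src.read().upper())
```

Calling `rewrite_via_temp_file(some_path, to_upper)` replaces the file only after the transform has finished, so a failure part-way through leaves the input as it was.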

Or you could read the complete file into memory first and then write:

with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)
    output = PyPDF2.PdfFileWriter()
    output = input.encrypt(EmailDict[ID][1])
with open(file, 'wb') as outputStream:
    output.write(outputStream)

Notes:

You assign inputStream but never use it.
You assign PdfFileWriter() to output and then assign something else to output on the next line, so the result of the first output = line is never used.

Please check carefully what you are doing, because it looks like your code has quite a few other problems.
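One way to make the read-into-memory idea concrete is to copy the file into an io.BytesIO buffer, so nothing depends on the input file still being open when the output is written. This is a sketch under assumptions: `read_all`, `process_in_place` and the `process(source, result)` callback are hypothetical names, not part of PyPDF2.

```python
import io


def read_all(path):
    """Return the whole file as a seekable in-memory binary stream."""
    with open(path, "rb") as f:
        return io.BytesIO(f.read())


def process_in_place(path, process):
    """Rewrite `path` using only in-memory streams for the actual work.

    `process(source, result)` is a hypothetical callback that reads from
    `source` and writes the new contents to `result` (both io.BytesIO).
    """
    source = read_all(path)       # the input file is already closed again
    result = io.BytesIO()
    process(source, result)       # if this raises, the file is untouched
    with open(path, "wb") as out:
        out.write(result.getvalue())
```

With PyPDF2, `process` would be the place to build the reader from `source` and have the writer write to `result`; io.BytesIO provides the read and seek methods that the documentation quoted further down in this answer asks for. The whole file is held in memory twice, which is fine for reports but may not suit very large PDFs.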

Alternatively, here are some other tips that may help:

The documentation says you can also pass a file name as the first argument to PdfFileReader:

stream – A File object or an object that supports the standard read and seek methods similar to a File object. Could also be a string representing a path to a PDF file.

So try:

inputStream = PyPDF2.PdfFileReader(file)

You could also try setting the strict parameter to False:

strict (bool) – Determines whether user should be warned of all problems and also causes some correctable problems to be fatal. Defaults to True.

For example:

inputStream = PyPDF2.PdfFileReader(file, strict=False)

Answer 2

Using open(file, 'rb') was what caused the problem, because PdfFileReader() does this automatically. I just removed the with statement and that fixed the problem.

with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)

Answer 3

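This error occurred because the PDF file was empty. My PDF file was empty, which is why I got the error. So I first filled my PDF file with some data and then read it again with PyPDF2.PdfFileReader.

That solved my problem!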
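A possible explanation for why an empty file shows up as Errno 22 rather than a clearer message: if the reader starts by seeking backwards from the end of the stream (an assumption about PyPDF2's internals, not something stated in this thread), that seek has nowhere to go on a zero-byte file. The sketch below reproduces just that seek with the standard library; `errno_when_seeking_back_from_end` is a hypothetical helper name.

```python
import errno
import os
import tempfile


def errno_when_seeking_back_from_end(path):
    """Return the errno raised by seeking one byte back from the end of
    the file, or None if the seek succeeds."""
    with open(path, "rb") as f:
        try:
            f.seek(-1, os.SEEK_END)
        except (IOError, OSError) as exc:
            return exc.errno
    return None


# Demonstration on a freshly created, zero-byte file.
fd, empty_path = tempfile.mkstemp(suffix=".pdf")
os.close(fd)
code = errno_when_seeking_back_from_end(empty_path)
print(code == errno.EINVAL)  # EINVAL is the symbolic name for errno 22
os.remove(empty_path)
```

This would also tie in with Answer 1: a file that has just been truncated by opening it in 'wb' mode is empty in exactly this sense.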

Answer 4

Late to the party, but you may have opened an invalid PDF file, or an empty file named x.pdf that you assumed was a PDF.
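A cheap pre-check along these lines can catch both cases before the file ever reaches PdfFileReader. This is a rough sketch, not a validator: `looks_like_pdf` and its `probe_size` parameter are hypothetical, and it only tests that the file is non-empty and carries the `%PDF-` marker near the start.

```python
import os


def looks_like_pdf(path, probe_size=1024):
    """Return True if `path` is a non-empty file with a %PDF- marker
    within its first `probe_size` bytes."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as f:
        return b"%PDF-" in f.read(probe_size)
```

In the loop from the question, a check such as `if not looks_like_pdf(file): continue` would skip empty or mislabelled files instead of failing inside the reader. A file that passes can still be corrupt further in, so this narrows the problem down rather than ruling it out.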
