Lambda to loop function through elements in list

How can I adjust this code so that the function loops through the list models_2? The function works when I pass models, but if I change it to models_2 it gives me this error:

AttributeError: 'float' object has no attribute 'seek'

This is my DataFrame, read from Excel, with all cells formatted as "Text".

        MOD1       MOD2       MOD3       MOD4
0  File1.pdf  File3.pdf  File1.pdf  File3.pdf
1  File2.pdf        NaN  File2.pdf  File3.pdf
2  File3.pdf        NaN        NaN        NaN

models = ['MOD1']
models_2 = ['MOD1', 'MOD2']

def merge_pdf(models):
    merger = PdfFileMerger()
    for name in models:
        for index, row in df.iterrows():
            merger.append(row[name])
    merger.write(f"Order #XXXXXXX ({name}) Production Package - Rev.0.pdf")
    merger.close()

merge_pdf(models)

Full error message:

PdfReadWarning: Xref table not zero-indexed. ID numbers for objects will be corrected. [_reader.py:1065]
Traceback (most recent call last):
  File "Z:\PyCharm\Excel_Reader\Excel_Reader.py", line 30, in <module>
    merge_pdf(models)
  File "Z:\PyCharm\Excel_Reader\Excel_Reader.py", line 27, in merge_pdf
    merger.append(row[name])
  File "C:\Users\x\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\merger.py", line 227, in append
    self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
  File "C:\Users\x\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\merger.py", line 149, in merge
    pdfr = PdfFileReader(
  File "C:\Users\x\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\_reader.py", line 239, in __init__
    self.read(stream)
  File "C:\Users\x\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\_reader.py", line 911, in read
    stream.seek(-1, 2)
AttributeError: 'float' object has no attribute 'seek'

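Your code fails because column 'MOD2' contains NaN values, which are of type float. PyPDF2 tries to read the float as if it were a file stream, hence the missing 'seek' attribute in the traceback. How you handle this depends on what you want to do with those NaN values.

You can verify this by running the following code: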

import pandas as pd
import numpy as np

data = {
    'MOD1':['File1.pdf', 'File2.pdf', 'File3.pdf'],
    'MOD2':['File1.pdf', np.nan, np.nan],
    'MOD3':['File1.pdf', 'File2.pdf', np.nan],
    'MOD4':['File1.pdf', 'File2.pdf', np.nan]
}

df = pd.DataFrame(data)

models = ['MOD1']
models_2 = ['MOD1', 'MOD2']

merger = []

for name in models_2:
    for index, row in df.iterrows():
        print(name, index, row[name], type(row[name]))

This prints the following:

MOD1 0 File1.pdf <class 'str'>
MOD1 1 File2.pdf <class 'str'>
MOD1 2 File3.pdf <class 'str'>
MOD2 0 File1.pdf <class 'str'>
MOD2 1 nan <class 'float'>
MOD2 2 nan <class 'float'>

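If you know you only want to include cells that hold string values, you can add a type check before appending to the merger object, like this:

from PyPDF2 import PdfFileMerger

models = ['MOD1']
models_2 = ['MOD1', 'MOD2']

def merge_pdf(models):
    merger = PdfFileMerger()
    for name in models:
        for index, row in df.iterrows():
            # Skip NaN cells (floats); only string cells are file paths
            if isinstance(row[name], str):
                merger.append(row[name])
    # Note: name here is the last entry of models
    merger.write(f"Order #XXXXXXX ({name}) Production Package - Rev.0.pdf")
    merger.close()

merge_pdf(models_2)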
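
As a variation on the type check above, you could collect the valid file names first and only then hand them to the merger, which makes the filtering easy to inspect on its own. The following is a minimal sketch, not part of the original code: the helpers `is_missing` and `collect_paths` are hypothetical names, and the sample data uses a plain dict with `float("nan")` so it runs without pandas. Indexing a DataFrame by column name and iterating over the result should behave the same way, and calling `dropna()` on a pandas column is another way to get a similar result.

```python
import math


def is_missing(cell):
    """Return True for None or a float NaN, the usual markers for an empty cell."""
    return cell is None or (isinstance(cell, float) and math.isnan(cell))


def collect_paths(table, columns):
    """Gather the non-missing cells column by column, keeping row order.

    `table` is anything that supports table[name] and yields the cells of
    that column when iterated (a dict of lists here; hypothetical helper).
    """
    paths = []
    for name in columns:
        for cell in table[name]:
            if not is_missing(cell):
                paths.append(str(cell))
    return paths


nan = float("nan")
data = {
    "MOD1": ["File1.pdf", "File2.pdf", "File3.pdf"],
    "MOD2": ["File1.pdf", nan, nan],
}

paths = collect_paths(data, ["MOD1", "MOD2"])
print(paths)  # ['File1.pdf', 'File2.pdf', 'File3.pdf', 'File1.pdf']
```

Each entry of `paths` could then be passed to `merger.append(...)` in a simple loop, so the merger never sees a NaN.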