Lambda to loop function through elements in list
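How can I adjust this code so that the function loops through the list models_2? The function works when I call it with models, but when I change it to models_2 it gives me this error: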
AttributeError: 'float' object has no attribute 'seek'
Here is my dataframe. It comes from Excel, and all cells are formatted as "Text".
        MOD1       MOD2       MOD3       MOD4
0  File1.pdf  File3.pdf  File1.pdf  File3.pdf
1  File2.pdf        NaN  File2.pdf  File3.pdf
2  File3.pdf        NaN        NaN        NaN
models = ['MOD1']
models_2 = ['MOD1', 'MOD2']

def merge_pdf(models):
    merger = PdfFileMerger()
    for name in models:
        for index, row in df.iterrows():
            merger.append(row[name])
    merger.write(f"Order #XXXXXXX ({name}) Production Package - Rev.0.pdf")
    merger.close()

merge_pdf(models)
Full error message:
PdfReadWarning: Xref table not zero-indexed. ID numbers for objects will be corrected. [_reader.py:1065]
Traceback (most recent call last):
  File "Z:\PyCharm\Excel_Reader\Excel_Reader.py", line 30, in <module>
    merge_pdf(models)
  File "Z:\PyCharm\Excel_Reader\Excel_Reader.py", line 27, in merge_pdf
    merger.append(row[name])
  File "C:\Users\x\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\merger.py", line 227, in append
    self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
  File "C:\Users\x\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\merger.py", line 149, in merge
    pdfr = PdfFileReader(
  File "C:\Users\x\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\_reader.py", line 239, in __init__
    self.read(stream)
  File "C:\Users\x\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\_reader.py", line 911, in read
    stream.seek(-1, 2)
AttributeError: 'float' object has no attribute 'seek'
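Your code fails because column 'MOD2' contains NaN values, which are of type float. As the traceback shows, PyPDF2 ends up calling seek on the value it was given, as if it were a file object, and a float has no such method. How you handle this depends on what you want to do with those NaN values.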
You can verify this by running the following code:
import pandas as pd
import numpy as np

# Sample data: empty Excel cells show up as NaN
data = {
    'MOD1': ['File1.pdf', 'File2.pdf', 'File3.pdf'],
    'MOD2': ['File1.pdf', np.nan, np.nan],
    'MOD3': ['File1.pdf', 'File2.pdf', np.nan],
    'MOD4': ['File1.pdf', 'File2.pdf', np.nan]
}
df = pd.DataFrame(data)

models = ['MOD1']
models_2 = ['MOD1', 'MOD2']

# Print every cell in the selected columns together with its Python type
for name in models_2:
    for index, row in df.iterrows():
        print(name, index, row[name], type(row[name]))
This prints the following:
MOD1 0 File1.pdf <class 'str'>
MOD1 1 File2.pdf <class 'str'>
MOD1 2 File3.pdf <class 'str'>
MOD2 0 File1.pdf <class 'str'>
MOD2 1 nan <class 'float'>
MOD2 2 nan <class 'float'>
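A shorter way to spot the affected columns is to count the missing cells per column with `isna()`. This is only a sketch: it assumes a frame shaped like the sample data (just two of its columns are reproduced here), and the variable name `missing_per_column` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Two columns of the sample data; np.nan stands in for an empty Excel cell
df = pd.DataFrame({
    "MOD1": ["File1.pdf", "File2.pdf", "File3.pdf"],
    "MOD2": ["File1.pdf", np.nan, np.nan],
})

# isna() marks missing cells, sum() counts them per column;
# int() converts the NumPy integers to plain Python ints
missing_per_column = {col: int(n) for col, n in df.isna().sum().items()}
print(missing_per_column)  # {'MOD1': 0, 'MOD2': 2}
```

Any column with a non-zero count will pass a float to `merger.append` somewhere in the loop.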
If you know you only want to include cells that hold string values, you can add a type check before appending to the merger object, like so:
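from PyPDF2 import PdfFileMerger

models = ['MOD1']
models_2 = ['MOD1', 'MOD2']

def merge_pdf(models):
    merger = PdfFileMerger()
    for name in models:
        for index, row in df.iterrows():
            # Skip NaN cells (floats); only file names stored as strings are appended
            if isinstance(row[name], str):
                merger.append(row[name])
    merger.write(f"Order #XXXXXXX ({name}) Production Package - Rev.0.pdf")
    merger.close()

# merge_pdf(models_2) no longer raises the AttributeError, because NaN cells are skipped
merge_pdf(models)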
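If dropping the NaN cells is what you want, another option is to let pandas do the filtering with `dropna()` before anything reaches PyPDF2. The sketch below only builds the lists of file names, so it runs without any PDF files on disk. The helper name `pdf_paths_by_model` is hypothetical, and the frame is a cut-down copy of the sample data.

```python
import numpy as np
import pandas as pd


def pdf_paths_by_model(df, models):
    """Map each column name to its non-missing file names (hypothetical helper)."""
    # dropna() removes the NaN cells, so only real file names remain
    return {name: df[name].dropna().tolist() for name in models}


# Cut-down copy of the sample data; np.nan stands in for an empty Excel cell
df = pd.DataFrame({
    "MOD1": ["File1.pdf", "File2.pdf", "File3.pdf"],
    "MOD2": ["File1.pdf", np.nan, np.nan],
})

paths = pdf_paths_by_model(df, ["MOD1", "MOD2"])
print(paths)
```

Each list could then be appended to its own PdfFileMerger and written once per model. That one output file per model is intended is an assumption, based only on the `{name}` placeholder in the output file name.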