"ERROR - 'NoneType' object has no attribute 'axes'" 尝试从 s3 字节对象读取 pickle 文件时
"ERROR - 'NoneType' object has no attribute 'axes'" when trying to read pickle file from s3 bytes object
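I am running the following code in an Apache Airflow environment to fetch a pickle file from s3 and read it into memory. As soon as I try to read/print the contents of the file, I get this error message: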
ERROR - 'NoneType' object has no attribute 'axes'
Code
import boto3
import pickle
# [...Omitted code...]
s3_session = boto3.Session(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)
s3 = s3_session.resource('s3')
obj = s3.Object(bucket_name, KEY)
pickle_contents = obj.get()['Body'].read()
body = pickle.loads(pickle_contents)
print(body)
# ^-- This is where the error happens, as soon as I try to read it.
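The same code actually seems to run fine on a separate Jupyter notebook instance, which leads me to suspect a version incompatibility. The pickle file looks like the dictionary below, which I obtained by calling print(body) in that Jupyter notebook:
Pickle file body: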
{75:
'recommendation_diversity_metrics':
{'largest_subcategory_group_proportion':
{'mean': 0.3369472,
'sd': 0.1741708739837092,
'min': 0.05333333333333334,
'max': 1.0},
'catalogue_entropy': 3.4412171579585533,
'subcategory_overweight_frequency':
School & Office Supplies 0.73020
Pants 0.70656
Bedding 0.64138
Sweaters 0.62616
Tops 0.57044
...
Cleanup & Odor Control 0.00144
UNKNOWN 0.00036
Body Piercings 0.00034
Misc Books 0.00012
Home Books 0.00012
Length: 94, dtype: float64},
'recommendation_novelty_metrics': {
'previously_interacted': {'mean': 0.052456533333333326,
'sd': 0.06291214458333363,
'min': 0.0,
'max': 0.6},
'new_product_frequency': {'mean': 0.016672799999999998,
'sd': 0.01423356021834222,
'min': 0.0,
'max': 0.12}
}}
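I think the error occurs because the dictionary contains a pandas Series object (see subcategory_overweight_frequency in the dictionary above): as long as I read every dictionary element except that one, the interpreter runs my code without complaint. Am I missing a dependency that I don't know about?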
Full traceback
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 655, in __repr__
show_dimensions=show_dimensions,
File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 774, in to_string
line_width=line_width,
File "/usr/local/lib/python3.7/site-packages/pandas/io/formats/format.py", line 484, in __init__
self.max_rows_displayed = min(max_rows or len(self.frame), len(self.frame))
File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 996, in __len__
return len(self.index)
File "/usr/local/lib/python3.7/site-packages/pandas/core/generic.py", line 5175, in __getattr__
return object.__getattribute__(self, name)
File "pandas/_libs/properties.pyx", line 63, in pandas._libs.properties.AxisProperty.__get__
AttributeError: 'NoneType' object has no attribute 'axes'
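You probably pickled the DataFrame with a later version of pandas and are now trying to read the pickled file with an earlier version.
Please check which pandas version you used to pickle the DataFrame and which pandas version your Airflow environment is using.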
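One way to do that comparison is to run the same small report in both places (the Jupyter notebook and the Airflow worker) and diff the output. The sketch below is only an illustration: the helper names installed_version and environment_report are made up for this example, and it relies on importlib.metadata, which is only available on Python 3.8+, so on the Python 3.7 interpreter shown in the traceback it would report None and you would fall back to printing pandas.__version__ directly.

```python
import pickle
import sys


def installed_version(package_name):
    """Return the installed version string of a package, or None if unknown.

    Hypothetical helper for this example. Uses importlib.metadata, which
    exists only on Python 3.8+; on older interpreters it returns None.
    """
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        return None
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


def environment_report():
    """Collect the details that matter when a pickle loads in one place only."""
    return {
        "python": sys.version.split()[0],
        "pandas": installed_version("pandas"),
        "pickle_protocol": pickle.HIGHEST_PROTOCOL,
    }


if __name__ == "__main__":
    # Run this in both environments and compare the two dictionaries.
    print(environment_report())
```

If the two reports show different pandas versions, aligning them (for example by pinning the same version in both environments) is the first thing to try before re-creating the pickle.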