Establishing why an object can't be pickled
I am receiving an object, t, from an api of type Object. I am unable to pickle it, getting the error:
  File "p.py", line 55, in <module>
    pickle.dump(t, open('data.pkl', 'wb'))
  File "/usr/lib/python2.6/pickle.py", line 1362, in dump
    Pickler(file, protocol).dump(obj)
  File "/usr/lib/python2.6/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/lib/python2.6/pickle.py", line 313, in save
    (t.__name__, obj))
pickle.PicklingError: Can't pickle 'Object' object: <Object object at 0xb77b11a0>
When I do the following:
for i in dir(t): print(type(i))
I only get a bunch of string objects:
<type 'str'>
<type 'str'>
<type 'str'>
...
<type 'str'>
<type 'str'>
<type 'str'>
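How can I print the contents of my Object object in order to understand why it cannot be pickled?
It's also possible that the object contains C pointers to QT objects, in which case it wouldn't make sense for me to pickle the object. But I would still like to see the internal structure of the object in order to establish this.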
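You may want to read the python docs and check your API's Object class afterwards.
With respect to the "internal structure of the object", instance attributes are usually stored in the __dict__ attribute (and since class attributes are not pickled, you only care about the instance attributes) - but note that you'll also have to recursively inspect the __dict__ of each attribute.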
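As a rough illustration of that advice, here is a minimal Python 3 sketch. Note that dir(t) only yields attribute names, which is why the loop in the question printed nothing but str; the values have to be fetched with getattr. The helper name describe_attributes and the Example class are hypothetical stand-ins, not part of the API in the question.

```python
import pickle
import threading


def describe_attributes(obj):
    """Return {name: (type name, picklable?)} for each instance attribute of obj."""
    # vars(obj) is obj.__dict__; objects without one (e.g. many C extension
    # types) raise TypeError, so fall back to the non-dunder names from dir().
    try:
        names = list(vars(obj))
    except TypeError:
        names = [n for n in dir(obj) if not n.startswith('__')]
    report = {}
    for name in names:
        try:
            value = getattr(obj, name)
        except Exception:
            # some properties or descriptors raise on access
            report[name] = ('<unreadable>', False)
            continue
        try:
            pickle.dumps(value)
            picklable = True
        except Exception:
            picklable = False
        report[name] = (type(value).__name__, picklable)
    return report


class Example:  # hypothetical stand-in for the API's Object
    def __init__(self):
        self.name = 'demo'
        self.lock = threading.Lock()  # locks cannot be pickled


for name, (type_name, picklable) in describe_attributes(Example()).items():
    print(name, type_name, 'picklable' if picklable else 'NOT picklable')
```

This only looks one level deep; an attribute reported as not picklable would need the same treatment applied to it in turn.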
I would use dill, which has tools to investigate what inside an object causes your target object to not be picklable. See this answer for an example: Good example of BadItem in Dill Module, and this Q&A for an example of the detection tools in real use: pandas.algos._return_false causes PicklingError with dill.dump_session on CentOS.
>>> import dill
>>> x = iter([1,2,3,4])
>>> d = {'x':x}
>>> # we check for unpicklable items in d (i.e. the iterator x)
>>> dill.detect.baditems(d)
[<listiterator object at 0x10b0e48d0>]
>>> # note that nothing inside of the iterator is unpicklable!
>>> dill.detect.baditems(x)
[]
However, the most common starting point is to use trace:
>>> dill.detect.trace(True)
>>> dill.detect.errors(d)
D2: <dict object at 0x10b8394b0>
T4: <type 'listiterator'>
PicklingError("Can't pickle <type 'listiterator'>: it's not found as __builtin__.listiterator",)
>>>
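dill also has functionality to trace the referrers and referents of objects, so you can build a hierarchy of how objects refer to each other. See: https://github.com/uqfoundation/dill/issues/58
Alternatively, there are also cloudpickle.py and debugpickle.py, which are for the most part no longer developed. I'm the dill author, and I hope to soon merge any functionality from these codes that is missing in dill.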
I tried Dill, but it didn't explain my issue. Instead, I used the following code from https://gist.github.com/andresriancho/15b5e226de68a0c2efd0, which happened to expose a bug in my __getattribute__ override:
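import cPickle  # Python 2 only; on Python 3 use the pickle module instead

def debug_pickle(instance):
    """
    :return: Which attribute from this object can't be pickled?
    """
    attribute = None
    # Try each instance attribute on its own and stop at the first failure
    for k, v in instance.__dict__.iteritems():
        try:
            cPickle.dumps(v)
        except Exception:
            attribute = k
            break
    return attribute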
Edit: Here's a reproduction of my code, using pickle and cPickle:
# Python 2 reproduction
import cPickle
import pickle
import traceback

class myDict(dict):
    def __getattribute__(self, item):
        # Try to get attribute from internal dict
        item = item.replace("_", "$")
        if item in self:
            return self[item]
        # Try super, which may lead to an AttributeError
        return super(myDict, self).__getattribute__(item)

myd = myDict()
try:
    with open('test.pickle', 'wb') as myf:
        cPickle.dump(myd, myf, protocol=-1)
except Exception:
    print traceback.format_exc()
try:
    with open('test.pickle', 'wb') as myf:
        pickle.dump(myd, myf, protocol=-1)
except Exception:
    print traceback.format_exc()
Output:
Traceback (most recent call last):
  File "/Users/myuser/Documents/workspace/AcceptanceTesting/ingest.py", line 35, in <module>
    cPickle.dump(myd, myf, protocol=-1)
UnpickleableError: Cannot pickle <class '__main__.myDict'> objects
Traceback (most recent call last):
  File "/Users/myuser/Documents/workspace/AcceptanceTesting/ingest.py", line 42, in <module>
    pickle.dump(myd, myf, protocol=-1)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1370, in dump
    Pickler(file, protocol).dump(obj)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 313, in save
    (t.__name__, obj))
PicklingError: Can't pickle 'myDict' object: {}
You'll see that the reason is that attribute names are being mangled by __getattribute__.
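To make the failure mode concrete, the Python 3 sketch below checks whether the special attributes that pickle relies on can still be looked up. MangledDict repeats the lookup logic of myDict; SafeDict and missing_pickle_hooks are hypothetical names introduced here, and using __getattr__ instead of __getattribute__ is one possible fix, not necessarily the one the author applied.

```python
class MangledDict(dict):
    """Same lookup logic as myDict: every '_' becomes '$' before lookup."""

    def __getattribute__(self, item):
        item = item.replace("_", "$")
        if item in self:
            return self[item]
        # '__reduce_ex__' has become '$$reduce$ex$$' by now, so this fails
        return super().__getattribute__(item)


class SafeDict(dict):
    """Hypothetical alternative: consult the dict only when normal lookup fails."""

    def __getattr__(self, item):
        key = item.replace("_", "$")
        try:
            return self[key]
        except KeyError:
            raise AttributeError(item) from None


def missing_pickle_hooks(obj, names=("__class__", "__reduce_ex__", "__reduce__")):
    """Return the special attribute names that cannot be looked up on obj."""
    missing = []
    for name in names:
        try:
            getattr(obj, name)
        except AttributeError:
            missing.append(name)
    return missing


print(missing_pickle_hooks(MangledDict()))  # ['__class__', '__reduce_ex__', '__reduce__']
print(missing_pickle_hooks(SafeDict()))     # []
```

Because __getattr__ is only called after the normal lookup has failed, the special methods stay reachable while keys such as 'a$b' can still be read as obj.a_b.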
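Here's an extension of the debug_pickle approach above, in Python 3. It:
Is recursive, to deal with complex objects where the problem might be many layers deep.
Produces output of the form .x[i].y.z.... so that you can see which members were followed to reach the problem. For a dict it just prints [key/val type=...] instead, since either the keys or the values can be the problem, which makes it harder (but not impossible) to reference a specific key or value in the dict.
Accounts for more types, specifically list, tuple and dict, which need to be handled separately since they don't have a __dict__ attribute.
Returns all problems, rather than just the first one.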
import pickle

def get_unpicklable(instance, exception=None, string='', first_only=True):
    """
    Recursively go through all attributes of instance and return a list of whatever
    can't be pickled.
    Set first_only to only print the first problematic element in a list, tuple or
    dict (otherwise there could be lots of duplication).
    """
    problems = []
    if isinstance(instance, (tuple, list)):
        for k, v in enumerate(instance):
            try:
                pickle.dumps(v)
            except Exception as e:
                problems.extend(get_unpicklable(v, e, string + f'[{k}]', first_only))
                if first_only:
                    break
    elif isinstance(instance, dict):
        for k in instance:
            try:
                pickle.dumps(k)
            except Exception as e:
                problems.extend(get_unpicklable(
                    k, e, string + f'[key type={type(k).__name__}]', first_only
                ))
                if first_only:
                    break
        for v in instance.values():
            try:
                pickle.dumps(v)
            except Exception as e:
                problems.extend(get_unpicklable(
                    v, e, string + f'[val type={type(v).__name__}]', first_only
                ))
                if first_only:
                    break
    else:
        # objects without a __dict__ (e.g. locks, generators) have no members to check
        for k, v in getattr(instance, '__dict__', {}).items():
            try:
                pickle.dumps(v)
            except Exception as e:
                problems.extend(get_unpicklable(v, e, string + '.' + k, first_only))
    # if we get here, it means pickling instance caused an exception (string is not
    # empty), yet no member was a problem (problems is empty), thus instance itself
    # is the problem.
    if string != '' and not problems:
        problems.append(
            string + f" (Type '{type(instance).__name__}' caused: {exception})"
        )
    return problems
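One case the function above does not reach is an object that stores its attributes in __slots__, since such an object has no __dict__ to walk. The following is only a sketch of how those members could be enumerated; iter_members, unpicklable_members and Slotted are hypothetical names, and slot names that start with two underscores (which Python name-mangles) are not handled.

```python
import pickle
import threading


def iter_members(obj):
    """Yield (name, value) pairs from obj.__dict__ and from __slots__ in its MRO."""
    yield from getattr(obj, '__dict__', {}).items()
    for cls in type(obj).__mro__:
        slots = cls.__dict__.get('__slots__', ())
        if isinstance(slots, str):  # __slots__ = 'x' is allowed
            slots = (slots,)
        for name in slots:
            if name in ('__dict__', '__weakref__'):
                continue
            # a slot that was never assigned raises AttributeError on access
            if hasattr(obj, name):
                yield name, getattr(obj, name)


def unpicklable_members(obj):
    """Return the names of direct members of obj that fail to pickle."""
    bad = []
    for name, value in iter_members(obj):
        try:
            pickle.dumps(value)
        except Exception:
            bad.append(name)
    return bad


class Slotted:  # hypothetical example class
    __slots__ = ('name', 'lock')

    def __init__(self):
        self.name = 'demo'
        self.lock = threading.Lock()  # locks cannot be pickled


print(unpicklable_members(Slotted()))  # ['lock']
```

A generator like iter_members could replace the plain __dict__ loop in the final branch of a recursive checker if slotted classes are expected.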