Subclass of pathlib.Path loses custom attributes after pickle.load
    from pathlib import Path
    import pickle

    class P(type(Path())):
        def __init__(self, *args):
            super().__init__()
            self.a = ''

    p = P()
    p.a = 'x'
    with open('xx', 'wb') as wf:
        pickle.dump(p, wf)
    with open('xx', 'rb') as rf:
        p1 = pickle.load(rf)
    print(p1.a)  # here p1.a is ''
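I am creating a subclass of pathlib.Path and want to add some custom attributes to it.
The problem is that the custom attributes are lost after the object is reloaded with pickle.
How can I fix this?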
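Other solutions I have tried:
- Using __slots__: same problem.
- Using composition instead of inheritance, and dispatching the Path-like methods by implementing __getattr__. In this case, however, self.path is not initialized inside pickle.load, which leads to endless calls to __getattr__.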
    class File():
        def __init__(self, *args):
            self.path = Path(*args)

        def __getattr__(self, item):
            return getattr(self.path, item)

    p = File('aaa')
    p.exists()  # no error
    with open('xx', 'wb') as wf:
        pickle.dump(p, wf)
    with open('xx', 'rb') as rf:
        p1 = pickle.load(rf)
    # RecursionError: maximum recursion depth exceeded.
    # This happens because self.path is looked up at a moment when
    # path is not yet in self.__dict__.
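The recursion can be reproduced without pickle at all. The sketch below assumes that unpickling creates the instance through __new__ and skips __init__, and simulates that step by calling File.__new__(File) directly; the names bare and hit_recursion are only illustrative.

```python
from pathlib import Path


class File:
    def __init__(self, *args):
        self.path = Path(*args)

    def __getattr__(self, item):
        # Called only when normal attribute lookup fails.
        return getattr(self.path, item)


# Simulate the unpickling step: create the object without running __init__,
# so 'path' is never stored in the instance __dict__.
bare = File.__new__(File)

try:
    bare.exists
except RecursionError:
    # Looking up 'exists' calls __getattr__, which looks up self.path,
    # which calls __getattr__('path'), which looks up self.path again...
    hit_recursion = True
else:
    hit_recursion = False

print(hit_recursion)  # True
```

Any attribute that is missing on the bare instance triggers the same loop, which is presumably what happens when pickle probes the freshly created object during loading.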
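One way is to use the copyreg module to associate a pickle support function with instances of your class, as shown below. Note that I also had to modify the way your P class handles arguments, so that it no longer ignores them.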
    import copyreg
    from pathlib import Path
    import pickle

    class P(type(Path())):
        def __init__(self, *args):
            super().__init__()
            self.a = args[0] if args else ''

    def pickle_P(p):
        print("pickling a P instance...")
        return P, (p.a,)

    copyreg.pickle(P, pickle_P)

    p = P()
    p.a = 'x'
    q = P('y')
    with open('xx', 'wb') as outp:
        pickle.dump(p, outp)
        pickle.dump(q, outp)

    with open('xx', 'rb') as inp:
        p1 = pickle.load(inp)
        q1 = pickle.load(inp)

    print('p1.a = {!r}'.format(p1.a))
    print('q1.a = {!r}'.format(q1.a))
Output:

    pickling a P instance...
    pickling a P instance...
    p1.a = 'x'
    q1.a = 'y'
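About the inheritance question
Here is another solution, offered as a follow-up to @martineau's answer.
If I am right, the problem is caused by the __reduce__ method in pathlib.PosixPath. Pickling behavior appears to be determined by this method. @martineau's solution with copyreg.pickle(P, pickle_P) is also related to it: pickle_P has the same return pattern as __reduce__.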
Here is the documentation on the return value of __reduce__:
When a tuple is returned, it must be between two and six items long. Optional items can either be omitted, or None can be provided as their value. The semantics of each item are in order:
- A callable object that will be called to create the initial version of the object.
- A tuple of arguments for the callable object. An empty tuple must be given if the callable does not accept any argument.
- Optionally, the object's state, which will be passed to the object's __setstate__() method as previously described. If the object has no such method then, the value must be a dictionary and it will be added to the object's __dict__ attribute.
- ...
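The second item explains how @martineau's solution works: the second item of the returned tuple is passed as arguments to the callable (here the class P), and so reaches __init__.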
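To see the two-item versus three-item return value in isolation, here is a minimal sketch that does not touch pathlib. Base, Lossy, Keeps and roundtrip are hypothetical names; Base only imitates a class whose __reduce__ returns the callable and its arguments and nothing else.

```python
import pickle


class Base:
    """Hypothetical stand-in for a class with a two-item __reduce__."""

    def __init__(self, *parts):
        self.parts = tuple(parts)

    def __reduce__(self):
        # Only (callable, args): nothing else about the instance is saved.
        return (self.__class__, self.parts)


class Lossy(Base):
    def __init__(self, *parts):
        super().__init__(*parts)
        self.a = ''


class Keeps(Lossy):
    def __reduce__(self):
        callable_, args = super().__reduce__()
        # Third item: state that pickle adds to __dict__ after the
        # callable has rebuilt the object.
        return (callable_, args, self.__dict__)


def roundtrip(obj):
    return pickle.loads(pickle.dumps(obj))


lossy = Lossy('usr', 'lib')
lossy.a = 'x'
keeps = Keeps('usr', 'lib')
keeps.a = 'x'

print(repr(roundtrip(lossy).a))  # ''
print(repr(roundtrip(keeps).a))  # 'x'
```

In the Lossy case the object is rebuilt by calling Lossy('usr', 'lib'), so __init__ resets a to ''. In the Keeps case the same call happens first and the saved state is applied afterwards.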
Here is the source code of PosixPath.__reduce__:
    def __reduce__(self):
        # Using the parts tuple helps share interned path parts
        # when pickling related paths.
        # self._parts holds the arguments passed to Path
        return (self.__class__, tuple(self._parts))
Based on the description of the third return value, the solution is:
    from pathlib import Path
    import pickle

    class P(type(Path())):
        def __init__(self, *args):
            super().__init__()
            self.a = ''

        def __reduce__(self):
            # _parts is a private pathlib attribute, so it may not exist
            # in every Python version.
            return self.__class__, tuple(self._parts), self.__dict__

    p = P()
    p.a = 'x'
    with open('xx', 'wb') as wf:
        pickle.dump(p, wf)
    with open('xx', 'rb') as rf:
        p1 = pickle.load(rf)
    print(p1.a)  # here p1.a is 'x'
Drawbacks of this solution:
- Instances of P will carry a __dict__ attribute (Path uses __slots__).
- An attribute named _hash will be ignored.
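If the extra __dict__ is a concern, one possible variation is to keep the subclass slotted and pass the state explicitly through __reduce__ and __setstate__. This is only a sketch on a hypothetical stand-in class (SlottedBase and SlottedP are illustrative names, not pathlib classes), so it would need adapting and testing before being applied to a real Path subclass.

```python
import pickle


class SlottedBase:
    """Hypothetical stand-in for a slotted class with a two-item __reduce__."""

    __slots__ = ('parts',)

    def __init__(self, *parts):
        self.parts = tuple(parts)

    def __reduce__(self):
        return (self.__class__, self.parts)


class SlottedP(SlottedBase):
    __slots__ = ('a',)  # no per-instance __dict__ is created

    def __init__(self, *parts):
        super().__init__(*parts)
        self.a = ''

    def __reduce__(self):
        callable_, args = super().__reduce__()
        # Pass only the custom attribute as state.
        return (callable_, args, {'a': self.a})

    def __setstate__(self, state):
        # Receives the third item of the __reduce__ tuple on unpickling.
        self.a = state['a']


p = SlottedP('usr', 'lib')
p.a = 'x'
p1 = pickle.loads(pickle.dumps(p))
print(p1.a, hasattr(p1, '__dict__'))  # x False
```

Because __setstate__ is defined, pickle hands the state to it instead of updating __dict__, which is what lets the class stay slotted.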
About the composition question
A note in the pickle documentation may explain the cause of the error.
Note: At unpickling time, some methods like __getattr__(), __getattribute__(), or __setattr__() may be called upon the instance. In case those methods rely on some internal invariant being true, the type should implement __new__() to establish such an invariant, as __init__() is not called when unpickling an instance.
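The behaviour described in that note can be observed with a small probe class. This is a sketch that assumes the default pickle protocol of Python 3; Probe and calls are hypothetical names used only to record which methods run.

```python
import pickle

calls = []


class Probe:
    def __new__(cls, *args):
        calls.append('__new__')
        return super().__new__(cls)

    def __init__(self, *args):
        calls.append('__init__')
        self.value = args


data = pickle.dumps(Probe(1))

calls.clear()  # forget the calls made while building the original object
restored = pickle.loads(data)

print(calls)           # ['__new__']
print(restored.value)  # (1,)
```

Only __new__ runs during loading, and it receives no arguments here; the value attribute comes back because pickle restores the saved instance state afterwards.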
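To make sure the path attribute exists by the time __getattr__ is called, one solution is to move the attribute assignment into the __new__ method (which runs before __init__).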
    class File():
        def __new__(cls, *args):
            obj = super().__new__(cls)
            obj.path = Path(*args)
            return obj

        def __getattr__(self, item):
            return getattr(self.path, item)

    p = File('aaa')
    p.exists()  # no error
    with open('xx', 'wb') as wf:
        pickle.dump(p, wf)
    with open('xx', 'rb') as rf:
        p1 = pickle.load(rf)  # no error