AttributeError when using Python deepcopy
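I have a class that overrides __eq__ and __hash__ so that its objects can act as dictionary keys. Each object also carries a dictionary keyed by other objects of the same class. When I try to deepcopy the whole structure, I get a strange AttributeError. I am using Python 3.6.0 on OS X.

From the Python docs, it looks like deepcopy uses a memo dictionary to cache the objects it has already copied, so the nested structure should not be a problem. What am I doing wrong, then? Should I write my own __deepcopy__ method to work around this? If so, how?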
from copy import deepcopy


class Node:
    def __init__(self, p_id):
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_edge(self, p_node, p_data):
        if p_node not in self.edge_dict:
            self.edge_dict[p_node] = p_data
            self.degree += 1
            return True
        else:
            return False


if __name__ == '__main__':
    node1 = Node(1)
    node2 = Node(2)
    node1.add_edge(node2, "1->2")
    node2.add_edge(node1, "2->1")
    node1_copy = deepcopy(node1)
  File ".../node_test.py", line 15, in __hash__
    return hash(self.id)
AttributeError: 'Node' object has no attribute 'id'
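Cyclic dependencies are a problem for deepcopy when you:
- have classes that must be hashed and that contain reference cycles, and
- do not ensure that the hash-related (and equality-related) invariants are established at object construction, rather than only at initialization.

The problem is that unpickling an object (by default, deepcopy copies custom objects by effectively pickling and unpickling them through the same __reduce_ex__ machinery, unless a special __deepcopy__ method is defined) creates the empty object without initializing it, then tries to fill in its attributes one by one. When it tries to fill in node1's attributes, it needs to initialize node2, which in turn relies on the partially created node1 (in both cases because of edge_dict). At the point where it tries to fill in the edge_dict for one Node, the Node it is adding to that edge_dict does not have its id attribute set yet, so the attempt to hash it fails.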
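The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual copy machinery: it assumes a stripped-down Node with only __init__ and __hash__, and it calls Node.__new__ by hand to imitate the "allocate first, fill in attributes later" step. The names blank and error_message are made up for illustration.

```python
class Node:
    def __init__(self, p_id):
        self.id = p_id

    def __hash__(self):
        return hash(self.id)


# Imitates the first step of the default copy path: the instance is
# allocated without running __init__, so it has no attributes yet.
blank = Node.__new__(Node)

try:
    # Using the half-built object as a dict key calls __hash__,
    # which reads self.id before anything has set it.
    {blank: "edge"}
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

In the real deepcopy call the same thing happens indirectly: the half-built copy of one node is inserted as a key while the other node's edge_dict is being rebuilt.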
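You can correct this by using __new__ to ensure the invariants are established before the mutable, possibly recursive attributes are initialized, and by defining the pickle helper __getnewargs__ (or __getnewargs_ex__) so that the copy machinery passes the right arguments to __new__. Specifically, change your class definition to: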
class Node:
    # __new__ instead of __init__ to establish necessary id invariant
    # You could use both __new__ and __init__, but that's usually more complicated
    # than you really need
    def __new__(cls, p_id):
        self = super().__new__(cls)  # Must explicitly create the new object
        # Aside from explicit construction and return, rest of __new__
        # is same as __init__
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0
        return self  # __new__ returns the new object

    def __getnewargs__(self):
        # Return the arguments that *must* be passed to __new__
        return (self.id,)

    # ... rest of class is unchanged ...
Note: if this is Python 2 code, make sure to inherit explicitly from object and change super() to super(Node, cls) in __new__; the code given is the simpler Python 3 code.
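To check the __new__/__getnewargs__ approach end to end, here is a self-contained sketch for Python 3 that combines the revised class with the two-node cycle from the question. The variable node2_copy is introduced only for this illustration.

```python
from copy import deepcopy


class Node:
    def __new__(cls, p_id):
        # Establish the id invariant at construction time.
        self = super().__new__(cls)
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0
        return self

    def __getnewargs__(self):
        # Arguments the copy/pickle machinery passes back to __new__.
        return (self.id,)

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_edge(self, p_node, p_data):
        if p_node not in self.edge_dict:
            self.edge_dict[p_node] = p_data
            self.degree += 1
            return True
        return False


node1 = Node(1)
node2 = Node(2)
node1.add_edge(node2, "1->2")
node2.add_edge(node1, "2->1")

node1_copy = deepcopy(node1)
# The only key in the copied edge_dict is the copy of node2.
node2_copy = next(iter(node1_copy.edge_dict))
print(node1_copy is not node1, node2_copy is not node2)
```

Because each copy already has its id by the time it is used as a dictionary key, the cycle is rebuilt among the copies: the key in node2_copy.edge_dict should be node1_copy itself, not the original node1.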
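An alternative solution that handles only copy.deepcopy, without supporting pickling or requiring the use of __new__/__getnewargs__ (which need new-style classes), is to override deep copying alone. You would define the following extra method on your original class (and make sure the module imports copy), leaving the class otherwise unchanged: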
def __deepcopy__(self, memo):
    # Deepcopy only the id attribute, then construct the new instance and map
    # the id() of the existing copy to the new instance in the memo dictionary
    memo[id(self)] = newself = self.__class__(copy.deepcopy(self.id, memo))
    # Now that memo is populated with a hashable instance, copy the other attributes:
    newself.degree = copy.deepcopy(self.degree, memo)
    # Safe to deepcopy edge_dict now, because backreferences to self will
    # be remapped to newself automatically
    newself.edge_dict = copy.deepcopy(self.edge_dict, memo)
    return newself
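As a rough check of the __deepcopy__ variant, the sketch below adds that method to the original __init__-based class and copies the same two-node cycle. It assumes the edges are set up by assigning to edge_dict directly (add_edge is left out for brevity), and node2_copy is a name used only for this illustration.

```python
import copy


class Node:
    def __init__(self, p_id):
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def __deepcopy__(self, memo):
        # Register a fully hashable copy in memo before touching edge_dict.
        memo[id(self)] = newself = self.__class__(copy.deepcopy(self.id, memo))
        newself.degree = copy.deepcopy(self.degree, memo)
        # Back-references to self now resolve to newself through memo.
        newself.edge_dict = copy.deepcopy(self.edge_dict, memo)
        return newself


node1 = Node(1)
node2 = Node(2)
node1.edge_dict[node2] = "1->2"
node1.degree = 1
node2.edge_dict[node1] = "2->1"
node2.degree = 1

node1_copy = copy.deepcopy(node1)
node2_copy = next(iter(node1_copy.edge_dict))
print(node1_copy.id, node2_copy.id)
```

The order inside __deepcopy__ is the important design choice: the new instance goes into memo with its id already set, so any recursive copy that reaches the original node gets back an object that can be hashed.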