AttributeError when using python deepcopy

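I have a class that overrides __eq__ and __hash__ so that its objects can act as dictionary keys. Each object also carries a dictionary keyed by other objects of the same class. When I try to deepcopy the whole structure, I get a strange AttributeError. I am using Python 3.6.0 on OS X.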

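From the Python docs, it looks like deepcopy uses a memo dictionary to cache the objects it has already copied, so nested structures should not be a problem. What am I doing wrong? Should I write my own __deepcopy__ method to work around this? If so, how?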

from copy import deepcopy


class Node:

    def __init__(self, p_id):
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_edge(self, p_node, p_data):
        if p_node not in self.edge_dict:
            self.edge_dict[p_node] = p_data
            self.degree += 1
            return True
        else:
            return False

if __name__ == '__main__':
    node1 = Node(1)
    node2 = Node(2)
    node1.add_edge(node2, "1->2")
    node2.add_edge(node1, "2->1")
    node1_copy = deepcopy(node1)

File ".../node_test.py", line 15, in __hash__
    return hash(self.id)
AttributeError: 'Node' object has no attribute 'id'

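Cyclic dependencies are a problem for deepcopy when you:

  1. Have classes that must be hashed and that contain reference cycles, and
  2. Don't ensure that hash-related (and equality-related) invariants are established at object construction, not just at initialization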

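The problem is that unpickling an object creates an empty object without initializing it, and then fills in its attributes one by one. By default, deepcopy copies custom objects with the same machinery that pickling and unpickling use, unless a special __deepcopy__ method is defined. While it is filling in the attributes of node1, it has to copy node2, which in turn refers back to the partially created node1 (in both cases through edge_dict). At the point where it populates the edge_dict of one Node, the Node being inserted as a key does not have its id attribute set yet, so the attempt to hash it fails.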
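As a rough illustration of that failure mode, the sketch below builds an uninitialized instance by hand with `Node.__new__(Node)`, which approximates the "empty object" the default copy machinery starts from, and then uses it as a dictionary key. The trimmed-down `Node` class and the names `empty`, `failed`, and `message` are only for this demonstration.

```python
class Node:
    def __init__(self, p_id):
        self.id = p_id
        self.edge_dict = {}

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)


# Allocate an instance without running __init__, so it has no attributes yet.
# This approximates the state of a partially reconstructed Node.
empty = Node.__new__(Node)

try:
    # Using the half-built object as a dict key forces a call to __hash__,
    # which reads self.id before it exists.
    {empty: "2->1"}
    failed = False
    message = ""
except AttributeError as exc:
    failed = True
    message = str(exc)

print(failed, message)
```

The same thing happens inside deepcopy when the copy of node2's edge_dict is populated with the still-empty copy of node1.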

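You can correct this by using __new__ to ensure the invariants are established before the mutable, possibly recursive attributes are initialized, and by defining the pickle helper __getnewargs__ (or __getnewargs_ex__) so that copying and unpickling call __new__ with the right arguments. Specifically, change your class definition to: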

class Node:
    # __new__ instead of __init__ to establish necessary id invariant
    # You could use both __new__ and __init__, but that's usually more complicated
    # than you really need
    def __new__(cls, p_id):
        self = super().__new__(cls)  # Must explicitly create the new object
        # Aside from explicit construction and return, rest of __new__
        # is same as __init__
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0
        return self  # __new__ returns the new object

    def __getnewargs__(self):
        # Return the arguments that *must* be passed to __new__
        return (self.id,)

    # ... rest of class is unchanged ...

Note: If this is Python 2 code, make sure to explicitly inherit from object and change super() to super(Node, cls) in __new__; the code given is the simpler Python 3 code.
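To check the __new__ approach end to end, here is a minimal self-contained sketch that combines the modified class with the unchanged methods from the question and rebuilds the same two-node cycle. The variable `node2_copy` is introduced here only to inspect the result; this targets Python 3.

```python
from copy import deepcopy


class Node:
    def __new__(cls, p_id):
        # Establish the hash-related invariant (id) at construction time.
        self = super().__new__(cls)
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0
        return self

    def __getnewargs__(self):
        # Arguments that copy/pickle pass to __new__ when rebuilding.
        return (self.id,)

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_edge(self, p_node, p_data):
        if p_node not in self.edge_dict:
            self.edge_dict[p_node] = p_data
            self.degree += 1
            return True
        return False


node1 = Node(1)
node2 = Node(2)
node1.add_edge(node2, "1->2")
node2.add_edge(node1, "2->1")

# No AttributeError: every copied Node already has its id when it is hashed.
node1_copy = deepcopy(node1)
node2_copy = next(iter(node1_copy.edge_dict))
```

The copied graph keeps its cycle: the only key in `node2_copy.edge_dict` should be `node1_copy` itself, not the original `node1`.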

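An alternate solution that handles only copy.deepcopy, without supporting pickling or requiring the use of __new__/__getnewargs__ (which require new-style classes), is to override deep copying alone. You would define the following extra method on your original class (and make sure the module imports copy), and otherwise leave it unchanged: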

def __deepcopy__(self, memo):
    # Deepcopy only the id attribute, then construct the new instance and map
    # the id() of the original instance to the new instance in the memo dictionary
    memo[id(self)] = newself = self.__class__(copy.deepcopy(self.id, memo))
    # Now that memo is populated with a hashable instance, copy the other attributes:
    newself.degree = copy.deepcopy(self.degree, memo)
    # Safe to deepcopy edge_dict now, because backreferences to self will
    # be remapped to newself automatically
    newself.edge_dict = copy.deepcopy(self.edge_dict, memo)
    return newself
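
For reference, a minimal self-contained sketch of this second approach is shown below: the original class from the question with the __deepcopy__ method added, followed by the same two-node cycle. The names `node2_copy` and the extra `Node(3)` are assumptions made only to show that the copy is independent of the original.

```python
import copy


class Node:
    def __init__(self, p_id):
        self.id = p_id
        self.edge_dict = {}
        self.degree = 0

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_edge(self, p_node, p_data):
        if p_node not in self.edge_dict:
            self.edge_dict[p_node] = p_data
            self.degree += 1
            return True
        return False

    def __deepcopy__(self, memo):
        # Register a fully hashable copy in memo before touching edge_dict,
        # so back-references to self resolve to newself.
        memo[id(self)] = newself = self.__class__(copy.deepcopy(self.id, memo))
        newself.degree = copy.deepcopy(self.degree, memo)
        newself.edge_dict = copy.deepcopy(self.edge_dict, memo)
        return newself


node1 = Node(1)
node2 = Node(2)
node1.add_edge(node2, "1->2")
node2.add_edge(node1, "2->1")

node1_copy = copy.deepcopy(node1)
node2_copy = next(iter(node1_copy.edge_dict))

# Mutating the copy should leave the original graph untouched.
node1_copy.add_edge(Node(3), "1->3")
```

After the last line, `node1_copy.degree` is 2 while `node1.degree` is still 1, which suggests the two graphs no longer share any state.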