How to clone / deepcopy a Python 3.x dict with internal references between its objects

I have the following problem. Suppose we have class A and class B:

class A:

    def clone(self):

        return self.__class__()

class B:

    def __init__(self, ref):

        self.ref = ref

    def clone(self):

        return self.__class__(
            ref = self.ref
        )

I also have a class called Holder that inherits from dict.

class Holder(dict):

    def clone(self):

        return self.__class__(
            {k: v.clone() for k, v in self.items()}
        )

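What I want now is to copy the whole dict (with the values already in it) using my clone() functions, in such a way that the references between the values are not broken.

Here is some code that illustrates the behavior I want: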

original = Holder()
original['a'] = A()
original['b'] = B(original['a'])  # here we create B object 
                                  # with reference to A object

assert original['a'] is original['b'].ref  # reference is working

copy = original.clone()  # we clone our dict

assert copy['a'] is copy['b'].ref  # reference is not working like I want
                                   # copy['b'].ref still points to old original['a']

assert original['a'] is not copy['a']
assert original['b'] is not copy['b']
assert original['b'].ref is not copy['b'].ref

Here is some background on the problem:

Let's say that I have a class called MyClass and a metaclass called MyClassMeta.

I want the __prepare__ function of MyClassMeta to return my own dict, which will be an instance of a class called Holder. During class creation I will store values of certain types in the internal dict of the Holder instance (similar to what EnumMeta does). Since the Holder instance is filled with values during class creation, all instances of MyClass will hold a reference to the same object.

What I want is a separate copy of my Holder per instance. I thought I could just copy/clone my object, but the problem came up when I added an object that references another object inside the same dict.
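To make the setup concrete, here is a minimal sketch of what such a metaclass might look like. The Field class and the fields / _fields attribute names are hypothetical, introduced only for illustration; the real code presumably filters on other types and stores them differently.

```python
import copy

class Field:
    """Hypothetical value type that the holder keeps track of."""
    def __init__(self, ref=None):
        self.ref = ref

class Holder(dict):
    """Class-body namespace that also records Field values separately."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.fields = {}  # hypothetical internal dict

    def __setitem__(self, key, value):
        # Every assignment in the class body passes through here.
        if isinstance(value, Field):
            self.fields[key] = value
        super().__setitem__(key, value)

class MyClassMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        # The returned mapping is used as the class-body namespace.
        return Holder()

    def __new__(mcs, name, bases, namespace, **kwargs):
        cls = super().__new__(mcs, name, bases, dict(namespace), **kwargs)
        cls._fields = namespace.fields  # shared by all instances
        return cls

class MyClass(metaclass=MyClassMeta):
    a = Field()
    b = Field(ref=a)  # refers to another value in the same namespace

    def __init__(self):
        # Per-instance copy; deepcopy keeps b.ref pointing at the copied a.
        self.fields = copy.deepcopy(type(self)._fields)

x, y = MyClass(), MyClass()
```

Here x.fields and y.fields are independent of each other and of MyClass._fields, while the a/b relationship inside each copy is preserved.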

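The correct way to clone custom data structures in Python is to implement the __deepcopy__ special method. This is what the copy.deepcopy function calls.

As stated in the documentation: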

Two problems often exist with deep copy operations that don’t exist with shallow copy operations:

  • Recursive objects (compound objects that, directly or indirectly, contain a reference to themselves) may cause a recursive loop.
  • Because deep copy copies everything it may copy too much, such as data which is intended to be shared between copies. [This is the problem you are facing]

The deepcopy() function avoids these problems by:

  • keeping a “memo” dictionary of objects already copied during the current copying pass; and
  • letting user-defined classes override the copying operation or the set of components copied.
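The effect of the memo can be seen even without any custom hooks. The following is a small sketch using a hypothetical Node class (not part of the question) that defines no __deepcopy__ at all; copy.deepcopy still copies the shared object only once and handles a self-referencing list.

```python
import copy

class Node:
    """Hypothetical plain class with no custom copy hooks."""
    def __init__(self, ref=None):
        self.ref = ref

shared = Node()
data = {"a": shared, "b": Node(ref=shared)}

cp = copy.deepcopy(data)

# The shared object is copied once; the second visit is served from the memo.
assert cp["a"] is cp["b"].ref
assert cp["a"] is not shared

# A list that contains itself is copied without infinite recursion.
loop = []
loop.append(loop)
loop_cp = copy.deepcopy(loop)
assert loop_cp[0] is loop_cp
assert loop_cp is not loop
```

So for simple classes like these, the default deepcopy behavior may already be enough; a custom __deepcopy__ is mainly needed when the copy has to be built in a special way.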

Code

import copy

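class A:
    def __deepcopy__(self, memo):
        # A has no state, so a fresh instance is a complete copy.
        return self.__class__()

class B:
    def __init__(self, ref):
        self.ref = ref

    def __deepcopy__(self, memo):
        # Pass memo along: if self.ref was already copied during this
        # pass, copy.deepcopy returns that same copy instead of a new one.
        return self.__class__(
            ref=copy.deepcopy(self.ref, memo)
        )

class Holder(dict):
    def __deepcopy__(self, memo):
        # All values share one memo, so references between them survive.
        return self.__class__(
            {k: copy.deepcopy(v, memo) for k, v in self.items()}
        )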

Test

import copy

original = Holder()
original['a'] = A()
original['b'] = B(original['a'])  # here we create B object
                                  # with reference to A object

assert original['a'] is original['b'].ref  # reference is working

cp = copy.deepcopy(original)  # we clone our dict

assert cp['a'] is cp['b'].ref  # reference is still working

assert original['a'] is not cp['a']
assert original['b'] is not cp['b']
assert original['b'].ref is not cp['b'].ref
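One caveat: copy.deepcopy only records a copy in the memo after __deepcopy__ returns, so as far as I can tell the classes above would recurse without end if two objects referenced each other. A possible variant, sketched here under the assumption that B can be created without arguments, registers the new object in memo before copying its children:

```python
import copy

class B:
    def __init__(self, ref=None):  # default added for this sketch
        self.ref = ref

    def __deepcopy__(self, memo):
        # Create the empty copy and register it BEFORE recursing,
        # so a cycle leading back to self finds it in the memo.
        new = self.__class__.__new__(self.__class__)
        memo[id(self)] = new
        new.ref = copy.deepcopy(self.ref, memo)
        return new

class Holder(dict):
    def __deepcopy__(self, memo):
        new = self.__class__()
        memo[id(self)] = new
        for k, v in self.items():
            new[k] = copy.deepcopy(v, memo)
        return new

original = Holder()
original['x'] = B()
original['y'] = B(original['x'])
original['x'].ref = original['y']  # cycle: x -> y -> x

cp = copy.deepcopy(original)

assert cp['x'].ref is cp['y']
assert cp['y'].ref is cp['x']
assert cp['x'] is not original['x']
```

If the objects never form cycles, the simpler version shown earlier is sufficient.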