Difference between copy.copy and dataclasses.replace

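As far as I understand, dataclasses.replace(x) does the same thing as copy.copy(x), except that it only works if x is a dataclass, and it offers the ability to replace members. However, I also noticed that it is about 3 times faster. I am now curious why copy is so much slower, and whether there are any other differences between the two functions that I need to consider.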

import dataclasses
import time
import copy

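@dataclasses.dataclass()
class X:
    # No type annotations, so these are plain class attributes rather than
    # dataclass fields; instances of X carry no per-instance state.
    x = 1
    y = 1
    z = 1

x = X()

# Time 100,000 copies made with dataclasses.replace.
start = time.perf_counter()
for _ in range(100000):
    a = dataclasses.replace(x)
t1 = time.perf_counter() - start

# Time 100,000 copies made with copy.copy.
start = time.perf_counter()
for _ in range(100000):
    a = copy.copy(x)
t2 = time.perf_counter() - start

print(t1)  # 0.4
print(t2)  # 1.2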

As Chris_Rands alluded to in their comment, copy.copy has quite a bit of extra logic to handle arbitrary Python objects, and this extra logic likely accounts for the difference in speed. In contrast, dataclasses.replace can get away with only a few checks, since it only needs to work for dataclasses. You can see how much simpler dataclasses.replace is than copy.copy (and the functions it calls) in the source code for dataclasses.py and copy.py.
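On the question of other differences, one that seems worth knowing is that dataclasses.replace builds the new object by calling the class constructor, while copy.copy bypasses `__init__` and just copies the instance state. The sketch below illustrates this with a hypothetical `Point` dataclass and a `post_init_calls` list that exist only for this example; they are not part of the original code.

```python
import copy
import dataclasses

# Illustration only: records every time __post_init__ runs.
post_init_calls = []


@dataclasses.dataclass
class Point:  # hypothetical class, not from the question
    x: int = 1
    y: int = 1

    def __post_init__(self):
        post_init_calls.append((self.x, self.y))


p = Point(x=5)                    # constructor runs: first __post_init__ call
c = copy.copy(p)                  # no constructor call, state is copied as-is
r = dataclasses.replace(p, y=7)   # calls Point(x=5, y=7): second call

print(post_init_calls)        # [(5, 1), (5, 7)]
print(c == p, c is not p)     # True True
print(r)                      # Point(x=5, y=7)
```

So if `__init__` or `__post_init__` validates or derives values, replace re-runs that logic and copy.copy does not. Relatedly, replace raises ValueError if you try to change a field declared with init=False.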

If you look at the source code for copy.copy, you'll notice that the code that copies an X object boils down to this:

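def fastcopy(x):
    # __reduce_ex__(4) returns (callable, args, state, ...).
    red = x.__reduce_ex__(4)
    # Rebuild the object from callable(*args); the state (red[2]) is ignored,
    # which is fine here because X instances have no instance attributes.
    return red[0](*red[1])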
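To make the shortcut less opaque, here is a small sketch of what `__reduce_ex__(4)` returns and why ignoring the state only works for a class like X. The classes `Unannotated` and `Annotated` are made up for this illustration; the first mirrors X (plain class attributes), the second uses a real annotated field.

```python
import copy
import dataclasses


def fastcopy(x):
    red = x.__reduce_ex__(4)
    return red[0](*red[1])  # ignores red[2], the instance state


@dataclasses.dataclass
class Unannotated:  # hypothetical, mirrors X: no annotations, so no fields
    x = 1


@dataclasses.dataclass
class Annotated:  # hypothetical, with a real dataclass field
    x: int = 1


u = Unannotated()
a = Annotated(x=42)

print(dataclasses.fields(Unannotated))  # ()
print(u.__reduce_ex__(4)[2])            # None
print(a.__reduce_ex__(4)[2])            # {'x': 42}

# fastcopy drops the state, so the copy falls back to the class default.
print(vars(fastcopy(a)))   # {}
print(fastcopy(a).x)       # 1
print(copy.copy(a).x)      # 42
```

In other words, the shortcut is a fair stand-in for copy.copy in this benchmark, but it is not a general replacement once instances carry their own attributes.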

Without the extra checks, this fastcopy function appears to perform about the same as dataclasses.replace. Below is the complete code I tested, along with the timings I got.

import dataclasses
import time
import copy

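def fastcopy(x):
    # Same reduction step copy.copy ends up using, minus its extra checks.
    red = x.__reduce_ex__(4)
    return red[0](*red[1])

@dataclasses.dataclass()
class X:
    # Unannotated, so these are class attributes, not dataclass fields.
    x = 1
    y = 1
    z = 1

x = X()

# Time 100,000 copies made with dataclasses.replace.
start = time.perf_counter()
for _ in range(100000):
    a = dataclasses.replace(x)
t1 = time.perf_counter() - start

# Time 100,000 copies made with fastcopy.
start = time.perf_counter()
for _ in range(100000):
    a = fastcopy(x)
t2 = time.perf_counter() - start

print(t1)  # 0.1
print(t2)  # 0.1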
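If you want to reproduce the comparison with less noise, a sketch using the standard timeit module might look like the following. The `candidates` dictionary and the choice of 20,000 calls repeated 5 times are arbitrary assumptions for this example, and the numbers will depend on your machine and Python version, so no expected timings are shown.

```python
import copy
import dataclasses
import timeit


@dataclasses.dataclass
class X:
    # Unannotated class attributes, as in the test above.
    x = 1
    y = 1
    z = 1


def fastcopy(obj):
    red = obj.__reduce_ex__(4)
    return red[0](*red[1])


x = X()

# Hypothetical mapping of label -> zero-argument callable to benchmark.
candidates = {
    "dataclasses.replace": lambda: dataclasses.replace(x),
    "copy.copy": lambda: copy.copy(x),
    "fastcopy": lambda: fastcopy(x),
}

results = {}
for name, func in candidates.items():
    # Five runs of 20,000 calls each; keep the fastest run as the estimate.
    results[name] = min(timeit.repeat(func, number=20_000, repeat=5))

for name, seconds in results.items():
    print(f"{name:>20}: {seconds:.4f} s per 20,000 calls")
```

Taking the minimum over several repeats reduces the influence of background load compared with a single perf_counter loop.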