Why is islice(permutations) 100 times faster if I keep a reference to the underlying iterator?

Iterating islice(permutations(a), n) is 100 times faster if I merely keep an extra reference to the permutations iterator. Alternating between with and without the extra reference:

  2.1 ms  with
202.2 ms  without
  2.1 ms  with
195.8 ms  without
  2.1 ms  with
192.4 ms  without

What's going on?

Full code (Try it online!):

from timeit import timeit
from itertools import permutations, islice
from collections import deque

a = range(10 ** 7)
n = 10 ** 5

for label in ['with', 'without'] * 3:
    if label == 'with':
        # keep an extra reference to the permutations iterator in foo
        perms = islice((foo := permutations(a)), n)
    else:
        perms = islice(permutations(a), n)
    next(perms)
    # time how long it takes to consume the rest of the islice
    t = timeit(lambda: deque(perms, 0), number=1)
    print('%5.1f ms ' % (t * 1e3), label)

I just realized why. If I don't keep a reference, then the iterator and everything it holds on to get garbage-collected at the end of the timing, and that is included in the measured time.

Note that the sequence I build permutations of is very large, so each permutation is very large too. The permutations iterator therefore holds a large result tuple and large internal-state data structures, and there are also millions of integer objects from the range. All of that has to be cleaned up.
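
For a rough sense of scale: permutations materializes its input up front (the docs' roughly-equivalent pure-Python code starts with pool = tuple(iterable)), so a quick sketch of what that alone amounts to on a 64-bit CPython build:

import sys

a = range(10 ** 7)
pool = tuple(a)                 # what permutations(a) builds from its input
print(sys.getsizeof(pool))      # ~80 MB for the tuple's pointers alone
print(sys.getsizeof(pool[-1]))  # plus ~28 bytes per newly created int object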

When I halve the size of a to a = range(10 ** 7 // 2), the "without" times drop to roughly half (about 100 ms).

When I double the size of a to a = range(10 ** 7 * 2), the "without" times roughly double (over 400 ms).

Neither change affects the "with" times (always around 2 ms).
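
Here is a minimal sketch (not part of the measurements above) to check where the time goes: it keeps an extra reference, times the iteration, and then separately times dropping the last references, which is when the iterator's big state gets deallocated.

from timeit import timeit
from itertools import permutations, islice
from collections import deque

a = range(10 ** 7)
n = 10 ** 5

ref = permutations(a)          # extra reference defers the cleanup
perms = islice(ref, n)
next(perms)
print('iterate: %5.1f ms' % (timeit(lambda: deque(perms, 0), number=1) * 1e3))

def drop_references():
    # dropping the last references frees the result tuple, the internal
    # state and the millions of int objects held by the iterator
    global ref, perms
    ref = perms = None

print('cleanup: %5.1f ms' % (timeit(drop_references, number=1) * 1e3))

If the explanation is right, the "iterate" line should look like the fast "with" times and the "cleanup" line should account for most of the slow "without" times.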


In case anyone wonders why I'm building permutations of such a large sequence: I was looking into the time it takes permutations to provide all n! permutations of n elements. One might think it needs O(n × n!), since that's the overall result size. But it reuses and modifies its result tuple if it can, so it doesn't build each permutation from scratch but only needs to update it a little. So I tested with a large n in order to see a large speed difference between when it can and can't reuse its result tuple. It is indeed much faster when it can, and seems to take only O(n!) time to provide all permutations. On average it appears to change just 2.63 elements from one permutation to the next.
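
As a small sanity check of that last figure (again not part of the measurements above), one can count for a modest n how many positions actually differ between consecutive permutations; the average should come out close to the ~2.63 mentioned above:

from itertools import permutations

n = 8
prev = None
changed = pairs = 0
for p in permutations(range(n)):
    if prev is not None:
        changed += sum(x != y for x, y in zip(prev, p))
        pairs += 1
    prev = p
print(changed / pairs)   # average number of changed positions per step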