itertools 与用于展平和重复的生成器表达式 - 我们如何解释这些计时结果?

itertools vs. generator expressions for flattening and repetition - how can we explain these timing results?

上讨论的启发,我决定尝试一些性能测试。我设置的任务稍微简单一些——给定一个源列表 A,我们希望创建一个惰性可迭代对象,它重复 A 的每个元素 N 次:

def test(implementation):
    A, N = list('abc'), 3
    assert list(implementation(A, N)) == list('aaabbbccc')

我想出了几个实现,并测试了它们:

from itertools import chain, repeat, starmap
from timeit import timeit

flatten = chain.from_iterable

def consume(iterable):
    for _ in iterable:
        pass

# FAST approaches
def tools(original, count):
    return flatten(map(repeat, original, repeat(count)))

def tools_star(original, count):
    return flatten(starmap(repeat, zip(original, repeat(count))))

def mixed(original, count):
    return flatten(repeat(a, count) for a in original)

# SLOW approaches
def mixed2(original, count):
    return (x for a in original for x in repeat(a, count))

def explicit(original, count):
    for a in original:
        for _ in range(count):
            yield a

def generator(original, count):
    return (a for a in original for _ in range(count))

def mixed3(original, count):
    return flatten((a for _ in range(count)) for a in original)

if __name__ == '__main__':
    for impl in (tools, tools_star, mixed, mixed2, explicit, generator, mixed3):
        for consumption in (consume, list):
            to_time = lambda: consumption(impl(list(range(1000)), 1000))
            elapsed = timeit(to_time, number=100)
            print(f'{consumption.__name__}({impl.__name__}): {elapsed:.2f}')

以下是我机器上的三个计时结果示例:

consume(tools): 1.10
list(tools): 2.96
consume(tools_star): 1.10
list(tools_star): 2.97
consume(mixed): 1.11
list(mixed): 2.91
consume(mixed2): 4.60
list(mixed2): 6.53
consume(explicit): 5.45
list(explicit): 8.09
consume(generator): 5.98
list(generator): 7.62
consume(mixed3): 5.75
list(mixed3): 7.67

consume(tools): 1.10
list(tools): 2.88
consume(tools_star): 1.10
list(tools_star): 2.89
consume(mixed): 1.11
list(mixed): 2.87
consume(mixed2): 4.56
list(mixed2): 6.39
consume(explicit): 5.42
list(explicit): 7.24
consume(generator): 5.91
list(generator): 7.48
consume(mixed3): 5.80
list(mixed3): 7.61

consume(tools): 1.14
list(tools): 2.98
consume(tools_star): 1.10
list(tools_star): 2.90
consume(mixed): 1.11
list(mixed): 2.92
consume(mixed2): 4.76
list(mixed2): 6.49
consume(explicit): 5.69
list(explicit): 7.38
consume(generator): 5.68
list(generator): 7.52
consume(mixed3): 5.75
list(mixed3): 7.86

由此我得出以下结论:

itertools 背后到底发生了什么?我们如何解释这些时序结果?尤其奇怪的是 chain.from_iterablerepeat 在这里不提供增量性能优势,而是完全相互依赖。列表构建是怎么回事?在每种情况下增加的开销不是相同的(重复将相同的元素序列附加到空列表)吗?

主要 归结为花在解释器上的时间的大 O。

  • 解释器中没有循环允许 C 函数直接通信。
    • 但嵌套如此多的 itertools 会增加少量但可衡量的开销。
      • I在table下方。
  • 一个循环只是几个操作码的×1000。
  • 嵌套循环高达 1000000 倍。
  • 直接从 repeat 生成比在生成之前从 range 存储少几个操作码。
    • Y在table下方。
  • explicitgenerator 实际上是等价的。
  • 嵌套生成器是一种嵌套函数调用——代价高昂。
    • G在table下方。

这是我的结果:

method complexity (interpreter only) list consume
tools O(0) + I 0.21 0.09
tools_star O(0) + I 0.21 0.09
mixed O(N) 0.20 0.09
mixed2 O(N²) 0.54 0.47
explicit O(N²) + Y 0.64 0.60
generator O(N²) + Y 0.64 0.60
tools O(N²) + G 0.71 0.65

看起来很有说服力。

另外,我对代码做了一些修改,以改进计时方法和可读性。

PRODUCERS = (tools, tools_star, mixed, mixed2, explicit, generator, mixed3)
CONSUMERS = (list, consume)
N = 1000
SAMPLES = 50
BATCH = 10

if __name__ == '__main__':
    for consumer in CONSUMERS:
        print(f"{consumer.__name__}:")
        for producer in PRODUCERS:
            to_time = lambda: consumer(producer(list(range(N)), N))
            elapsed = timeit.repeat(to_time, repeat=SAMPLES, number=BATCH)
            emin = min(elapsed) # get rid of the random fluctuations for the best theoretical time
            print(f'{emin:02.2f} | {producer.__name__}')
        print()