"Tuple comprehensions" 和星号 splat/unpack 运算符 *
"Tuple comprehensions" and the star splat/unpack operator *
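I just read the question "Why is there no tuple comprehension in Python?"
In the comments of the accepted answer, it is stated that there are no true "tuple comprehensions". Instead, our current option is to use a generator expression and pass the resulting generator object to the tuple constructor: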
tuple(thing for thing in things)
Alternatively, we can build a list with a list comprehension and then pass that list to the tuple constructor:
tuple([thing for thing in things])
Finally, and contrary to the accepted answer, a more recent answer stated that tuple comprehensions are indeed a thing (since Python 3.5), using the following syntax:
*(thing for thing in things),
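- To me, it seems like the second example is also one where a generator object is created first. Is this correct?
- Is there any difference between these expressions in terms of what goes on behind the scenes? In terms of performance? I assume the first and third could have latency issues while the second could have memory issues (as is discussed in the linked comments).
- Comparing the first one and the last, which one is more pythonic?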
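Update:
As expected, the list comprehension is indeed much faster. I don't understand why the first one is faster than the third one, though. Any thoughts?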
>>> from timeit import timeit
>>> a = 'tuple(i for i in range(10000))'
>>> b = 'tuple([i for i in range(10000)])'
>>> c = '*(i for i in range(10000)),'
>>> print('A:', timeit(a, number=1000000))
>>> print('B:', timeit(b, number=1000000))
>>> print('C:', timeit(c, number=1000000))
A: 438.98362647295824
B: 271.7554752581845
C: 455.59842588083677
To me, it seems like the second example is also one where a generator
object is created first. Is this correct?
Yes, you're correct. Check the CPython bytecode:
>>> import dis
>>> dis.dis("*(thing for thing in thing),")
  1           0 LOAD_CONST               0 (<code object <genexpr> at 0x7f56e9347ed0, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<genexpr>')
              4 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (thing)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 BUILD_TUPLE_UNPACK       1
             14 POP_TOP
             16 LOAD_CONST               2 (None)
             18 RETURN_VALUE
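Opcode names such as BUILD_TUPLE_UNPACK differ between CPython versions, so a version-independent way to check the same thing is to look at the code objects nested inside the compiled expression. The following is only a sketch: the helper name `nested_code_names` and the file name `"<sketch>"` are made up for illustration, and it relies on the CPython detail that a generator expression is compiled to a nested code object named `<genexpr>`.

```python
import types


def nested_code_names(source):
    """Return the names of code objects nested directly in the compiled source."""
    # "exec" mode is needed because the starred form is only valid as a statement.
    code = compile(source, "<sketch>", "exec")
    return [c.co_name for c in code.co_consts if isinstance(c, types.CodeType)]


star_form = nested_code_names("*(thing for thing in things),")
genexpr_form = nested_code_names("tuple(thing for thing in things)")
listcomp_form = nested_code_names("tuple([thing for thing in things])")

# Both the starred form and tuple(<genexpr>) carry a generator code object.
print("star form:    ", star_form)
print("genexpr form: ", genexpr_form)
# The list comprehension never produces a '<genexpr>' code object; depending on
# the CPython version it shows '<listcomp>' or nothing (inlined comprehension).
print("listcomp form:", listcomp_form)
```

Nothing is executed here, only compiled, so `things` does not need to exist.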
Is there any difference between these expressions in terms of what
goes on behind the scenes? In terms of performance? I assume the first
and third could have latency issues while the second could have memory
issues (as is discussed in the linked comments).
My timings suggest that the first one is slightly faster, presumably because unpacking via BUILD_TUPLE_UNPACK is more expensive than the tuple() call:
>>> from timeit import timeit
>>> def f1(): tuple(thing for thing in range(100000))
...
>>> def f2(): *(thing for thing in range(100000)),
...
>>> timeit(lambda: f1(), number=100)
0.5535585517063737
>>> timeit(lambda: f2(), number=100)
0.6043887557461858
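For anyone who wants to repeat the measurement with all three spellings side by side, here is a small sketch. The function names, the size `N`, and the repeat counts are arbitrary choices for illustration, not taken from the question; it takes the minimum over several runs to reduce noise, and the absolute numbers will depend on the machine and the Python version.

```python
from timeit import repeat

N = 1000  # assumed size, much smaller than in the timings above


def via_genexpr():
    return tuple(i for i in range(N))


def via_listcomp():
    return tuple([i for i in range(N)])


def via_star():
    # Parenthesized so that it also works in a return statement on Python 3.5+.
    return (*(i for i in range(N)),)


# All three spellings build the same tuple.
assert via_genexpr() == via_listcomp() == via_star()

for func in (via_genexpr, via_listcomp, via_star):
    # Best of 5 runs, each calling the function 200 times.
    best = min(repeat(func, number=200, repeat=5))
    print(f"{func.__name__}: {best:.4f} s")
```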
Comparing the first one and the last, which one is more pythonic?
The first one seems more readable to me, and it also works across different Python versions.