List comprehensions and generators: avoiding computing the same value twice when using conditional expressions
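Suppose you have some expensive, CPU-intensive function, for example one that parses an XML string. In this case, our simple stand-in function will be:
def parse(foo):
    return int(foo)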
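As input you have a list of strings, and you want to parse them and find the subset of parsed values that meet some condition. Ideally, we want to parse each string only once.
Without a list comprehension, you could do:
olds = ["1", "2", "3", "4", "5"]
news = []
for old in olds:
    new = parse(old)  # First and only parse
    if new > 3:
        news.append(new)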
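To do this as a list comprehension, it seems that you have to perform the parse twice, once to get the new value and once to perform the conditional check:
olds = ["1", "2", "3", "4", "5"]
news = [
    parse(new)  # First parse
    for new in olds
    if parse(new) > 3  # Second parse
]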
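To make the duplicated work visible, here is a minimal sketch that wraps the stand-in parser in a call counter. The names `counting_parse` and `calls` are illustrative only and do not appear in the original post.

```python
calls = 0  # hypothetical counter, only here to observe the behavior


def counting_parse(foo):
    """Stand-in for an expensive parser; counts how often it is called."""
    global calls
    calls += 1
    return int(foo)


olds = ["1", "2", "3", "4", "5"]
news = [counting_parse(old) for old in olds if counting_parse(old) > 3]

print(news)   # [4, 5]
print(calls)  # 7: five calls for the condition, two more for the kept values
```

The condition runs for every input, and the output expression runs again for each item that passes, so the extra cost grows with the number of matches.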
For example, this syntax will not work:
olds = ["1", "2", "3", "4", "5"]
# Raises SyntaxError: can't assign to function call
news = [i for parse(i) in olds if i > 5]
Using a generator seems to work:
def parse(strings):
    for string in strings:
        yield int(string)

olds = ["1", "2", "3", "4", "5"]
news = [i for i in parse(olds) if i > 3]
But you could also put the conditional inside the generator:
def parse(strings):
    for string in strings:
        val = int(string)
        if val > 3:
            yield val

olds = ["1", "2", "3", "4", "5"]
news = [i for i in parse(olds)]
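What I would like to know is, in terms of optimization (not reusability, etc.), which is better: the version where parsing happens in the generator but the conditional check happens in the list comprehension, or the version where both parsing and the conditional check happen in the generator? And is there an alternative that is better than either of these?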
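One way to approach the question empirically is to time both variants. The following is a rough sketch using `timeit`; the function names, the input size, and the repeat count are arbitrary choices made for illustration, and the numbers will vary by machine and Python version.

```python
import timeit


def parse_only(strings):
    # Generator that only parses.
    for string in strings:
        yield int(string)


def parse_and_filter(strings):
    # Generator that parses and applies the condition.
    for string in strings:
        val = int(string)
        if val > 3:
            yield val


def filter_in_comprehension(strings):
    return [i for i in parse_only(strings) if i > 3]


def filter_in_generator(strings):
    return [i for i in parse_and_filter(strings)]


# Hypothetical input: 10,000 short numeric strings.
data = [str(n) for n in range(10)] * 1000

# Both variants must produce the same result before timing means anything.
assert filter_in_comprehension(data) == filter_in_generator(data)

for func in (filter_in_comprehension, filter_in_generator):
    seconds = timeit.timeit(lambda: func(data), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```

With a genuinely expensive parser, the parse call itself is likely to dominate, so any difference between the two placements of the condition may be small by comparison.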
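Here is some output from dis.dis in Python 3.6.5. Note that with my version of Python, in order to disassemble the list comprehension itself we have to use f.__code__.co_consts[1], because the comprehension is compiled into its own code object stored among the function's constants.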
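The index `1` is a detail of the interpreter version used in the post. A slightly more defensive sketch is to scan `co_consts` for code objects instead of hard-coding the position. The helper name `nested_code_objects` is hypothetical. The example uses a generator expression because, as far as I know, Python 3.12 and later inline list comprehensions into the enclosing function, so a list comprehension may no longer have a separate code object there.

```python
import dis
import types


def nested_code_objects(func):
    """Return the code objects stored among func's constants."""
    return [c for c in func.__code__.co_consts if isinstance(c, types.CodeType)]


def main(strings):
    # The generator expression is compiled into its own code object.
    return list(int(s) for s in strings if s)


for code in nested_code_objects(main):
    print(code.co_name)
    dis.dis(code)
```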
Generator does the parsing, list comprehension does the conditional check
import dis

def parse(strings):
    for string in strings:
        yield int(string)

def main(strings):
    return [i for i in parse(strings) if i > 3]

assert main(["1", "2", "3", "4", "5"]) == [4, 5]

dis.dis(main.__code__.co_consts[1])
"""
  2           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (i)
              8 LOAD_FAST                1 (i)
             10 LOAD_CONST               0 (3)
             12 COMPARE_OP               4 (>)
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (i)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE
"""

dis.dis(parse)
"""
  2           0 SETUP_LOOP              22 (to 24)
              2 LOAD_FAST                0 (strings)
              4 GET_ITER
        >>    6 FOR_ITER                14 (to 22)
              8 STORE_FAST               1 (string)

  3          10 LOAD_GLOBAL              0 (int)
             12 LOAD_FAST                1 (string)
             14 CALL_FUNCTION            1
             16 YIELD_VALUE
             18 POP_TOP
             20 JUMP_ABSOLUTE            6
        >>   22 POP_BLOCK
        >>   24 LOAD_CONST               0 (None)
             26 RETURN_VALUE
"""
Generator does both the parsing and the conditional check
import dis

def parse(strings):
    for string in strings:
        val = int(string)
        if val > 3:
            yield val

def main(strings):
    return [i for i in parse(strings)]

assert main(["1", "2", "3", "4", "5"]) == [4, 5]

dis.dis(main.__code__.co_consts[1])
"""
  2           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 8 (to 14)
              6 STORE_FAST               1 (i)
              8 LOAD_FAST                1 (i)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            4
        >>   14 RETURN_VALUE
"""

dis.dis(parse)
"""
  2           0 SETUP_LOOP              34 (to 36)
              2 LOAD_FAST                0 (strings)
              4 GET_ITER
        >>    6 FOR_ITER                26 (to 34)
              8 STORE_FAST               1 (string)

  3          10 LOAD_GLOBAL              0 (int)
             12 LOAD_FAST                1 (string)
             14 CALL_FUNCTION            1
             16 STORE_FAST               2 (val)

  4          18 LOAD_FAST                2 (val)
             20 LOAD_CONST               1 (3)
             22 COMPARE_OP               4 (>)
             24 POP_JUMP_IF_FALSE        6

  5          26 LOAD_FAST                2 (val)
             28 YIELD_VALUE
             30 POP_TOP
             32 JUMP_ABSOLUTE            6
        >>   34 POP_BLOCK
        >>   36 LOAD_CONST               0 (None)
             38 RETURN_VALUE
"""
Naive tight loop
import dis

def parse(string):
    return int(string)

def main(strings):
    values = []
    for string in strings:
        value = parse(string)
        if value > 3:
            values.append(value)
    return values

assert main(["1", "2", "3", "4", "5"]) == [4, 5]

dis.dis(main)
"""
  2           0 BUILD_LIST               0
              2 STORE_FAST               1 (values)

  3           4 SETUP_LOOP              38 (to 44)
              6 LOAD_FAST                0 (strings)
              8 GET_ITER
        >>   10 FOR_ITER                30 (to 42)
             12 STORE_FAST               2 (string)

  4          14 LOAD_GLOBAL              0 (parse)
             16 LOAD_FAST                2 (string)
             18 CALL_FUNCTION            1
             20 STORE_FAST               3 (value)

  5          22 LOAD_FAST                3 (value)
             24 LOAD_CONST               1 (3)
             26 COMPARE_OP               4 (>)
             28 POP_JUMP_IF_FALSE       10

  6          30 LOAD_FAST                1 (values)
             32 LOAD_ATTR                1 (append)
             34 LOAD_FAST                3 (value)
             36 CALL_FUNCTION            1
             38 POP_TOP
             40 JUMP_ABSOLUTE           10
        >>   42 POP_BLOCK

  7     >>   44 LOAD_FAST                1 (values)
             46 RETURN_VALUE
"""

dis.dis(parse)
"""
  2           0 LOAD_GLOBAL              0 (int)
              2 LOAD_FAST                0 (string)
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
"""
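Notice how the first two disassemblies, the ones using a list comprehension together with a generator, show two for loops: one in main (in the list comprehension) and one in parse (in the generator). This isn't as bad as it sounds, is it? That is, the whole operation is O(n) and not O(n^2)?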
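The two loops are not nested in the O(n^2) sense: the generator is lazy, so the consumer pulls one value at a time and the two loops advance in lockstep. The sketch below records the order of events to illustrate this. The `events` list is a hypothetical addition, and the consumer is written as an explicit loop rather than a list comprehension only so that the check can be logged.

```python
events = []  # hypothetical trace of what happens, in order


def parse(strings):
    for string in strings:
        events.append(f"parse {string}")
        yield int(string)


def main(strings):
    result = []
    for value in parse(strings):
        events.append(f"check {value}")
        if value > 3:
            result.append(value)
    return result


news = main(["1", "2", "3", "4", "5"])

print(news)         # [4, 5]
print(events[:4])   # ['parse 1', 'check 1', 'parse 2', 'check 2']
print(len(events))  # 10: one parse and one check per input item
```

Each input is parsed once and checked once, so the total work is linear in the number of inputs.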
EDIT: Here is khelwood's solution:
import dis

def parse(string):
    return int(string)

def main(strings):
    return [val for val in (parse(string) for string in strings) if val > 3]

assert main(["1", "2", "3", "4", "5"]) == [4, 5]

dis.dis(main.__code__.co_consts[1])
"""
  2           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                16 (to 22)
              6 STORE_FAST               1 (val)
              8 LOAD_FAST                1 (val)
             10 LOAD_CONST               0 (3)
             12 COMPARE_OP               4 (>)
             14 POP_JUMP_IF_FALSE        4
             16 LOAD_FAST                1 (val)
             18 LIST_APPEND              2
             20 JUMP_ABSOLUTE            4
        >>   22 RETURN_VALUE
"""

dis.dis(parse)
"""
  2           0 LOAD_GLOBAL              0 (int)
              2 LOAD_FAST                0 (string)
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
"""
I think you can make this simpler than you expect:
olds = ["1", "2", "3", "4", "5"]
news = [new for new in (parse(old) for old in olds) if new > 3]
Or just:
news = [new for new in map(parse, olds) if new > 3]
Either way, parse is called only once per item.
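The claim is easy to check with a call counter. This is a small sketch; the `calls` counter and the variable names holding the results are illustrative additions, not part of the answer.

```python
calls = 0  # hypothetical counter used only for this check


def parse(foo):
    global calls
    calls += 1
    return int(foo)


olds = ["1", "2", "3", "4", "5"]

calls = 0
via_genexpr = [new for new in (parse(old) for old in olds) if new > 3]
genexpr_calls = calls

calls = 0
via_map = [new for new in map(parse, olds) if new > 3]
map_calls = calls

print(via_genexpr, genexpr_calls)  # [4, 5] 5
print(via_map, map_calls)          # [4, 5] 5
```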