Using a dictionary comprehension with an included string split operation

Consider this tiny properties-parser snippet:

testx="""var1 = foo
         var2 = bar"""

dd = { l.split('=')[0].strip():l.split('=')[1].strip() for l in testx.split('\n')} 
print(dd)
# {'var1': 'foo', 'var2': 'bar'}

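This works, but it is ugly because split is called twice in l.split('=')[0].strip():l.split('=')[1].strip(). How can the dictionary comprehension be changed so that the split happens only once, with each entry then built as: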

l[0].strip():l[1].strip()

Does that refactoring require a nested comprehension, or a different way of structuring a single-level comprehension?

Use re.findall:

import re
testx="""var1 = foo
         var2 = bar"""

dct = dict(re.findall(r'(\S+)\s*=\s*(\S+)', testx))
print(dct)
# {'var1': 'foo', 'var2': 'bar'}
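One caveat worth knowing: the (\S+) group for the value stops at the first whitespace character, so a value containing spaces gets truncated. Here is a minimal sketch of that behaviour, along with a line-anchored variant that captures the rest of the line. The sample input and the second pattern are my own illustration, not part of the answer above.

```python
import re

# Hypothetical input: the second value contains a space.
sample = "var1 = foo\nvar2 = hello world"

# Pattern from the answer: the value group stops at the first whitespace.
short = dict(re.findall(r'(\S+)\s*=\s*(\S+)', sample))
print(short)  # {'var1': 'foo', 'var2': 'hello'}

# Assumed variant: anchor on each line (re.MULTILINE) and capture the
# value up to the end of the line, minus surrounding spaces and tabs.
full = dict(re.findall(r'^[ \t]*(\S+)[ \t]*=[ \t]*(.*?)[ \t]*$', sample, flags=re.MULTILINE))
print(full)  # {'var1': 'foo', 'var2': 'hello world'}
```

For the two-line input in the question both patterns give the same result, so this only matters if the values can contain whitespace.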

If you are on Python >= 3.8, this is exactly why assignment expressions were added:

>>> {(parts:=l.split('='))[0].strip(): parts[1].strip() for l in testx.split("\n")}
{'var1': 'foo', 'var2': 'bar'}

Before that, you could do the following:

>>> {key.strip():value.strip() for l in testx.split('\n') for key, value in [l.split("=")]}
{'var1': 'foo', 'var2': 'bar'}

Honestly, I find that one more readable.
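To unpack what the extra clause does: for key, value in [l.split("=")] iterates over a one-element list, so its body runs exactly once per line and only serves to bind the two names. A rough sketch of the equivalent nested loops, using the same testx as in the question:

```python
testx = """var1 = foo
         var2 = bar"""

result = {}
for l in testx.split("\n"):
    # One-element list: this inner loop runs exactly once per line
    # and unpacks the two halves of the split into key and value.
    for key, value in [l.split("=")]:
        result[key.strip()] = value.strip()

print(result)  # {'var1': 'foo', 'var2': 'bar'}
```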

But frankly, these are all still hard for me to read. At the end of the day, I don't think you can beat:

>>> result = {}
>>> for l in testx.split("\n"):
...     key, value = l.split("=")
...     result[key.strip()] = value.strip()
...
>>> result
{'var1': 'foo', 'var2': 'bar'}
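If the input might contain blank lines or values that themselves contain '=', the loop form is also the easiest one to harden. A minimal sketch, assuming that lines without '=' should simply be skipped; the function name parse_properties is hypothetical:

```python
def parse_properties(text):
    """Parse 'key = value' lines into a dict, splitting each line once."""
    result = {}
    for line in text.split("\n"):
        if "=" not in line:
            continue  # assumption: ignore blank or malformed lines
        # maxsplit=1 keeps any further '=' characters inside the value
        key, value = line.split("=", 1)
        result[key.strip()] = value.strip()
    return result


testx = """var1 = foo
         var2 = bar"""
print(parse_properties(testx))  # {'var1': 'foo', 'var2': 'bar'}
```

Without the maxsplit argument, a line such as a = b=c would make the two-name unpacking raise ValueError.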

Edit

Note that the for <target list> in [<expression>] idiom was actually optimized in Python 3.9:

https://docs.python.org/3/whatsnew/3.9.html#optimizations

Optimized the idiom for assignment a temporary variable in comprehensions. Now for y in [expr] in comprehensions is as fast as a simple assignment y = expr. For example:

sums = [s for s in [0] for x in data for s in [s + x]]

Unlike the := operator this idiom does not leak a variable to the outer scope.
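The scoping difference mentioned in that last sentence is easy to check. A small sketch, requiring Python 3.8+ for the := operator; the variable names are made up for illustration:

```python
data = [1, 2, 3]

# Walrus form: last_walrus is bound in the scope that contains the comprehension.
doubled_a = [(last_walrus := x * 2) for x in data]

# for ... in [expr] form: last_idiom exists only inside the comprehension.
doubled_b = [last_idiom for x in data for last_idiom in [x * 2]]

print(last_walrus)  # 6, the name is still visible out here

try:
    last_idiom
except NameError:
    idiom_leaked = False
else:
    idiom_leaked = True
print(idiom_leaked)  # False
```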

Comparing the bytecode in Python 3.8 and Python 3.9, you can see that the Python 3.9 version has no nested iteration:

Python 3.8:

Python 3.8.1 (default, Jan  8 2020, 16:15:59)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis('{k:v for l in "a b|c d".split("|") for k,v in [l.split()]}')
  1           0 LOAD_CONST               0 (<code object <dictcomp> at 0x7fdbd6249d40, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<dictcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_CONST               2 ('a b|c d')
              8 LOAD_METHOD              0 (split)
             10 LOAD_CONST               3 ('|')
             12 CALL_METHOD              1
             14 GET_ITER
             16 CALL_FUNCTION            1
             18 RETURN_VALUE

Disassembly of <code object <dictcomp> at 0x7fdbd6249d40, file "<dis>", line 1>:
  1           0 BUILD_MAP                0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                30 (to 36)
              6 STORE_FAST               1 (l)
              8 LOAD_FAST                1 (l)
             10 LOAD_METHOD              0 (split)
             12 CALL_METHOD              0
             14 BUILD_TUPLE              1
             16 GET_ITER
        >>   18 FOR_ITER                14 (to 34)
             20 UNPACK_SEQUENCE          2
             22 STORE_FAST               2 (k)
             24 STORE_FAST               3 (v)
             26 LOAD_FAST                2 (k)
             28 LOAD_FAST                3 (v)
             30 MAP_ADD                  3
             32 JUMP_ABSOLUTE           18
        >>   34 JUMP_ABSOLUTE            4
        >>   36 RETURN_VALUE

Versus Python 3.9:

Python 3.9.0 | packaged by conda-forge | (default, Oct 14 2020, 22:56:29)
[Clang 10.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis('{k:v for l in "a b|c d".split("|") for k,v in [l.split()]}')
  1           0 LOAD_CONST               0 (<code object <dictcomp> at 0x7fb3587d1870, file "<dis>", line 1>)
              2 LOAD_CONST               1 ('<dictcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_CONST               2 ('a b|c d')
              8 LOAD_METHOD              0 (split)
             10 LOAD_CONST               3 ('|')
             12 CALL_METHOD              1
             14 GET_ITER
             16 CALL_FUNCTION            1
             18 RETURN_VALUE

Disassembly of <code object <dictcomp> at 0x7fb3587d1870, file "<dis>", line 1>:
  1           0 BUILD_MAP                0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                22 (to 28)
              6 STORE_FAST               1 (l)
              8 LOAD_FAST                1 (l)
             10 LOAD_METHOD              0 (split)
             12 CALL_METHOD              0
             14 UNPACK_SEQUENCE          2
             16 STORE_FAST               2 (k)
             18 STORE_FAST               3 (v)
             20 LOAD_FAST                2 (k)
             22 LOAD_FAST                3 (v)
             24 MAP_ADD                  2
             26 JUMP_ABSOLUTE            4
        >>   28 RETURN_VALUE
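If you would rather not compare the listings by eye, you can count the FOR_ITER instructions programmatically. This is only a rough sketch: the helper count_for_iter is hypothetical, and it walks nested code objects because, depending on the interpreter version, the comprehension may or may not be compiled into its own code object.

```python
import dis
import types

SRC = '{k: v for l in "a b|c d".split("|") for k, v in [l.split()]}'


def count_for_iter(code):
    """Count FOR_ITER instructions in a code object and its nested code objects."""
    total = sum(1 for ins in dis.get_instructions(code) if ins.opname == "FOR_ITER")
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            total += count_for_iter(const)
    return total


code = compile(SRC, "<dis>", "eval")
loops = count_for_iter(code)

# Going by the listings above: two FOR_ITER on 3.8 (nested iteration),
# one on 3.9 (the inner loop is compiled as a plain unpacking assignment).
print(loops)
print(eval(code))  # {'a': 'b', 'c': 'd'}
```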