List of dictionaries processing - readability and complexity
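I have a basic question about comprehensions.
There is a list of dictionaries in which some of the values are lists. It looks like this: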
listionary = [{'path': ['/tmp/folder/cat/number/letter', '/tmp/folder/hog/char/number/letter', '/tmp/folder/hog/number/letter', '/etc'],
'mask': True,
'name': 'dict-1'},
{'path': ['/tmp/folder/dog/number-2/letter-4', '/tmp/folder/hog-00/char/number-1/letter-5', '/tmp/folder/cow/number-2/letter-3'],
'mask': True,
'name': 'dict-2'},
{'path': ['/tmp/folder/dog_111/number/letter', '/tmp/folder/ant/char/number/letter', '/tmp/folder/hen/number/letter'],
'mask': True,
'name': 'dict-3'}]
What I need is to get every unique animal from the list-type values.
The animal is always between tmp/folder/ and the next /.
What I did:
import re
flat_list = [item for sublist in [i['path'] for i in listionary] for item in sublist]
animals = list(set([re.search('folder/([a-z]+)', elem).group(1) for elem in flat_list if 'tmp' in elem]))
It can also be squeezed into a single line, but then it gets complicated and hard to read:
animals = list(set([re.search('folder/([a-z]+)', elem).group(1) for elem in [item for sublist in [i['path'] for i in listionary] for item in sublist] if 'tmp' in elem]))
Is there a golden rule (for instance in the Zen of Python) about how large a comprehension should be?
How can I make this better? Thanks in advance.
You can try this:
listionary = [{'path': ['/tmp/folder/cat/number/letter', '/tmp/folder/hog/char/number/letter', '/tmp/folder/hog/number/letter', '/etc'],
'mask': True,
'name': 'dict-1'},
{'path': ['/tmp/folder/dog/number-2/letter-4', '/tmp/folder/hog-00/char/number-1/letter-5', '/tmp/folder/cow/number-2/letter-3'],
'mask': True,
'name': 'dict-2'},
{'path': ['/tmp/folder/dog_111/number/letter', '/tmp/folder/ant/char/number/letter', '/tmp/folder/hen/number/letter'],
'mask': True,
'name': 'dict-3'}]
import re
from itertools import chain
animals = list(set(chain.from_iterable([[re.findall("/tmp/folder/(.*?)/", b)[0] for b in i["path"] if re.findall("/tmp/folder/(.*?)/", b)] for i in listionary])))
Output (element order may vary, since it comes from a set):
['hog', 'hog-00', 'cow', 'dog_111', 'dog', 'cat', 'ant', 'hen']
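Note that this result contains hog-00 and dog_111, which the pattern in the question would have reduced to hog and dog. The sketch below compares the two patterns on two paths taken from the sample data. The names STRICT and LOOSE are only illustrative, and the question's pattern is anchored on /tmp/folder/ here for clarity.

```python
import re

paths = ['/tmp/folder/hog-00/char/number-1/letter-5',
         '/tmp/folder/dog_111/number/letter']

# Illustrative names, not part of the original answers.
STRICT = re.compile(r'/tmp/folder/([a-z]+)')  # lowercase letters only
LOOSE = re.compile(r'/tmp/folder/(.*?)/')     # everything up to the next slash

strict_names = [STRICT.search(p).group(1) for p in paths]
loose_names = [LOOSE.search(p).group(1) for p in paths]

print(strict_names)  # ['hog', 'dog']
print(loose_names)   # ['hog-00', 'dog_111']
```

Which of the two is right depends on whether hog-00 should count as a separate animal from hog, which the question does not spell out.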
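You can make it more readable by adding line breaks and indentation. I stopped at the item for sublist... line because I don't understand the code logic, but presumably you could add more line breaks there.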
animals = list(
    set([
        re.search('folder/([a-z]+)', elem).group(1) for elem in [
            item for sublist in [i['path'] for i in listionary] for item in sublist
        ]
        if 'tmp' in elem
    ])
)
That said, I think something like this is more readable:
import re

def animal_name_from_path(path):
    return re.search('folder/([a-z]+)', path).group(1)

def is_animal_path(path):
    return '/tmp' in path

def deduplicate(L):
    return list(set(L))

# Flatten every dictionary's 'path' list into one list of paths.
path_list = []
for item in listionary:
    path_list.extend(item['path'])

animals = deduplicate([animal_name_from_path(path) for path in path_list if is_animal_path(path)])
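A rule of thumb applied here is that every concept should have a name. In your original code, item for sublist in [i['path'] for i in listionary] for item in sublist is hard to understand because it is unclear what the names item and i are supposed to be. In this new block it is clearer that you are just flattening the list of paths. Once all the concepts are named, the code that extracts the animal name is easier to follow. I may have taken this to the extreme here; you can find whatever balance you consider most readable.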
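As one possible illustration of naming the flattening step, here is a minimal sketch that wraps it in a function. The helper name all_paths and the trimmed-down sample data are assumptions made for the example; itertools.chain.from_iterable is the standard library call for flattening one level of nesting.

```python
from itertools import chain

# Trimmed-down stand-in for the listionary from the question.
listionary = [
    {'path': ['/tmp/folder/cat/number/letter', '/etc'], 'name': 'dict-1'},
    {'path': ['/tmp/folder/dog/number-2/letter-4'], 'name': 'dict-2'},
]

def all_paths(records):
    """Flatten the 'path' lists of every record into one list (hypothetical helper)."""
    return list(chain.from_iterable(record['path'] for record in records))

path_list = all_paths(listionary)
print(path_list)
# ['/tmp/folder/cat/number/letter', '/etc', '/tmp/folder/dog/number-2/letter-4']
```

With the step named, a reader no longer has to decode the nested comprehension to see that it only flattens the paths.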
A simplified solution:
animals = set(re.search(r'/folder/([a-z]+)', p).group(1) for d in listionary for p in d['path'] if '/tmp' in p)
print(animals)
Output:
{'hog', 'cat', 'dog', 'cow', 'hen', 'ant'}
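One caveat: re.search returns None when the pattern does not match, so calling .group(1) on the result would raise AttributeError for a /tmp path that has no lowercase name after folder/. A hedged variant, assuming Python 3.8+ for the := operator, skips such paths instead. The /tmp/folder/123/x entry is a made-up path added only to trigger that case.

```python
import re

# Small stand-in for the question's data; '/tmp/folder/123/x' is hypothetical.
listionary = [
    {'path': ['/tmp/folder/cat/number/letter', '/tmp/folder/123/x', '/etc']},
    {'path': ['/tmp/folder/hog-00/char/number-1/letter-5']},
]

ANIMAL = re.compile(r'/tmp/folder/([a-z]+)')

animals = {
    m.group(1)
    for d in listionary
    for p in d['path']
    if (m := ANIMAL.search(p))  # keep only paths where the pattern matched
}
print(sorted(animals))  # ['cat', 'hog']
```

Because the pattern already contains /tmp/folder/, the separate '/tmp' in p check is no longer needed in this variant.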
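How can I make this better?
- Ask someone else to read it. ✓
- Use functions to encapsulate the more complex operations
- Don't nest loops on the same line
Here is how I would break it down for the last two points: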
def get_animals(d):
    animals = []
    for item in d['path']:
        if item.startswith('/tmp/folder/'):
            # 12 == len('/tmp/folder/'); take everything up to the next '/'
            animals.append(item[12:item.find('/', 12)])
    return animals

animals = set()
for d in listionary:  # the list of dictionaries from the question
    animals.update(get_animals(d))
animals = list(animals)
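The slicing above relies on every animal path having another / after the name. If one does not, str.find returns -1 and the slice drops the last character (for example, '/tmp/folder/cow' would yield 'co'). A small sketch that avoids this with str.partition follows; the names PREFIX and animal_from_path are assumptions made for the example.

```python
PREFIX = '/tmp/folder/'

def animal_from_path(path):
    """Return the segment right after PREFIX, or None if PREFIX is absent."""
    if not path.startswith(PREFIX):
        return None
    # partition gives the text before the first '/', or all of it if there is none
    name, _, _ = path[len(PREFIX):].partition('/')
    return name or None

print(animal_from_path('/tmp/folder/cow/number-2/letter-3'))  # cow
print(animal_from_path('/tmp/folder/cow'))                    # cow
print(animal_from_path('/etc'))                               # None
```

Using len(PREFIX) instead of the literal 12 also keeps the offset correct if the prefix ever changes.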