Python for loop to save keys and values containing a certain value
Suppose I have a Python structure, a list of dictionaries, like this:
[ {'href': 'https://www.simplyrecipes.com/recipes/cuisine/portuguese/'},
{'href': 'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/'},
{'href': 'https://www.simplyrecipes.com/recipes/type/condiment/'},
{'href': 'https://www.simplyrecipes.com/recipes/ingredient/adobado/'}]
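I'm looking for the most efficient way to:
(i) loop over only the entries whose key is 'href' and whose value contains 'https://www.simplyrecipes.com/recipes/', and identify the values ('http...') that contain 'recipes/cuisine', 'recipes/season', or 'recipes/ingredient';
(ii) save each full URL into a separate, appropriately named list, depending on which 'recipes/...' condition it meets.
Expected result: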
cuisine = ['https://www.simplyrecipes.com/recipes/cuisine/portuguese/']
season = ['https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/']
type = ['https://www.simplyrecipes.com/recipes/type/condiment/']
ingredient = ['https://www.simplyrecipes.com/recipes/ingredient/adobado/']
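Any keys and/or values that do not meet the conditions above should be skipped.
Any pointers would be much appreciated.
Assuming the URLs follow the same format as in the question, a better approach is to build a dictionary keyed by recipe category.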
In [50]: from collections import defaultdict
In [51]: sep_data = defaultdict(list)
In [52]: lst = [ {'href': 'https://www.simplyrecipes.com/recipes/cuisine/portuguese/'},
...: {'href': 'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/'},
...: {'href': 'https://www.simplyrecipes.com/recipes/type/condiment/'},
...: {'href': 'https://www.simplyrecipes.com/recipes/ingredient/adobado/'}]
In [59]: for i in lst: sep_data[i["href"].split("/")[-3]].append(i["href"])
In [60]: sep_data
Out[60]:
defaultdict(list,
{'cuisine': ['https://www.simplyrecipes.com/recipes/cuisine/portuguese/'],
'season': ['https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/'],
'type': ['https://www.simplyrecipes.com/recipes/type/condiment/'],
'ingredient': ['https://www.simplyrecipes.com/recipes/ingredient/adobado/']})
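The one-liner above picks the category by its position from the end of the URL and keeps every category it finds. A minimal sketch of a stricter variant follows, assuming that only the categories named in the question are wanted and that entries without an 'href' key, or with a URL outside the recipes prefix, should be skipped. The names PREFIX, WANTED, and group_by_category are illustrative and not part of the original answer.

```python
from collections import defaultdict

# Assumptions for illustration: the prefix and the wanted categories are
# taken from the question; the helper name is hypothetical.
PREFIX = "https://www.simplyrecipes.com/recipes/"
WANTED = {"cuisine", "season", "ingredient"}


def group_by_category(items, wanted=WANTED, prefix=PREFIX):
    """Group full URLs by the path segment that follows the prefix."""
    groups = defaultdict(list)
    for item in items:
        url = item.get("href")
        # Skip entries without an 'href' key or with a URL outside the prefix.
        if not isinstance(url, str) or not url.startswith(prefix):
            continue
        # First path segment after the prefix, e.g. 'cuisine'.
        category = url[len(prefix):].split("/", 1)[0]
        if category in wanted:
            groups[category].append(url)
    return dict(groups)


lst = [
    {"href": "https://www.simplyrecipes.com/recipes/cuisine/portuguese/"},
    {"href": "https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/"},
    {"href": "https://www.simplyrecipes.com/recipes/type/condiment/"},
    {"href": "https://www.simplyrecipes.com/recipes/ingredient/adobado/"},
    {"title": "entry without an href key"},  # made-up entry to show skipping
]

result = group_by_category(lst)
print(result)
```

Keeping the lists in one dictionary, rather than in separately named variables, means a new category only needs to be added to WANTED.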
Here is a simple approach; I hope it helps.
import re
trash = [ {'href': 'https://www.simplyrecipes.com/recipes/cuisine/portuguese/'},
{'href': 'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/'},
{'href': 'https://www.simplyrecipes.com/recipes/type/condiment/'},
{'href': 'https://www.simplyrecipes.com/recipes/ingredient/adobado/'}]
for x in trash:
    for y in x.values():
        txt = ''
        for i in re.findall("recipes/.*", y):
            txt += i
        title = txt.split('/')[1]
        print({title: y})
Output:
{'cuisine': 'https://www.simplyrecipes.com/recipes/cuisine/portuguese/'}
{'season': 'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/'}
{'type': 'https://www.simplyrecipes.com/recipes/type/condiment/'}
{'ingredient': 'https://www.simplyrecipes.com/recipes/ingredient/adobado/'}
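The loop above prints one dictionary per URL and would raise an IndexError on a value that contains no 'recipes/' part. As a rough sketch of the same regex idea that collects the URLs into lists instead, the version below uses re.search and skips non-matching values. The names CATEGORY_RE, links, and collected are hypothetical, and it assumes every dictionary value is a string.

```python
import re

# Hypothetical pattern: capture the path segment right after '/recipes/'.
CATEGORY_RE = re.compile(r"/recipes/([^/]+)/")

links = [
    {"href": "https://www.simplyrecipes.com/recipes/cuisine/portuguese/"},
    {"href": "https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/"},
    {"href": "https://www.simplyrecipes.com/recipes/type/condiment/"},
    {"href": "https://www.simplyrecipes.com/recipes/ingredient/adobado/"},
]

collected = {}
for entry in links:
    for url in entry.values():
        match = CATEGORY_RE.search(url)
        if match is None:
            continue  # no '/recipes/<category>/' part, so skip this value
        # Append the full URL to the list for its category.
        collected.setdefault(match.group(1), []).append(url)

for title, urls in collected.items():
    print(f"{title} = {urls}")
```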
So, roughly:
from itertools import groupby
import re
lst = [ {'href': 'https://www.simplyrecipes.com/recipes/cuisine/portuguese/'},
{'href': 'https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/'},
{'href': 'https://www.simplyrecipes.com/recipes/type/condiment/'},
{'href': 'https://www.simplyrecipes.com/recipes/ingredient/adobado/'}]
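def f(i):
    # Return the category segment of the URL, or None if it does not match.
    x = re.findall("https://www.simplyrecipes.com/recipes/([^/ ]+)/(?:[^/ ]+/?)+", i["href"])
    return x and x[0] or None
# Note: groupby only merges consecutive items with equal keys, so this relies on the order of lst.
r = filter(lambda i: i[0] in ('cuisine', 'season', 'ingredient'), groupby(lst, f))
for i in r:
    print(f"{i[0]} = {list(map(lambda j: j['href'], i[1]))}")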
# result:
# cuisine = ['https://www.simplyrecipes.com/recipes/cuisine/portuguese/']
# season = ['https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/']
# ingredient = ['https://www.simplyrecipes.com/recipes/ingredient/adobado/']
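One caveat with itertools.groupby is that it only merges consecutive items that share a key, so two 'cuisine' URLs separated by another category would end up in two separate groups. A possible way around this, sketched below, is to sort by the same key function first. The helper name category, the pattern CATEGORY_RE, and the extra 'italian' URL are made up for illustration.

```python
from itertools import groupby
import re

# Hypothetical pattern: capture the category segment after the recipes prefix.
CATEGORY_RE = re.compile(r"https://www\.simplyrecipes\.com/recipes/([^/ ]+)/")


def category(item):
    """Return the category segment, or '' when the URL does not match."""
    match = CATEGORY_RE.match(item["href"])
    return match.group(1) if match else ""


lst = [
    {"href": "https://www.simplyrecipes.com/recipes/cuisine/portuguese/"},
    {"href": "https://www.simplyrecipes.com/recipes/season/seasonal_favorites_spring/"},
    {"href": "https://www.simplyrecipes.com/recipes/type/condiment/"},
    {"href": "https://www.simplyrecipes.com/recipes/ingredient/adobado/"},
    # Made-up entry, not from the question: a second, non-adjacent 'cuisine' URL.
    {"href": "https://www.simplyrecipes.com/recipes/cuisine/italian/"},
]

wanted = ("cuisine", "season", "ingredient")

# Sorting by the grouping key first makes equal keys adjacent for groupby.
grouped = {
    key: [item["href"] for item in items]
    for key, items in groupby(sorted(lst, key=category), key=category)
    if key in wanted
}

for key, urls in grouped.items():
    print(f"{key} = {urls}")
```

Because sorted is stable, URLs within each category keep their original relative order.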