How to filter and remove dict elements based on a value threshold?
I have several dictionaries with this structure in a list:
[{'store': 'walmart',
'store_id': 0,
'store_info': {'grapes': {'availability': {'No': 1, 'Yes': 1}},
'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
'oranges': {'availability': {'No': 2, 'Yes': 2}},
'bottled water': {'availability': {'No': 10, 'Yes': 5}},
"india's mangos": {'availability': {'No': 3, 'Yes': 5}},
'water melon': {'availability': {'No': 2, 'Yes': 2}},
'lemons': {'availability': {'No': 2, 'Yes': 3}},
'kiwifruit': {'availability': {'No': 4, 'Yes': 2}},
'pineapple': {'availability': {'No': 5, 'Yes': 20}},
'total_yes': 23,
'total_no': 23,
'total': 46,
'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]}},
{'store': 'Costco',
'store_id': 24,
'store_info': {'papaya': {'availability': {'No': 1, 'Yes': 1}},
'lychee': {'availability': {'No': 5, 'Yes': 1}},
'fig': {'availability': {'No': 2, 'Yes': 2}},
'blackberry': {'availability': {'No': 2, 'Yes': 5}},
"india's mangos": {'availability': {'No': 3, 'Yes': 5}},
'plum': {'availability': {'No': 1, 'Yes': 2}},
'total_yes': 43,
'total_no': 3,
'total': 46,
'id': [3, 4, 36, 2, 1, 1, 2, 4, 2]}}
]
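How can I keep only the entries whose Yes and No values are both greater than or equal to 5? For example, given the dictionaries above, the expected output for the ones that satisfy the condition should look like this: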
[
{'store': 'walmart',
'store_id': 0,
'store_info': {
'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
'bottled water': {'availability': {'No': 10, 'Yes': 5}},
'pineapple': {'availability': {'No': 5, 'Yes': 20}},
'total_yes': 23,
'total_no': 23,
'total': 46,
'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]}
}
]
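In the example above, "india's mangos": {'availability': {'No': 3, 'Yes': 5}} should be filtered out: although its Yes value of 5 fulfills the threshold, its No value does not meet the threshold at the same time. By contrast, 'pineapple': {'availability': {'No': 5, 'Yes': 20}} should stay in the dictionary, because its Yes value is 20, which is above the threshold of 5, and its No value of 5 meets it as well. Finally, the second dictionary (Costco) should be removed, because none of its keys has at least 5 for both values.
So far I have tried iterating over the structure, but I end up with too many loops. Is there a more compact way to get the expected output?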
a_lis = []
for e in list_dict:
    try:
        l = list(e['store_info'].keys())
        for i in l:
            #print(e['store_info'][i]['availability'])
            if e['store_info'][i]['availability']['No']>=5 and e['store_info'][i]['availability']['Yes']>= 5:
                a_lis.append(e['store_info'][i]['availability'])
                print(a_lis)
            else:
                pass
    except TypeError:
        pass
That's not difficult. I would suggest creating a new list (and modifying the dictionaries in place).
lst = [{'store': 'walmart',
'store_id': 0,
'store_info': {'grapes': {'availability': {'No': 1, 'Yes': 1}},
'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
'oranges': {'availability': {'No': 2, 'Yes': 2}},
'bottled water': {'availability': {'No': 10, 'Yes': 5}},
'india\'s mangos': {'availability': {'No': 3, 'Yes': 5}},
'water melon': {'availability': {'No': 2, 'Yes': 2}},
'lemons': {'availability': {'No': 2, 'Yes': 3}},
'kiwifruit': {'availability': {'No': 4, 'Yes': 2}},
'pineapple': {'availability': {'No': 5, 'Yes': 20}},
'total_yes': 23,
'total_no': 23,
'total': 46,
'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]}},
{'store': 'Costco',
'store_id': 24,
'store_info': {
'papaya': {'availability': {'No': 1, 'Yes': 1}},
'lychee': {'availability': {'No': 5, 'Yes': 1}},
'fig': {'availability': {'No': 2, 'Yes': 2}},
'blackberry': {'availability': {'No': 2, 'Yes': 5}},
'india\'s mangos': {'availability': {'No': 3, 'Yes': 5}},
'plum': {'availability': {'No': 1, 'Yes': 2}},
'total_yes': 43,
'total_no': 3,
'total': 46,
'id': [3, 4, 36, 2, 1, 1, 2, 4, 2]}}
]
result_list = []
for sub_dict in lst:
    if sub_dict['store_info']['total_yes'] >= 5 and sub_dict['store_info']['total_no'] >= 5:
        result_list.append(sub_dict)
        key_need_to_be_removed = [k for k, v in sub_dict['store_info'].items() if type(v) is dict and (v['availability']['Yes'] < 5 or v['availability']['No'] < 5)]
        for k in key_need_to_be_removed:  # remove the dict under dictionary['store_info']
            del sub_dict['store_info'][k]
print(result_list)
Result:
[{
'store': 'walmart',
'store_id': 0,
'store_info': {
'tomatoes': {
'availability': {
'No': 5,
'Yes': 6
}
},
'bottled water': {
'availability': {
'No': 10,
'Yes': 5
}
},
'pineapple': {
'availability': {
'No': 5,
'Yes': 20
}
},
'total_yes': 23,
'total_no': 23,
'total': 46,
'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]
}
}]
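Note that the loop above deletes keys from the dictionaries stored in lst, so the input itself is changed. If the original data is still needed afterwards, one option is to run the same logic on a deep copy. The following is only a minimal sketch: the function name filter_on_copy, the threshold parameter, and the small sample list are assumptions made for illustration, not part of the answer above.

```python
import copy

def filter_on_copy(stores, threshold=5):
    """Apply the totals check and key removal to a deep copy of the input."""
    result = []
    for store in copy.deepcopy(stores):
        info = store['store_info']
        # Store-level check on the totals, as in the loop above.
        if info['total_yes'] >= threshold and info['total_no'] >= threshold:
            # Collect the keys first, then delete, to avoid changing
            # the dict while iterating over it.
            to_remove = [k for k, v in info.items()
                         if isinstance(v, dict)
                         and (v['availability']['Yes'] < threshold
                              or v['availability']['No'] < threshold)]
            for k in to_remove:
                del info[k]
            result.append(store)
    return result

# Hypothetical, shortened sample data with the same shape as lst.
sample = [
    {'store': 'A', 'store_info': {
        'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
        'grapes': {'availability': {'No': 1, 'Yes': 1}},
        'total_yes': 7, 'total_no': 6}},
    {'store': 'B', 'store_info': {
        'plum': {'availability': {'No': 1, 'Yes': 2}},
        'total_yes': 2, 'total_no': 1}},
]

result = filter_on_copy(sample)
print(result)
print('grapes' in sample[0]['store_info'])  # the input still has all its keys
```

The deep copy costs extra memory for large inputs, so mutating in place may still be preferable when the original list is not needed again.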
Here is another approach:
# where data is the input
filtered = []
for store in data:
    avail_dict = {}
    extra_dict = {}
    for item, value in store['store_info'].items():
        if isinstance(value, dict):
            okay = value['availability'].get('No',0) >= 5 and value['availability'].get('Yes',0) >= 5
            if okay:
                avail_dict[item] = value
        else:
            extra_dict[item] = value
    if avail_dict:
        avail_dict.update(extra_dict)
        new_store = dict(store)
        new_store['store_info'] = avail_dict
        filtered.append(new_store)
Result of filtered (the input data is unchanged):
[{'store': 'walmart',
'store_id': 0,
'store_info': {'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
'bottled water': {'availability': {'No': 10, 'Yes': 5}},
'pineapple': {'availability': {'No': 5, 'Yes': 20}},
'total_yes': 23,
'total_no': 23,
'total': 46,
'id': [3, 4, 6, 2, 1, 6, 1, 4, 2]}}]
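Since the question asks for something more compact, the same idea can probably be written with dict comprehensions. This is a sketch under a few assumptions: the function name filter_stores and the threshold parameter are made up here, the sample list is a shortened stand-in for the real data, and a store is dropped when none of its items passes (rather than by looking at its totals).

```python
def filter_stores(data, threshold=5):
    """Return new store dicts that keep only items with Yes and No >= threshold."""
    filtered = []
    for store in data:
        info = store['store_info']
        # Items are the dict-valued entries; keep those where both counts pass.
        kept = {name: value for name, value in info.items()
                if isinstance(value, dict)
                and all(value['availability'].get(key, 0) >= threshold
                        for key in ('Yes', 'No'))}
        if kept:
            # Non-dict entries (totals, id list) are carried over unchanged.
            extras = {k: v for k, v in info.items() if not isinstance(v, dict)}
            filtered.append({**store, 'store_info': {**kept, **extras}})
    return filtered

# Hypothetical, shortened sample data with the same shape as the question's list.
sample = [
    {'store': 'walmart', 'store_id': 0, 'store_info': {
        'grapes': {'availability': {'No': 1, 'Yes': 1}},
        'tomatoes': {'availability': {'No': 5, 'Yes': 6}},
        "india's mangos": {'availability': {'No': 3, 'Yes': 5}},
        'total': 46}},
    {'store': 'Costco', 'store_id': 24, 'store_info': {
        'lychee': {'availability': {'No': 5, 'Yes': 1}},
        'total': 46}},
]

result = filter_stores(sample)
print(result)
```

The new outer dicts are shallow copies, so the nested availability dicts are shared with the input; the input list and its store_info dicts are not modified.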