How to create a Python dictionary after iterating over three lists for matching words
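I need to create a dictionary after iterating over three lists.
Sentences (list_sent, used as keys) should be paired with word lists (list_wordset, used as values) whenever they share a matching
keyword (list_keywords). Please see the lists below, the expected output dictionary, and its explanation. Please advise.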
list_sent = ['one more shock like Covid-19',
'The number of people suffering acute',
'people must collectively act now',
'handling the novel coronavirus outbreak',
'After a three-week nationwide',
'strengthening medical quarantine']
list_wordset = [['people','suffering','acute'],
['Covid-19','Corona','like'],
['people','jersy','country'],
['novel', 'coronavirus', 'outbreak']]
list_keywords = ['people', 'Covid-19', 'nationwide','quarantine','handling']
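The keyword 'Covid-19' appears in both list_sent and list_wordset, so that sentence and word list are captured in the dictionary.
The keyword 'people' appears in 2 different items of list_sent and in 2 different lists of list_wordset, so all of them need to be captured. Even if only a single word from a list_wordset entry matches a keyword, it counts as a match.
Expected output: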
out_dict =
{'one more shock like Covid-19': ['Covid-19','Corona','like'],
'The number of people suffering acute': [['people','suffering','acute'],['people','jersy','country']],
'people must collectively act now' : [['people','suffering','acute'],['people','jersy','country']]}
>>> {sent: [
wordset for wordset in list_wordset if any(word in sent for word in wordset)
] for sent in list_sent}
{'one more shock like Covid-19': [['Covid-19', 'Corona', 'like']],
'The number of people suffering acute': [['people', 'suffering', 'acute'], ['people', 'jersy', 'country']],
'people must collectively act now': [['people', 'suffering', 'acute'], ['people', 'jersy', 'country']],
'handling the novel coronavirus outbreak': [['novel', 'coronavirus', 'outbreak']],
'After a three-week nationwide': [],
'strengthening medical quarantine': []}
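Note that `word in sent` on a string is a substring test, so it can also match a fragment of a longer word. A minimal sketch of the difference, assuming whitespace-separated words are an acceptable definition of "word" here (the names `substring_hit` and `word_hit` are only illustrative):

```python
sent = 'people must collectively act now'

# Substring test: 'collect' is found inside 'collectively'.
substring_hit = 'collect' in sent

# Whole-word test: compare against the whitespace-separated tokens instead.
word_hit = 'collect' in sent.split()

print(substring_hit, word_hit)
```

If partial matches are not wanted, testing against `sent.split()` is one simple option. It does not strip punctuation, so a token such as 'now,' would not equal 'now'.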
I was able to produce the required output in dictionary format using all 3 lists. Removing the empty values takes an extra step.
out_dict = {sent: [wordset for wordset in list_wordset if any(key in sent and key in wordset for key in list_keywords)]
for sent in list_sent}
Result:
{'one more shock like Covid-19': [['Covid-19', 'Corona', 'like']],
'The number of people suffering acute': [['people', 'suffering', 'acute'],
['people', 'jersy', 'country']],
'people must collectively act now': [['people', 'suffering', 'acute'],
['people', 'jersy', 'country']],
'handling the novel coronavirus outbreak': [],
'After a three-week nationwide': [],
'strengthening medical quarantine': []}
Remove the empty list values:
out_dict = {k: v for k, v in out_dict.items() if v}
Final result:
{'one more shock like Covid-19': [['Covid-19', 'Corona', 'like']],
'The number of people suffering acute': [['people', 'suffering', 'acute'],
['people', 'jersy', 'country']],
'people must collectively act now': [['people', 'suffering', 'acute'],
['people', 'jersy', 'country']]}
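The two steps above can also be folded into a single pass. The following is only a sketch under stated assumptions: `build_matches` is a hypothetical helper name, and it matches keywords against whole whitespace-separated words rather than substrings, which gives the same result for the sample data but may differ on other input.

```python
list_sent = ['one more shock like Covid-19',
             'The number of people suffering acute',
             'people must collectively act now',
             'handling the novel coronavirus outbreak',
             'After a three-week nationwide',
             'strengthening medical quarantine']
list_wordset = [['people', 'suffering', 'acute'],
                ['Covid-19', 'Corona', 'like'],
                ['people', 'jersy', 'country'],
                ['novel', 'coronavirus', 'outbreak']]
list_keywords = ['people', 'Covid-19', 'nationwide', 'quarantine', 'handling']


def build_matches(sentences, wordsets, keywords):
    """Map each sentence to the word lists that share a keyword with it."""
    keyword_set = set(keywords)
    result = {}
    for sent in sentences:
        # Keywords that occur in this sentence as whole words.
        hits = keyword_set.intersection(sent.split())
        # Keep the word lists that contain at least one of those keywords.
        matched = [ws for ws in wordsets if hits.intersection(ws)]
        if matched:  # skip sentences with no match, so no cleanup step is needed
            result[sent] = matched
    return result


out_dict = build_matches(list_sent, list_wordset, list_keywords)
print(out_dict)
```

Computing the per-sentence keyword hits once avoids rescanning list_keywords for every word list, and skipping empty matches inside the loop removes the need for a second filtering pass.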