Conditional deletion (based on key value) of items from a Python dictionary
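I have used Python regular expressions to identify acronyms in my text. Some of them have an 's' at the end, and some have a '.' at the end. To clean up my text, I am building a dictionary. I need to remove the '.' from the end of acronyms, remove any regular English words from the dictionary entirely, and remove the 's' that appears at the end of acronyms.
Input dictionary: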
{'ceos': 'CEOs', 'cis': 'CIS', 'ceo': 'CEO', 'cios': 'CIOs', 'cio.': 'CIO.', 'cio': 'CIO','info': 'INFO', 'update': 'UPDATE', 'additional': 'ADDITIONAL', '.': '.', 'kpis': 'KPIs'}
Desired output dictionary:
{'ceos': 'CEO', 'cis': 'CIS', 'ceo': 'CEO', 'cios': 'CIO', 'cio.': 'CIO', 'cio': 'CIO', '.': '', 'kpis': 'KPI'}
How should I write the code in Python to achieve this?
Never mind, I found a rather long solution, but any suggestions for shortening it are welcome:
from nltk.corpus import words  # requires the corpus: nltk.download('words')

# words.words() only contains lowercase entries, so the keys must be lowercase.
# Build the set once; calling words.words() for every key is very slow.
english_words = set(words.words())
# Short acronyms that also appear in the word list and must be kept.
keep = {'ai','it','us','es','coo','lan','ea','aer','coe','eu','bot','sa','ma','roi','pa','dod','doe','cad','ope','soc','aum','mot','da','ae','ca','swot','iso','ba','sla','mou','dit','ist','wa','ram','wog','la','ad','os','sis','sow','lam','sop','bod','pst','ga','mo'}

overall_dict_1 = overall_dict.copy()
for key, value in overall_dict.items():
    # Strip a trailing 's' or '.' from the value.
    if value and value[-1] in ('s', '.'):
        overall_dict_1[key] = value[:-1]
    # Remove the bare '.' entry.
    if key == '.':
        overall_dict_1.pop(key, None)
    # Remove ordinary English words unless they are whitelisted acronyms.
    if key not in keep and key in english_words:
        overall_dict_1.pop(key, None)
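One possible way to shorten the loop is a single dict comprehension. The sketch below is only an illustration: the function name `clean_acronyms` and its parameters are made up for this example, and `sample_words` is a small hypothetical stand-in for `set(words.words())` so that the snippet runs without NLTK. Like the loop above, it drops the '.' key entirely rather than mapping it to an empty string.

```python
def clean_acronyms(acronyms, english_words, keep=frozenset()):
    """Return a cleaned copy of `acronyms` (hypothetical helper).

    - drops the bare '.' key
    - drops keys that are ordinary English words, unless listed in `keep`
    - strips one trailing 's' or '.' from each remaining value
    """
    return {
        key: value[:-1] if value.endswith(('s', '.')) else value
        for key, value in acronyms.items()
        if key != '.' and (key in keep or key not in english_words)
    }


# Assumption: a tiny stand-in for set(nltk.corpus.words.words()).
sample_words = {'info', 'update', 'additional', 'it'}

overall_dict = {
    'ceos': 'CEOs', 'cis': 'CIS', 'ceo': 'CEO', 'cios': 'CIOs',
    'cio.': 'CIO.', 'cio': 'CIO', 'info': 'INFO', 'update': 'UPDATE',
    'additional': 'ADDITIONAL', '.': '.', 'kpis': 'KPIs',
}

cleaned = clean_acronyms(overall_dict, sample_words, keep={'it'})
print(cleaned)
```

With the real corpus, `english_words` would be `set(words.words())` and `keep` would be the whitelist of short acronyms from the original loop. Using sets keeps each membership test cheap, which matters because the NLTK word list is large.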