How to check if a string contains a substring when both are stored in lists in Python?
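My main strings are in a DataFrame and the substrings are stored in a list. The output I want is the substrings that match. This is the code I am using.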
import pandas as pd
from textblob import TextBlob

sentence2 = "Previous study: 03/03/2018 (other hospital) Findings: Lung parenchyma: The study reveals evidence of apicoposterior segmentectomy of LUL showing soft tissue thickening adjacent surgical bed at LUL, possibly post operation."
blob_sentence = TextBlob(sentence2)
noun = blob_sentence.noun_phrases
df1 = pd.DataFrame(noun)
comorbidity_keywords = ["segmentectomy", "lobectomy"]
matches = []
for comorbidity_keywords[0] in df1:
    if comorbidity_keywords[0] in df1 and comorbidity_keywords[0] not in matches:
        matches.append(comorbidity_keywords)
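This gives me a result that is not the actual matching string. The output should be "segmentectomy", but I get [0, 'lobectomy']. I tried to adapt the answers posted under "Check if multiple strings exist in another string". Can someone help me find out what I am doing wrong?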
There is probably a more efficient way to do this, but this is what I ended up with, using two for loops over the two lists.
for ckeyword in comorbidity_keywords:
    for keyword in df1.values.tolist():
        if any(ckeyword in key for key in keyword):
            matches.append(ckeyword)
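For what it is worth, the loop in the question misbehaves because `for comorbidity_keywords[0] in df1:` uses a list element as the loop target, and iterating over a DataFrame yields its column labels, so the first keyword gets overwritten with the label `0`. The sketch below reproduces that effect with plain lists and then condenses the two loops above into one comprehension. The `column_labels` and `noun_phrases` lists are hypothetical stand-ins chosen for illustration, not actual pandas or TextBlob output.

```python
# Part 1: a list element used as a loop target is reassigned on every iteration.
keywords = ["segmentectomy", "lobectomy"]
column_labels = [0]  # stand-in for the labels of a one-column DataFrame
for keywords[0] in column_labels:
    pass
print(keywords)  # [0, 'lobectomy']

# Part 2: keep each keyword that is a substring of at least one phrase.
comorbidity_keywords = ["segmentectomy", "lobectomy"]
noun_phrases = [  # hypothetical phrases, assumed for this example
    "previous study",
    "apicoposterior segmentectomy",
    "soft tissue",
]
matches = [
    kw for kw in comorbidity_keywords
    if any(kw in phrase for phrase in noun_phrases)
]
print(matches)  # ['segmentectomy']
```

Because the comprehension visits each keyword once, a keyword that appears in several phrases is still reported only once.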
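I haven't really used TextBlob, but I have two approaches that may help you get what you want. Essentially, I split the sentence on spaces and iterate over the words to see whether any of them match. One approach returns a list; the other returns a dictionary of index values and words.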
### If you just want a list of words
def find_keyword_matches(sentence, keyword_list):
    s1 = sentence.split(' ')
    return [i for i in s1 if i in keyword_list]
Then:
find_keyword_matches(sentence2, comorbidity_keywords)
Output:
['segmentectomy']
For the dictionary:
def find_keyword_matches(sentence, keyword_list):
    s1 = sentence.split(' ')
    # Map the position of each matching word to the word itself
    return {s1.index(i): i for i in s1 if i in keyword_list}
Output:
{17: 'segmentectomy'}
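One caveat with the dictionary approach: `list.index` returns only the first occurrence of a word, so a keyword that appears more than once keeps being mapped to the same position. A minimal sketch of an alternative using `enumerate` follows; the function name `find_keyword_positions` and the sample sentence are made up for illustration.

```python
def find_keyword_positions(sentence, keyword_list):
    """Map each word position to the keyword found at that position."""
    keywords = set(keyword_list)  # set lookup instead of scanning the list
    return {
        pos: word
        for pos, word in enumerate(sentence.split())
        if word in keywords
    }


text = "lobectomy was considered before segmentectomy and lobectomy"
print(find_keyword_positions(text, ["segmentectomy", "lobectomy"]))
# {0: 'lobectomy', 4: 'segmentectomy', 6: 'lobectomy'}
```

Note that `split()` with no argument collapses runs of whitespace, so positions can differ from those produced by `split(' ')` when the text contains double spaces.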
Finally, a function that also prints where in the sentence the word was found (if it was found at all):
def word_range(sentence, keyword):
    try:
        # str.index raises ValueError when the keyword is absent
        idx_start = sentence.index(keyword)
    except ValueError:
        return None
    idx_end = idx_start + len(keyword)
    print(f"Word '{keyword}' found within index range {idx_start} to {idx_end}")
    # A match at index 0 is still a match, so no idx_start > 0 check is needed
    return keyword
Then use a nested list comprehension to drop the None values:
found_words = [x for x in [word_range(sentence2, i) for i in comorbidity_keywords] if x is not None]
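Both ideas above have a limitation worth noting: splitting on spaces misses a keyword that is followed by punctuation (for example "segmentectomy,"), while a plain substring search also matches inside longer words. As a rough sketch of one way around this, the standard `re` module can match whole words only. The function name `find_keywords` and the sample report text are assumptions made for this example.

```python
import re


def find_keywords(sentence, keyword_list):
    """Return the keywords that occur as whole words, ignoring case."""
    found = []
    for keyword in keyword_list:
        # \b anchors the match at word boundaries; re.escape guards
        # against regex metacharacters inside the keyword itself
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(keyword)
    return found


report = "Status post Segmentectomy, no lobectomy."
print(find_keywords(report, ["segmentectomy", "lobectomy", "pneumonectomy"]))
# ['segmentectomy', 'lobectomy']
```

Whether case-insensitive matching is desirable depends on the data; dropping the `flags` argument restores exact-case matching.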