How to check if a string contains a substring when both are stored in lists in Python?

Question

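My main strings are in a DataFrame and the substrings are stored in a list. The output I want is the list of substrings that match. This is the code I am using.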

import pandas as pd
from textblob import TextBlob

sentence2 = "Previous study: 03/03/2018 (other hospital)  Findings:   Lung parenchyma: The study reveals evidence of apicoposterior segmentectomy of LUL showing soft tissue thickening adjacent surgical bed at LUL, possibly post operation."
blob_sentence = TextBlob(sentence2)
noun = blob_sentence.noun_phrases
df1 = pd.DataFrame(noun)
comorbidity_keywords = ["segmentectomy","lobectomy"]
matches =[]
for comorbidity_keywords[0] in df1:
    if comorbidity_keywords[0] in df1 and comorbidity_keywords[0] not in matches:
       matches.append(comorbidity_keywords)

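The result this gives me is not the string that actually matched. The output should be "segmentectomy", but I get [0, 'lobectomy']. I tried to follow the answers posted under "Check if multiple strings exist in another string". Can someone help me figure out what I am doing wrong?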

Answer 1

There is probably a more efficient way to do this, but this is what I came up with, using two for loops over the two lists.

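matches = []
for ckeyword in comorbidity_keywords:
    # Each row of df1 comes back as a one-element list holding a noun phrase
    for row in df1.values.tolist():
        # Record a keyword once, the first time it appears inside any phrase
        if ckeyword not in matches and any(ckeyword in str(cell) for cell in row):
            matches.append(ckeyword)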
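
For context, the loop in the question appears to go wrong because `for comorbidity_keywords[0] in df1` uses a list element as the loop target, so each value produced by iterating `df1` is written into the keyword list. Iterating a DataFrame, and testing `in` against it, both work on column labels, much like a dict's keys. The sketch below reproduces that behaviour with a plain dict standing in for `df1` (an assumption, so that it runs without pandas or TextBlob) and then shows a flat-list fix. The phrase list is hypothetical sample data, not real TextBlob output.

```python
# Hypothetical noun phrases, standing in for blob_sentence.noun_phrases.
phrases = ["previous study", "lung parenchyma", "apicoposterior segmentectomy"]
# A dict keyed by column label mimics how df1 behaves under `for` and `in`.
table = {0: phrases}

keywords = ["segmentectomy", "lobectomy"]
buggy = []
# The loop target is keywords[0], so each column label is assigned INTO the list.
for keywords[0] in table:
    if keywords[0] in table and keywords[0] not in buggy:
        buggy.append(keywords)  # appends the whole (now modified) list
print(buggy)  # [[0, 'lobectomy']]

# Fix: loop over the keywords and test each one against the phrase strings.
keywords = ["segmentectomy", "lobectomy"]
fixed = [kw for kw in keywords if any(kw in phrase for phrase in phrases)]
print(fixed)  # ['segmentectomy']
```

The key change is that the keyword is the loop variable and the phrases are the strings being searched, rather than the other way around.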

Answer 2

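I haven't really used TextBlob, but I have two approaches that may help you reach your goal. Essentially, I split the sentence on spaces and iterate over the pieces to see whether any of them match. One approach returns a list; the other returns a dictionary mapping each index to its word.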

# If you just want a list of words
def find_keyword_matches(sentence, keyword_list):
    s1 = sentence.split(' ')
    return [i for i in s1 if i in keyword_list]

Then:

find_keyword_matches(sentence2, comorbidity_keywords)

Output:

['segmentectomy']
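
One caveat with splitting on a single space is that punctuation stays attached to the word (`"segmentectomy,"` would not equal `"segmentectomy"`), and the comparison is case-sensitive. A minimal sketch of a more tolerant variant follows; the name `find_keyword_matches_loose` and the sample sentence are made up for illustration, and unlike the function above it reports each keyword only once.

```python
import re

def find_keyword_matches_loose(sentence, keyword_list):
    # Lowercase the keywords once so the comparison ignores case.
    wanted = {k.lower() for k in keyword_list}
    # \w+ extracts runs of letters, digits and underscores, dropping punctuation.
    words = re.findall(r"\w+", sentence.lower())
    seen = []
    for word in words:
        if word in wanted and word not in seen:
            seen.append(word)
    return seen

sample = "Status post Segmentectomy, no prior lobectomy."
print(find_keyword_matches_loose(sample, ["segmentectomy", "lobectomy"]))
# ['segmentectomy', 'lobectomy']
```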

For the dictionary:

def find_keyword_matches(sentence, keyword_list):
    s1 = sentence.split(' ')
    # Map each matching word's position in the split list to the word itself
    return {s1.index(i): i for i in s1 if i in keyword_list}

Output:

{17: 'segmentectomy'}
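
Note that `list.index` returns only the first position of a value, so a dictionary built with it cannot record a keyword that appears more than once. If every position matters, one possible sketch uses `enumerate` and maps each keyword to a list of positions. The name `find_keyword_positions` and the sample sentence are hypothetical.

```python
def find_keyword_positions(sentence, keyword_list):
    positions = {}
    # enumerate yields the true position of every token, including repeats.
    for idx, word in enumerate(sentence.split(' ')):
        if word in keyword_list:
            positions.setdefault(word, []).append(idx)
    return positions

sample = "lobectomy planned after lobectomy review"
print(find_keyword_positions(sample, ["segmentectomy", "lobectomy"]))
# {'lobectomy': [0, 3]}
```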

Finally, a function that also prints where in the sentence the word was found (if it was found at all):

def word_range(sentence, keyword):
    try:
        # str.index raises ValueError when the keyword is absent
        idx_start = sentence.index(keyword)
    except ValueError:
        return None
    idx_end = idx_start + len(keyword)
    print(f"Word '{keyword}' found within index range {idx_start} to {idx_end}")
    # Return the keyword even when it starts at index 0
    return keyword

Then use a nested list comprehension to drop the None values:

found_words = [x for x in [word_range(sentence2, i) for i in comorbidity_keywords] if x is not None]
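
Since `str.index` reports only the first occurrence, a variant that collects the character span of every occurrence, without printing, might look like the sketch below. It swaps in `re.finditer` for `str.index`; the name `keyword_spans` and the sample sentence are assumptions made for illustration.

```python
import re

def keyword_spans(sentence, keyword_list):
    spans = []
    for keyword in keyword_list:
        # re.escape makes the keyword match as literal text, not as a pattern.
        for m in re.finditer(re.escape(keyword), sentence):
            spans.append((keyword, m.start(), m.end()))
    return spans

sample = "segmentectomy of LUL; repeat segmentectomy deferred"
print(keyword_spans(sample, ["segmentectomy", "lobectomy"]))
# [('segmentectomy', 0, 13), ('segmentectomy', 29, 42)]
```

Returning tuples instead of printing leaves the caller free to decide how to report the matches, and a keyword that starts at index 0 is included like any other.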

