How to extract the longer strings and ignore sub-strings from a given sentence?

I have a list of strings and a sentence, as follows:

list_of_strings=["skin allergy","hair loss","allergy","hair", "skin"]

sentence="She experienced skin allergy and hair loss after using it for 2-3 weeks"

I want to match list_of_strings against sentence and print only the longer phrases, ignoring the sub-strings they contain:

skin allergy
hair loss

I wrote this, but it extracts everything that matches.

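Use a regular expression that joins the phrases into a single alternation. The alternatives are tried from left to right, so the longer phrases must appear in the list before their sub-strings.

For example: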

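import re

list_of_strings=["skin allergy","hair loss","allergy","hair", "skin"]
sentence="She experienced skin allergy and hair loss after using it for 2-3 weeks"

# Keep both \b anchors outside the group so they apply to every alternative,
# not only to the first one.
pattern = re.compile(r"\b(" + "|".join(list_of_strings) + r")\b")

# findall scans left to right and never reuses matched text, so once
# "skin allergy" is consumed, "skin" and "allergy" cannot match inside it.
m = pattern.findall(sentence)
print(m)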

Output:

['skin allergy', 'hair loss']
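
The pattern above only prefers the longer phrases because they happen to come first in list_of_strings. A minimal sketch of a more defensive variant follows, assuming you cannot control the list order: it sorts the phrases longest first and escapes them with re.escape in case a phrase contains regex metacharacters. The function name extract_longest_phrases is made up for illustration and is not part of the original answer.

```python
import re

def extract_longest_phrases(phrases, sentence):
    """Return phrases found in sentence, preferring longer phrases over their sub-strings."""
    if not phrases:
        return []
    # Sort longest first so "skin allergy" is tried before "skin" or "allergy".
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape guards against phrases that contain regex metacharacters.
    alternation = "|".join(re.escape(p) for p in ordered)
    # Non-capturing group, so findall returns the whole match.
    pattern = re.compile(r"\b(?:" + alternation + r")\b")
    return pattern.findall(sentence)

# Shorter phrases are deliberately listed first to show that order no longer matters.
list_of_strings = ["skin", "hair", "allergy", "hair loss", "skin allergy"]
sentence = "She experienced skin allergy and hair loss after using it for 2-3 weeks"
print(extract_longest_phrases(list_of_strings, sentence))
```

With the sample sentence this should print the same two phrases as before. Note that a short phrase is still returned when it appears on its own, outside any longer phrase.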