How to extract longer strings and ignore sub-strings from a given sentence?
I have a list of strings and a sentence as follows:
list_of_strings=["skin allergy","hair loss","allergy","hair", "skin"]
sentence="She experienced skin allergy and hair loss after using it for 2-3 weeks"
I want to match list_of_strings against sentence and print only the longer phrases, ignoring their sub-strings:
skin allergy
hair loss
I wrote some code for this, but it extracts everything that matches.
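Use a regular expression that joins the phrases into one alternation, with the longer phrases listed before their sub-strings.
For example: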
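import re
list_of_strings = ["skin allergy", "hair loss", "allergy", "hair", "skin"]
sentence = "She experienced skin allergy and hair loss after using it for 2-3 weeks"
# Longest phrases first, so "skin allergy" is tried before "skin" or "allergy"
alternatives = sorted(list_of_strings, key=len, reverse=True)
# \b sits outside the group so the word boundaries apply to every alternative
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, alternatives)) + r")\b")
m = pattern.findall(sentence)
print(m)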
Output:
['skin allergy', 'hair loss']
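A note on why this works: Python's regex alternation takes the first alternative that matches at a given position, not the longest one, so the result depends on the order of the phrases. The sketch below illustrates this with the same phrases listed shortest first. The helper name longest_matches and the variable shortest_first are made up for this illustration; they are not part of the original question.

```python
import re

def longest_matches(phrases, sentence):
    """Return the phrases found in sentence, preferring longer phrases over their sub-strings."""
    # Sort longest first: alternation picks the first alternative that matches at a position.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape guards against phrases that contain regex metacharacters.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b")
    return pattern.findall(sentence)

sentence = "She experienced skin allergy and hair loss after using it for 2-3 weeks"
# Same phrases as in the question, deliberately listed shortest first (hypothetical ordering).
shortest_first = ["skin", "hair", "allergy", "hair loss", "skin allergy"]

# Joining the phrases without sorting lets the short ones win.
naive = re.findall(r"\b(?:" + "|".join(shortest_first) + r")\b", sentence)
print(naive)                                      # ['skin', 'allergy', 'hair']
print(longest_matches(shortest_first, sentence))  # ['skin allergy', 'hair loss']
```

Note that a shorter phrase is still returned when it appears on its own elsewhere in the sentence, since only overlapping matches are resolved in favour of the longer phrase.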