Python: searching for two words with a regex
I'm trying to find out whether a sentence contains the phrase "go * to", for example "go over to", "go up to", and so on. I'm using TextBlob, and I know I can use the following:
from textblob import TextBlob
search_go_to = set(["go", "to"])
go_to_blob = TextBlob(var)  # var holds the text to search
matches = [str(s) for s in go_to_blob.sentences if search_go_to & set(s.words)]
print(matches)
But this also returns sentences like "go over there and bring this to him", which I don't want. Does anyone know how I can do something like text.find("go * to")?
Does this help?
import re
from textblob import TextBlob
search_go_to = re.compile(r"^go.*to$")
go_to_blob = TextBlob(var)  # var holds the text to search
matches = [str(s) for s in go_to_blob.sentences if search_go_to.match(str(s))]
print(matches)
Explanation of the regex:
^ beginning of line/string
go literal matching of "go"
.* zero or more characters of any kind
to literal matching of "to"
$ end of line/string
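If you don't want "going to" to match, insert a \b (word boundary) before "to" and after "go". Note that because of ^ and $, the pattern only accepts a sentence that starts with "go" and ends with "to"; to find the phrase anywhere in a sentence, drop the anchors and use search() instead of match().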
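As a rough illustration of the word-boundary suggestion, here is a minimal sketch. The pattern and the helper name has_go_to are assumptions made for this example (the answer's regex with \b added and the ^/$ anchors dropped), not code from the answer itself.

```python
import re

# Assumed variant of the answer's pattern: \b added around both words,
# ^ and $ removed so the phrase can appear anywhere in the sentence.
GO_TO = re.compile(r"\bgo\b.*\bto\b")

def has_go_to(sentence):
    """Return True if 'go' and a later 'to' both occur as whole words."""
    return GO_TO.search(sentence) is not None

print(has_go_to("go over to the shop"))                   # True
print(has_go_to("I am going to the shop"))                # False
print(has_go_to("go over there and bring this to him"))   # True
```

The last line shows that .* still allows any amount of text between the two words, so this variant alone does not exclude the sentence the question wants to avoid.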
Try using:
for match in re.finditer(r"go\s+\w+\s+to", text, re.IGNORECASE):
    print(match.group(0))
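To see how a pattern of this shape behaves on the sentences from the question, here is a small self-contained sketch. The helper name find_go_x_to and the added \b anchors are assumptions for the example; the core go\s+\w+\s+to pattern is the one suggested above.

```python
import re

# Exactly one word (\w+) is allowed between "go" and "to".
# The \b anchors are an added assumption so "ago x to" or "go x tomorrow" do not match.
ONE_WORD_BETWEEN = re.compile(r"\bgo\s+\w+\s+to\b", re.IGNORECASE)

def find_go_x_to(text):
    """Return every 'go <word> to' phrase found in text."""
    return [m.group(0) for m in ONE_WORD_BETWEEN.finditer(text)]

print(find_go_x_to("Go over to the desk, then go up to the roof."))
# ['Go over to', 'go up to']
print(find_go_x_to("go over there and bring this to him"))
# []
```

Because only a single word may sit between "go" and "to", the unwanted sentence from the question is rejected.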
Using a generator expression (lists are used instead of sets so that the word order in the pattern and in the output is predictable):
>>> import re
>>> search_go_to = ["go", "to"]
>>> m = ' .*? '.join(x for x in search_go_to)
>>> words = ["go over to", "go up to", "foo bar"]
>>> matches = [s for s in words if re.search(m, s)]
>>> print(matches)
['go over to', 'go up to']
Try this:
import re
text = "something go over to something"
if re.search(r"go\s+?\S+?\s+?to", text):
    print("found")
else:
    print("not found")
The regex:
\s matches any whitespace character
\S matches any non-whitespace character, including special characters
+? is the non-greedy (lazy) form of + (not required for the OP's question)
So re.search(r"go\s+?\S+?\s+?to", text)
will match "something go W#$%^^$ to something"
and of course also "something go over to something".
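The difference between \S+ here and the \w+ used in the earlier answer can be checked with a short sketch. The names ANY_TOKEN, WORD_ONLY, and compare are made up for this example.

```python
import re

# \S+ accepts any run of non-whitespace characters between "go" and "to";
# \w+ accepts only letters, digits, and underscores.
ANY_TOKEN = re.compile(r"go\s+\S+\s+to")
WORD_ONLY = re.compile(r"go\s+\w+\s+to")

def compare(text):
    """Return (matches with \\S+, matches with \\w+) for text."""
    return (ANY_TOKEN.search(text) is not None,
            WORD_ONLY.search(text) is not None)

print(compare("something go W#$%^^$ to something"))  # (True, False)
print(compare("something go over to something"))     # (True, True)
```

Which one is appropriate depends on whether the middle token should be restricted to ordinary words.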