How to logically OR two patterns in re.search()
I have the following string:
Page load for http://xxxx?roxy=www.yahoo.com&sendto=https://mywebsite?ent took 4001 ms (Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)<br><br><br>Topic: Yahoo!! My website is a good website | Mywebsite<br>
or
Page load for http://xxxx?roxy=www.yahoo.com&sendto=https://mywebsite?ent took too long (12343 ms Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)<br><br><br>Topic: Yahoo!! My website is a good website | Mywebsite<br>
I want to extract the number from the strings above, i.e. 4001 from
ent took 4001 ms
or 12343 from
ent took too long (12343 ms
and assign it to a variable:
tt = int(re.search(r"\?ent\s*took\s*(\d+)",message).group(1))
This regex matches the first form and returns 4001. How do I logically OR it with the expression r"\?ent\s*took\s*too\s*long\s*\((\d+)" so that it also extracts 12343 from the second form?
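A question mark at the very start of a regex has nothing before it to make optional. If you want to match a literal question mark there, write \?: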
x = int(re.findall(r"\?ent\s*took\s*([^m]*)",message)[0])
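First of all, you need to escape the ? at the start of your pattern. The ? character is a regex quantifier that makes the preceding token optional, so it must be preceded by something. If you want to match a literal ?, use \? instead.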
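To see the difference in practice, here is a small sketch. The sample string is shortened from the question and is only an assumption for illustration. In Python's re module, a pattern that begins with a bare ? is rejected when it is compiled, while the escaped form matches the literal question mark in the URL.

```python
import re

sample = "https://mywebsite?ent took 4001 ms"  # shortened, illustrative input

# A bare leading "?" has nothing to quantify, so compilation fails.
try:
    re.compile(r"?ent\s*took\s*(\d+)")
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# Escaped, "\?" matches the literal question mark in the URL.
match = re.search(r"\?ent\s*took\s*(\d+)", sample)
value = int(match.group(1))
print(unescaped_ok, value)
```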
A more efficient approach is to use re.search with \d+ in your pattern, which also avoids the extra indexing:
>>> int(re.search(r"\?ent\s*took\s*(\d+)",s).group(1))
4001
For the second example, you can do:
>>> re.search(r'\((\d+)',s).group(1)
'12343'
To match both cases, use the following pattern:
(\d+)[\s\w]+\(|\((\d+)
>>> s1="Page load for http://xxxx?roxy=www.yahoo.com&sendto=https://mywebsite?ent took too long (12343 ms Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)<br><br><br>Topic: Yahoo!! My website is a good website | Mywebsite<br>"
>>> s2="Page load for http://xxxx?roxy=www.yahoo.com&sendto=https://mywebsite?ent took 4001 ms (Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)<br><br><br>Topic: Yahoo!! My website is a good website | Mywebsite<br>"
>>> re.search(r'(\d+)[\s\w]+\(|\((\d+)',s1).group(2)
'12343'
>>> re.search(r'(\d+)[\s\w]+\(|\((\d+)',s2).group(1)
'4001'
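Because this pattern puts the number in group 1 for one message format and in group 2 for the other, the caller still has to pick the right group. One possible way to handle that, sketched below with shortened sample strings and a hypothetical helper name first_number, is to take whichever group actually matched.

```python
import re

PATTERN = re.compile(r'(\d+)[\s\w]+\(|\((\d+)')

def first_number(text):
    """Return the captured number as int, or None if nothing matched.

    Hypothetical helper: only one of the two groups is set per match,
    because each group belongs to a different alternative.
    """
    m = PATTERN.search(text)
    if m is None:
        return None
    return int(m.group(1) or m.group(2))

# Shortened, illustrative versions of the two messages.
s1 = "https://mywebsite?ent took too long (12343 ms Ne: 167 ms)"
s2 = "https://mywebsite?ent took 4001 ms (Ne: 167 ms)"
print(first_number(s1), first_number(s2))
```

Note that this pattern is not anchored on ?ent took, so it may pick up an earlier number if the message happens to contain one in a similar position.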
This one matches both patterns at once and extracts the desired number:
tt = int(re.search(r"\?ent took (too long \()?(?P<num>\d+)",message).group('num'))
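A slightly more defensive variant of the same idea, as a sketch: the optional prefix is a non-capturing group, whitespace is matched with \s* as in the question's pattern, and a message without a match yields None instead of raising AttributeError. The function name load_time_ms and the shortened sample strings are assumptions for illustration.

```python
import re

# Optional "too long (" prefix before the digits; (?:...) keeps it out of the groups.
LOAD_TIME = re.compile(r"\?ent\s*took\s*(?:too\s*long\s*\()?(?P<num>\d+)")

def load_time_ms(message):
    """Return the load time in ms as int, or None if the message has no match."""
    m = LOAD_TIME.search(message)
    return int(m.group("num")) if m else None

a = "https://mywebsite?ent took 4001 ms (Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)"
b = "https://mywebsite?ent took too long (12343 ms Ne: 167 ms, Se: 2509 ms, Xe: 1325 ms)"
print(load_time_ms(a), load_time_ms(b), load_time_ms("no timing here"))
```

Making the differing part optional keeps a single named group for the number, so the calling code does not need to know which message format it received.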