Regex - stop after first match with version numbers
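I have a regex that tries to match version numbers, but it produces a lot of false positives.
(\d{1,3}).*?(\d{1,3}).*?(\d{1,3})
is what I have so far. It matches anything with 3 parts and 2 dots:
1.2.333
11.2.3
But it does not match anything with 2 parts and 1 dot:
1.2
It is also too greedy, so a line with multiple dots and parts, such as 11.22.33.44.55.66.77, matches twice.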
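The behaviour described above can be reproduced with a short script. This is only an illustrative sketch: the sample strings, including the 'build 7 of 12 on day 30' line, are made up for the demonstration and are not taken from the original data.

```python
import re

# The pattern from the question: three digit runs separated by "anything".
pattern = re.compile(r'(\d{1,3}).*?(\d{1,3}).*?(\d{1,3})')

# Hypothetical sample inputs, chosen only to show each symptom.
false_positive = pattern.findall('build 7 of 12 on day 30')
two_parts = pattern.findall('1.2')
long_line = pattern.findall('11.22.33.44.55.66.77')

print(false_positive)  # [('7', '12', '30')]: no dots at all, still a match
print(two_parts)       # []: a two-part version is missed
print(long_line)       # [('11', '22', '33'), ('44', '55', '66')]: matched twice
```

The false positive comes from .*?, which accepts any characters between the digit runs rather than only a literal dot.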
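I am looking for a regex that covers all of these cases,
1.2
1.2.3
and matches only the first instance in 1.2.3.4.5.6.7.8
Edit:
I think ^\d{1,3}(?:\.\d{1,3})(?:\.\d{1,3})?
covers most of what I want for now.
It still doesn't pick out the first 3 parts of a long string; I'll keep trying.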
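The regex you want is:
(\d{1,3}(?:\.\d{1,3}){1,2})(?:\.\d{1,3})*
The last subexpression, (?:\.\d{1,3})*, is included to consume the rest of the dotted sequence. Without it, findall would produce further matches from the leftover parts when it resumes scanning, as with the input 1.2.3 1.2.3.4.5.6.7.8.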
import re
s = 'abc 1.2 1.2.3 1.2.3.4.5.6.7.8'
print(re.findall(r'(\d{1,3}(?:\.\d{1,3}){1,2})(?:\.\d{1,3})*', s))
Prints:
['1.2', '1.2.3', '1.2.3']
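To see what the trailing (?:\.\d{1,3})* contributes, the same input can be run with and without it. This is a small comparison sketch; the variable names are chosen here for illustration.

```python
import re

s = 'abc 1.2 1.2.3 1.2.3.4.5.6.7.8'

# Without the trailing group, findall resumes inside the long version string.
without_tail = re.findall(r'(\d{1,3}(?:\.\d{1,3}){1,2})', s)

# With the trailing group, the leftover parts are consumed but not captured.
with_tail = re.findall(r'(\d{1,3}(?:\.\d{1,3}){1,2})(?:\.\d{1,3})*', s)

print(without_tail)  # ['1.2', '1.2.3', '1.2.3', '4.5.6', '7.8']
print(with_tail)     # ['1.2', '1.2.3', '1.2.3']
```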
Alternatively, you can use a negative lookbehind:
((?<!\.)\d{1,3}(?:\.\d{1,3}){1,2})
import re
s = 'abc 1.2 1.2.3 1.2.3.4.5.6.7.8'
print(re.findall(r'((?<!\.)\d{1,3}(?:\.\d{1,3}){1,2})', s))
Prints:
['1.2', '1.2.3', '1.2.3']
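One caveat worth checking: the lookbehind (?<!\.) only rejects a start position that directly follows a dot, so with multi-digit parts a match can still begin in the middle of a number. The sketch below uses a made-up input and a widened lookbehind, (?<![\d.]), which is a suggested variant and not part of the answer above.

```python
import re

s = '1.2.3.44.55.66'  # hypothetical input with multi-digit parts

# Original lookbehind: the second '4' of '44' follows a digit, not a dot,
# so a new match may start there.
original = re.findall(r'((?<!\.)\d{1,3}(?:\.\d{1,3}){1,2})', s)

# Widened lookbehind (an assumption of this sketch): a match may not start
# right after a dot or a digit.
widened = re.findall(r'((?<![\d.])\d{1,3}(?:\.\d{1,3}){1,2})', s)

print(original)  # ['1.2.3', '4.55.66']
print(widened)   # ['1.2.3']
```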
If you are using search instead of findall, the match is returned as group 1.
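A minimal sketch of the search variant follows. The helper name first_version and the sample strings are hypothetical; only re.compile, search, and group are standard re module calls.

```python
import re

VERSION = re.compile(r'(\d{1,3}(?:\.\d{1,3}){1,2})(?:\.\d{1,3})*')

def first_version(text):
    """Return the first version-like token (at most 3 parts), or None."""
    m = VERSION.search(text)
    # group(1) holds the captured parts; group(0) would also include
    # whatever the trailing non-capturing group consumed.
    return m.group(1) if m else None

print(first_version('release 1.2.3.4.5.6.7.8 and 9.9'))  # 1.2.3
print(first_version('no digits here'))                   # None
```

Because search stops at the first match, the "stop after first match" requirement is met without any post-processing of a result list.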
Use
(?m)^.*?\b(\d{1,3}\.\d{1,3}(?:\.\d{1,3})?)\b
See proof.
Python code:
re.findall(r'(?m)^.*?\b(\d{1,3}\.\d{1,3}(?:\.\d{1,3})?)\b', string)
Explanation
--------------------------------------------------------------------------------
  (?m)                     set flags for this block (with ^ and $
                           matching start and end of line)
                           (case-sensitive) (with . not matching \n)
                           (matching whitespace and # normally)
--------------------------------------------------------------------------------
  ^                        the beginning of a "line"
--------------------------------------------------------------------------------
  .*?                      any character except \n (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (optional
                             (matching the most amount possible)):
--------------------------------------------------------------------------------
      \.                       '.'
--------------------------------------------------------------------------------
      \d{1,3}                  digits (0-9) (between 1 and 3 times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
    )?                       end of grouping
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
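As a rough usage sketch of the pattern explained above, the following runs it over a multi-line string. The sample text and the name LINE_VERSION are invented for this example. Because ^ anchors each match to the start of a line, findall yields at most one version per line.

```python
import re

LINE_VERSION = re.compile(r'(?m)^.*?\b(\d{1,3}\.\d{1,3}(?:\.\d{1,3})?)\b')

# Hypothetical multi-line input.
text = (
    "abc 1.2 1.2.3\n"
    "foo 1.2.3.4.5.6.7.8\n"
    "no version here\n"
    "v 11.22.33 and 44.55.66.77"
)

# One captured version per line that contains one; other lines are skipped.
first_per_line = LINE_VERSION.findall(text)
print(first_per_line)  # ['1.2', '1.2.3', '11.22.33']
```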