Python regex findall alternation behavior
I'm using Python 2.7.6. I can't understand the following result from re.findall:
>>> re.findall('\d|\(\d,\d\)', '(6,7)')
['(6,7)']
I expected the above to return ['6', '7'], because according to the documentation:
'|'
A|B, where A and B can be arbitrary REs, creates a regular
expression that will match either A or B. An arbitrary number of REs
can be separated by the '|' in this way. This can be used inside
groups (see below) as well. As the target string is scanned, REs
separated by '|' are tried from left to right. When one pattern
completely matches, that branch is accepted. This means that once A
matches, B will not be tested further, even if it would produce a
longer overall match. In other words, the '|' operator is never
greedy. To match a literal '|', use \|, or enclose it inside a
character class, as in [|].
Thanks for your help.
As the documentation states:
This means that once A matches, B will not be tested further, even if it would produce a longer overall match.
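So in this case the regex engine does not match \d first, because your string starts with ( rather than a digit. At position 0 the \d branch fails, so the engine tries the second branch, \(\d,\d\), which matches the whole string. findall returns non-overlapping matches and resumes scanning after the end of each match, so the 6 and the 7 inside the parentheses have already been consumed and are never tested against \d. If your string starts with a digit instead, \d matches first: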
>>> re.findall('\d|\d,\d\)', '6,7)')
['6', '7']
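To make the scanning order visible, here is a minimal sketch that uses re.finditer to show where each match starts and ends, and then swaps the branch order on the second string. The variable names (pattern, spans, digits, first_wins, longer_first) are only illustrative, and raw strings are used so that the same code runs on both Python 2.7 and Python 3.

```python
import re

pattern = re.compile(r'\d|\(\d,\d\)')

# finditer exposes the start and end index of every match.
# At index 0 the character is '(', so the \d branch fails and the
# \(\d,\d\) branch matches the whole string. Scanning resumes at
# index 5, the end of the string, so the digits are never revisited.
spans = [(m.start(), m.end(), m.group()) for m in pattern.finditer('(6,7)')]
print(spans)

# With only the \d branch, nothing consumes the digits in bulk,
# so each one is matched separately.
digits = re.findall(r'\d', '(6,7)')
print(digits)

# Branch order only matters when both branches can match at the
# same position, as they can at index 0 of '6,7)'.
first_wins = re.findall(r'\d|\d,\d\)', '6,7)')
longer_first = re.findall(r'\d,\d\)|\d', '6,7)')
print(first_wins)
print(longer_first)
```

If the goal is simply to get the individual digits out of '(6,7)', the single-branch pattern \d is probably enough. Alternation is only needed when the parenthesized pair should sometimes be returned as a unit.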