Greedy behaviour of grep
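I thought that in regular expressions, "greediness" applies to quantifiers rather than to the match as a whole. However, I observe that
grep -E --color=auto 'a+(ab)?' <(printf "aab")
highlights all of aab rather than just aa.
The same applies to sed.
On the other hand, in pcregrep and other tools, it really is the quantifier that is greedy.
Is this behaviour specific to grep?
N.B. I checked both grep (BSD grep) 2.5.1-FreeBSD and grep (GNU grep) 3.1.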
In the description of the term "matched", POSIX states:
The search for a matching sequence starts at the beginning of a string and stops when the first sequence matching the expression is found, where "first" is defined to mean "begins earliest in the string". If the pattern permits a variable number of matching characters and thus there is more than one such sequence starting at that point, the longest such sequence is matched.
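This statement answers your question directly. The string aab contains two substrings that match the ERE a+(ab)? and start at the same position: aa (a+ consumes both a's and (ab)? matches nothing) and aab (a+ consumes one a and (ab)? matches ab). The latter is the longest, so it is the one that is matched.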
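The difference can be checked from a shell. The following is a minimal sketch, assuming a grep that supports -o and a sed that supports -E (recent GNU and BSD versions do). The variable names are only illustrative, and the perl line is an optional contrast that runs only if perl happens to be installed.

```shell
# POSIX ERE tools pick the leftmost match and, among matches starting
# there, the longest one overall.
posix_match=$(printf 'aab\n' | grep -oE 'a+(ab)?')
echo "grep -E matched: $posix_match"    # expected: aab

# sed uses the same POSIX matching rule; & stands for the matched text.
sed_result=$(printf 'aab\n' | sed -E 's/a+(ab)?/[&]/')
echo "sed -E result:   $sed_result"     # expected: [aab]

# Backtracking engines (Perl, PCRE) let each quantifier be greedy in turn:
# a+ takes both a's, (ab)? then matches nothing, and the search stops at
# the first overall success, which is aa.
if command -v perl >/dev/null 2>&1; then
    perl -e 'print "perl matched:    $&\n" if "aab" =~ /a+(ab)?/'
fi
```

With a POSIX-conforming grep and sed, the first two commands should report aab and [aab], while Perl reports aa. So the behaviour is not a quirk of grep: it follows from the POSIX matching rule quoted above.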