Java regular expression equivalent to the PCRE/etc. shorthand `\K`?
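Perl regex and PCRE (Perl-Compatible Regular Expressions), among others, have the shorthand `\K`, which discards everything matched to the left of it (except for capturing groups) from the reported match. Java does not support it, so what is Java's equivalent?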
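There is no direct equivalent. However, you can always rewrite such patterns using capturing groups.
If you look closely at the `\K` operator and its limitations, you will see that you can replace it with capturing groups.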
In the middle of a pattern, `\K` says "reset the beginning of the reported match to this point". Anything that was matched before the `\K` goes unreported, a bit like in a lookbehind.
The key difference between `\K` and a lookbehind is that in PCRE, a lookbehind does not allow you to use quantifiers: the length of what you look for must be fixed. On the other hand, `\K` can be dropped anywhere in a pattern, so you are free to have any quantifiers you like before the `\K`.
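To make the "reset the beginning of the reported match" idea concrete in Java terms, here is a minimal sketch. It assumes the part you want reported is wrapped in capturing group 1; the class name `ReportedMatchOffsets` and the helper `reportedSpan` are made up for illustration and are not part of any library. The start of group 1 plays the role of the position where `\K` would sit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReportedMatchOffsets {
    // Hypothetical helper: returns {start, end} of group 1, i.e. the span a
    // PCRE pattern with \K placed right before the group would report.
    // Returns null when the pattern does not match at all.
    static int[] reportedSpan(Pattern p, String input) {
        Matcher m = p.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.start(1), m.end(1) };
    }

    public static void main(String[] args) {
        String s = "Min value = 5000 km";
        Pattern p = Pattern.compile("value\\s*=\\s*(\\d+)");
        Matcher m = p.matcher(s);
        if (m.find()) {
            // The whole match starts at "value", the group starts at the digits.
            System.out.println("whole match starts at " + m.start()); // 4
            System.out.println("group 1 starts at " + m.start(1));    // 12
            System.out.println("both end at " + m.end());             // 16
        }
    }
}
```

The engine still consumes the `value = ` prefix; only the span you read back changes.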
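However, all this means that the pattern before `\K` is still a consuming pattern: the regex engine adds the matched text to the match value and advances its index while matching the pattern, and `\K` only drops the matched text from the match, keeping the index where it is. This means that `\K` is no better than capturing groups.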
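So, the `value\s*=\s*\K\d+` PCRE/Onigmo pattern translates to this Java code (note that backslashes must be doubled inside a Java string literal):

    // requires: import java.util.regex.*;
    String s = "Min value = 5000 km";
    Matcher m = Pattern.compile("value\\s*=\\s*(\\d+)").matcher(s);
    if (m.find()) {
        System.out.println(m.group(1)); // the digits only, without the prefix
    }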
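`\K` is often used for "find all" and "replace only the tail" tasks, so here is a rough sketch of how those two cases might look with groups. The class and method names (`KeepOutEmulation`, `findAllAfterPrefix`, `replaceAfterPrefix`) and the named group `prefix` are assumptions for illustration only. For replacement, the idea is inverted: capture the part `\K` would have discarded and put it back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeepOutEmulation {
    // Hypothetical helper: collects group 1 of every match, which is what a
    // PCRE pattern with \K in front of the group would report for each hit.
    static List<String> findAllAfterPrefix(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    // Hypothetical helper: replaces only the digits and keeps the prefix.
    // The prefix is captured in a named group and re-inserted, since Java
    // cannot drop it from the match the way \K does.
    static String replaceAfterPrefix(String input, String replacement) {
        Pattern p = Pattern.compile("(?<prefix>value\\s*=\\s*)\\d+");
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return p.matcher(input)
                .replaceAll("${prefix}" + Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("value\\s*=\\s*(\\d+)");
        // prints [5000, 7000]
        System.out.println(findAllAfterPrefix(p, "Min value = 5000 km, max value=7000 km"));
        // prints Min value = 42 km
        System.out.println(replaceAfterPrefix("Min value = 5000 km", "42"));
    }
}
```

A named group is used in the replacement so that a replacement starting with a digit cannot be misread as part of a numbered group reference.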
There is an alternative, but it can only be used with smaller, simpler patterns: a constrained-width lookbehind:
Java accepts quantifiers within lookbehind, as long as the length of the matching strings falls within a pre-determined range. For instance, `(?<=cats?)` is valid because it can only match strings of three or four characters. Likewise, `(?<=A{1,10})` is valid.
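So, this will work, too:

    // \s{0,10} instead of \s* keeps the lookbehind length bounded
    m = Pattern.compile("(?<=value\\s{0,10}=\\s{0,10})\\d+").matcher(s);
    if (m.find()) {
        System.out.println(m.group()); // whole match is already just the digits
    }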
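The trade-off of the bounded lookbehind can be seen in a small comparison. This is only a sketch: the class name `BoundedLookbehindDemo`, the two helper methods and the test strings are invented for illustration, and `String.repeat` assumes Java 11 or later. The lookbehind variant silently stops matching once the whitespace exceeds the chosen bound, while the capturing-group variant has no such limit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundedLookbehindDemo {
    // Lookbehind with explicit upper bounds ({0,10}) so its length is limited.
    static final Pattern LOOKBEHIND =
            Pattern.compile("(?<=value\\s{0,10}=\\s{0,10})\\d+");
    // Capturing-group variant; the prefix may contain any amount of whitespace.
    static final Pattern GROUP =
            Pattern.compile("value\\s*=\\s*(\\d+)");

    // Returns the digits found via the lookbehind pattern, or null.
    static String viaLookbehind(String s) {
        Matcher m = LOOKBEHIND.matcher(s);
        return m.find() ? m.group() : null;
    }

    // Returns the digits found via group 1, or null.
    static String viaGroup(String s) {
        Matcher m = GROUP.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String normal = "Min value = 5000 km";
        // 11 spaces between "value" and "=", one more than the bound allows.
        String wide = "value" + " ".repeat(11) + "= 5000";

        System.out.println(viaLookbehind(normal)); // 5000
        System.out.println(viaGroup(normal));      // 5000
        System.out.println(viaLookbehind(wide));   // null
        System.out.println(viaGroup(wide));        // 5000
    }
}
```

If the bound cannot be chosen safely for your input, the capturing-group form is the more robust choice.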
See the Java demo.