Java Regular Expressions equivalent to PCRE/etc. shorthand `\K`?

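Perl regex and PCRE (Perl-Compatible Regular Expressions), among others, have the shorthand \K, which discards everything matched to its left from the reported match while leaving capturing groups intact. Java does not support it, so what is Java's equivalent?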

There is no direct equivalent. However, you can always rewrite such patterns using capturing groups.

If you take a closer look at the \K operator and its limitations, you will see that you can replace this pattern with capturing groups.

rexegg.com \K reference:

In the middle of a pattern, \K says "reset the beginning of the reported match to this point". Anything that was matched before the \K goes unreported, a bit like in a lookbehind.

The key difference between \K and a lookbehind is that in PCRE, a lookbehind does not allow you to use quantifiers: the length of what you look for must be fixed. On the other hand, \K can be dropped anywhere in a pattern, so you are free to have any quantifiers you like before the \K.

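However, all this means that the pattern before \K is still a consuming pattern: the regex engine adds the matched text to the match value and advances its index while matching the pattern, and \K only drops the text matched so far from the reported match, keeping the index where it is. This means that \K is no better than capturing groups.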

So, a value\s*=\s*\K\d+ PCRE/Onigmo pattern would translate into this Java code:

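    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String s = "Min value = 5000 km";
    // Group 1 holds what \K would have reported as the whole match
    Matcher m = Pattern.compile("value\\s*=\\s*(\\d+)").matcher(s);
    if (m.find()) {
        System.out.println(m.group(1)); // prints 5000
    }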
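
The same rewrite extends to finding every occurrence and to replacements, where \K is often used in PCRE. The following is a minimal sketch, not taken from the original answer: the class name `KeepOutDemo`, the helper names and the sample input are made up for illustration. For replacement, the prefix is captured and written back with `$1`, since Java cannot drop it from the match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeepOutDemo {
    // Java counterpart of the PCRE pattern value\s*=\s*\K\d+
    private static final Pattern VALUE = Pattern.compile("value\\s*=\\s*(\\d+)");

    // Collects group 1 of every match, i.e. the text \K would have reported.
    public static List<String> findValues(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = VALUE.matcher(input);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    // Replaces only the digits; the consumed prefix is restored through $1.
    public static String replaceValues(String input, String replacement) {
        return input.replaceAll("(value\\s*=\\s*)\\d+",
                "$1" + Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Hypothetical input with two occurrences
        String s = "Min value = 5000 km, max value=9000 km";
        System.out.println(findValues(s));         // prints [5000, 9000]
        System.out.println(replaceValues(s, "N")); // prints Min value = N km, max value=N km
    }
}
```

`Matcher.quoteReplacement` is used so that a replacement containing `$` or `\` is inserted literally.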

There is an alternative, but it can only be used with smaller, simpler patterns: a constrained-width lookbehind:

Java accepts quantifiers within lookbehind, as long as the length of the matching strings falls within a pre-determined range. For instance, (?<=cats?) is valid because it can only match strings of three or four characters. Likewise, (?<=A{1,10}) is valid.
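
The two lookbehinds named in the quote can be tried out with a short sketch. The class name `BoundedLookbehindDemo`, the helper `firstMatch` and the input strings are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundedLookbehindDemo {
    // Returns the first match of regex in input, or null when there is none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // (?<=cats?) looks back over "cat" or "cats": three or four characters.
        System.out.println(firstMatch("(?<=cats?) food", "cats food"));
        // (?<=A{1,10}) looks back over one to ten "A" characters.
        System.out.println(firstMatch("(?<=A{1,10})B", "AAAB"));
        // No "cat" or "cats" before " food", so there is no match.
        System.out.println(firstMatch("(?<=cats?) food", "dog food"));
    }
}
```

In both successful cases the text checked by the lookbehind is not part of `m.group()`, which is the effect \K has in PCRE.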

So, this will also work:

    // Bounded {0,10} quantifiers keep the lookbehind width finite
    m = Pattern.compile("(?<=value\\s{0,10}=\\s{0,10})\\d+").matcher(s);
    if (m.find()) {
        System.out.println(m.group()); // prints 5000
    }
    

See the Java demo.
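
To make the trade-off between the two approaches concrete, here is a hedged side-by-side sketch. The class name `KeepOutComparison`, the method names and the second input string are hypothetical. It shows that the capturing group handles any amount of whitespace, while the lookbehind only sees as far back as its `{0,10}` bounds allow.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeepOutComparison {
    // Capturing-group rewrite of value\s*=\s*\K\d+
    public static String viaGroup(String input) {
        Matcher m = Pattern.compile("value\\s*=\\s*(\\d+)").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Constrained-width lookbehind rewrite of the same pattern
    public static String viaLookbehind(String input) {
        Matcher m = Pattern.compile("(?<=value\\s{0,10}=\\s{0,10})\\d+").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "Min value = 5000 km";
        System.out.println(viaGroup(s));      // prints 5000
        System.out.println(viaLookbehind(s)); // prints 5000

        // More than ten spaces after "=" exceed the {0,10} bound.
        String wide = "value =            7";
        System.out.println(viaGroup(wide));      // prints 7
        System.out.println(viaLookbehind(wide)); // prints null
    }
}
```

Raising the bounds widens what the lookbehind accepts, but the capturing group needs no such tuning, which is why it is the more general replacement for \K.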