How to match several capturing groups when the results are not as expected

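I am trying to learn Java regular expressions. I want to match a pattern with several capturing groups (namely j(a(va))) against another string (namely this is java. this is ava, this is va). I expected the output to be: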

I found the text "java" starting at index 8 and ending at index 12.
I found the text "ava" starting at index 21 and ending at index 24.    
I found the text "va" starting at index 34 and ending at index 36.
Number of group: 2

However, the IDE only prints:

I found the text "java" starting at index 8 and ending at index 12.
Number of group: 2

Why is this? Is there something I am missing?

Original code:

BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

System.out.println("\nEnter your regex:");
Pattern pattern = Pattern.compile(br.readLine());

System.out.println("\nEnter input string to search:");
Matcher matcher = pattern.matcher(br.readLine());

boolean found = false;
while (matcher.find()) {
    System.out.format("I found the text"
            + " \"%s\" starting at "
            + "index %d and ending at index %d.%n",
            matcher.group(),
            matcher.start(),
            matcher.end());
    found = true;
    System.out.println("Number of group: " + matcher.groupCount());
}
if (!found) {
    System.out.println("No match found.");
}

After running the code above, I entered the following:

Enter your regex:
j(a(va))

Enter input string to search:
this is java. this is ava, this is va

IDE output:

I found the text "java" starting at index 8 and ending at index 12.
Number of group: 2

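Your regex only matches the whole string java; it does not match ava or va. When it matches java, it sets capture group 1 to ava and capture group 2 to va, but it does not search for those substrings on their own. A regex that will produce the result you want is: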

j?(a?(va))

The ? makes the preceding item optional, so the pattern also matches ava and va when they appear without the leading j or a.
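To see what "sets capture group 1 and 2" means in practice, here is a minimal sketch that lists every group of every match for the original pattern. The class name CaptureGroupDemo and the helper describeGroups are made up for this illustration; only the java.util.regex calls are standard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptureGroupDemo {

    // Hypothetical helper: returns one line per group of every match,
    // formatted as: group N: "text" [start, end)
    static List<String> describeGroups(String regex, String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // Group 0 is the whole match; groups 1..groupCount() are the
            // parenthesized sub-patterns, filled in as part of that same match.
            for (int g = 0; g <= m.groupCount(); g++) {
                lines.add(String.format("group %d: \"%s\" [%d, %d)",
                        g, m.group(g), m.start(g), m.end(g)));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String input = "this is java. this is ava, this is va";
        // The original pattern from the question: a single match with two groups.
        describeGroups("j(a(va))", input).forEach(System.out::println);
        // Prints:
        // group 0: "java" [8, 12)
        // group 1: "ava" [9, 12)
        // group 2: "va" [10, 12)
    }
}
```

All three lines come from the same single match at index 8, which is why find() returns true only once while groupCount() still reports 2.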


You need the regex (j?(a?(va)))

Pattern p = Pattern.compile("(j?(a?(va)))");
Matcher m = p.matcher("this is java. this is ava, this is va");

while (m.find()) {
    String group = m.group();
    int start = m.start();
    int end = m.end();
    System.out.format("I found the text"
            + " \"%s\" starting at "
            + "index %d and ending at index %d.%n",
            group, start, end);
}
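As a quick check of the optional-prefix pattern, the sketch below collects each match together with its start and end index for the sample input from the question. The names OptionalPrefixDemo and findAll are invented for this example. Note that the actual indices for ava and va come out one higher than the ones the question expected, because start() is zero-based and end() is exclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalPrefixDemo {

    // Hypothetical helper: returns each match as: "text" [start, end)
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // start() is the zero-based index of the first matched character,
            // end() is the index just past the last matched character.
            matches.add(String.format("\"%s\" [%d, %d)",
                    m.group(), m.start(), m.end()));
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "this is java. this is ava, this is va";
        findAll("(j?(a?(va)))", input).forEach(System.out::println);
        // Prints:
        // "java" [8, 12)
        // "ava" [22, 25)
        // "va" [35, 37)
    }
}
```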
