Matching several capturing groups does not give the expected results
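I am trying to learn Java regular expressions. I want to match a pattern with several capturing groups (namely j(a(va))) against another string (namely this is java. this is ava, this is va). I expected the output to be: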
I found the text "java" starting at index 8 and ending at index 12.
I found the text "ava" starting at index 21 and ending at index 24.
I found the text "va" starting at index 34 and ending at index 36.
Number of group: 2
However, the IDE prints only:
I found the text "java" starting at index 8 and ending at index 12.
Number of group: 2
Why is that? Is there something I am missing?
Original code:
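import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is a placeholder; the original snippet showed only the method body.
public class RegexTest {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("\nEnter your regex:");
        Pattern pattern = Pattern.compile(br.readLine());
        System.out.println("\nEnter input string to search:");
        Matcher matcher = pattern.matcher(br.readLine());
        boolean found = false;
        // Each find() call advances to the next match of the whole pattern.
        while (matcher.find()) {
            System.out.format("I found the text \"%s\" starting at index %d and ending at index %d.%n",
                    matcher.group(), matcher.start(), matcher.end());
            found = true;
            System.out.println("Number of group: " + matcher.groupCount());
        }
        if (!found) {
            System.out.println("No match found.");
        }
    }
}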
After running the code above, I entered the following:
Enter your regex:
j(a(va))
Enter input string to search:
this is java. this is ava, this is va
IDE output:
I found the text "java" starting at index 8 and ending at index 12.
Number of group: 2
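Your regex only matches the whole string java; it does not match ava or va. When it matches java, it sets capturing group 1 to ava and capturing group 2 to va, but it does not look for those strings on their own. A regex that produces the result you want is: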
j?(a?(va))
The ? makes the preceding item optional, so the pattern also matches the later occurrences that lack those prefixes.
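To make the point about groups concrete, here is a minimal sketch that lists every group of every match for the original pattern. The class name GroupDemo and the helper describeGroups are hypothetical names introduced only for this illustration; the Matcher calls (group(int), start(int), end(int), groupCount()) are standard java.util.regex methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class GroupDemo {
    // Returns one line per group (0..groupCount) for every match found in input.
    static List<String> describeGroups(String regex, String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // Group 0 is the whole match; groups 1..n are the parenthesized parts.
            for (int g = 0; g <= m.groupCount(); g++) {
                lines.add(String.format("group %d = \"%s\" [%d, %d)",
                        g, m.group(g), m.start(g), m.end(g)));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String input = "this is java. this is ava, this is va";
        describeGroups("j(a(va))", input).forEach(System.out::println);
    }
}
```

With j(a(va)) this should print three lines, all belonging to the single match java: group 0 is java at [8, 12), group 1 is ava at [9, 12), and group 2 is va at [10, 12). The standalone ava and va later in the input never appear, because the pattern requires the leading j.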
You need the regex (j?(a?(va)))
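import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is a placeholder; the original snippet showed only the method body.
public class Demo {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("(j?(a?(va)))");
        Matcher m = p.matcher("this is java. this is ava, this is va");
        while (m.find()) {
            String group = m.group(); // the entire match (group 0)
            int start = m.start();
            int end = m.end();
            System.out.format("I found the text \"%s\" starting at index %d and ending at index %d.%n",
                    group, start, end);
        }
    }
}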
You can see a demo here
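The two suggested patterns differ only in the outer pair of parentheses. As a rough sketch (the class OuterGroupDemo and its helpers matches and groupCount are hypothetical names for this illustration), the following compares what each pattern finds and how many capturing groups it declares.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answers.
class OuterGroupDemo {
    // Collects every full match (group 0) together with its start and end index.
    static List<String> matches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(String.format("%s[%d,%d)", m.group(), m.start(), m.end()));
        }
        return found;
    }

    // Number of capturing groups declared in the pattern (group 0 is not counted).
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String input = "this is java. this is ava, this is va";
        for (String regex : new String[] {"j?(a?(va))", "(j?(a?(va)))"}) {
            System.out.println(regex + " -> " + matches(regex, input)
                    + ", groups: " + groupCount(regex));
        }
    }
}
```

Both patterns should find the same three matches: java at [8,12), ava at [22,25), and va at [35,37). Note that these zero-based indices differ slightly from the ones guessed in the question for ava and va. The only difference between the patterns is the group count: the outer parentheses add a third group whose content equals the whole match, so groupCount() reports 3 instead of 2.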