RegEx for capturing particular digits

From the log below, how can I grep only the '951792' value?

2019 May 22 03:32:17.952296 france1v4 sh[4937]: 190522-03:32:17.951792 [mod=REC, lvl=INFO] [tid=26130] Recording A8602096210405800406L200218680503121519 size is 4145956224 bytes
2019 May 22 03:32:17.952387 france1v4 sh[4937]: 190522-03:32:17.951895 [mod=REC, lvl=INFO] [tid=26130] RecordingInfo = fffocap://0x401e
2019 May 22 03:32:17.952466 france1v4 sh[4937]: 190522-03:32:17.951934 [mod=REC, lvl=INFO] [tid=26130] recording_dvr_from_recording_info:physicalSegmentCount=10   

I tried Java split/substring operations, but that takes a lot of lines of code. How can I get the '951792' value using a regular expression?

The output would be:

951792
951895
951934 
075041

You can try the following regex:

(?<=[0-9]{6}-[0-9]{2}:[0-9]{2}:[0-9]{2}\.)[0-9]+

When adding this to Java code, don't forget to double-escape the dot: \. becomes \\. inside a string literal.

Input:

2019 May 22 03:32:17.952296 france1v4 sh[4937]: 190522-03:32:17.951792 [mod=REC, lvl=INFO] [tid=26130] Recording A8602096210405800406L200218680503121519 size is 4145956224 bytes
2019 May 22 03:32:17.952387 france1v4 sh[4937]: 190522-03:32:17.951895 [mod=REC, lvl=INFO] [tid=26130] RecordingInfo = fffocap://0x401e
2019 May 22 03:32:17.952466 france1v4 sh[4937]: 190522-03:32:17.951934 [mod=REC, lvl=INFO] [tid=26130] recording_dvr_from_recording_info:physicalSegmentCount=10 

Matches:

951792
951895
951934 

Demo 1

For a stricter regex that uses both a lookbehind and a lookahead, use:

(?<=[0-9]\]:\s[0-9]{6}-[0-9]{2}:[0-9]{2}:[0-9]{2}\.)[0-9]+(?=\s\[mod=REC)

Demo 2
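The stricter pattern above can be tried from Java roughly as follows. This is a minimal sketch: the class name StrictMicrosDemo and the helper extract are made up for illustration, and the sample input is two of the log lines from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption, not part of the answer.
class StrictMicrosDemo {
    // Stricter pattern from above. Every regex backslash is doubled
    // because it sits inside a Java string literal.
    private static final Pattern STRICT = Pattern.compile(
        "(?<=[0-9]\\]:\\s[0-9]{6}-[0-9]{2}:[0-9]{2}:[0-9]{2}\\.)[0-9]+(?=\\s\\[mod=REC)");

    // Collects every match of the stricter pattern in the given text.
    static List<String> extract(String log) {
        List<String> out = new ArrayList<>();
        Matcher m = STRICT.matcher(log);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String log =
            "2019 May 22 03:32:17.952387 france1v4 sh[4937]: 190522-03:32:17.951895 [mod=REC, lvl=INFO] [tid=26130] RecordingInfo = fffocap://0x401e\n"
          + "2019 May 22 03:32:17.952466 france1v4 sh[4937]: 190522-03:32:17.951934 [mod=REC, lvl=INFO] [tid=26130] recording_dvr_from_recording_info:physicalSegmentCount=10";
        System.out.println(extract(log)); // prints [951895, 951934]
    }
}
```

Because of the lookahead, a line whose module is not REC would produce no match, which is the point of the stricter form.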

Java code example:

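import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "2019 May 22 03:32:17.952296 france1v4 sh[4937]: 190522-03:32:17.951792 [mod=REC, lvl=INFO] [tid=26130] Recording A8602096210405800406L200218680503121519 size is 4145956224 bytes\n" +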
                "2019 May 22 03:32:17.952387 france1v4 sh[4937]: 190522-03:32:17.951895 [mod=REC, lvl=INFO] [tid=26130] RecordingInfo = fffocap://0x401e\n" + 
                "2019 May 22 03:32:17.952466 france1v4 sh[4937]: 190522-03:32:17.951934 [mod=REC, lvl=INFO] [tid=26130] recording_dvr_from_recording_info:physicalSegmentCount=10   ";
List<String> matches = new ArrayList<>();
// In a Java string literal the regex \. must be written as \\.
Matcher m = Pattern.compile("(?<=[0-9]{6}-[0-9]{2}:[0-9]{2}:[0-9]{2}\\.)[0-9]+")
        .matcher(input);
while (m.find()) {
    matches.add(m.group());
}
System.out.println(matches);

Code output:

[951792, 951895, 951934]

// Iterate over the input line by line in a loop.

String line = "2019 May 22 03:32:17.952296 france1v4 sh[4937]: 190522-03:32:17.951792 [mod=REC, lvl=INFO] [tid=26130] Recording A8602096210405800406L200218680503121519 size is 4145956224 bytes";
// In a Java string literal, regex backslashes must be doubled.
String pattern = "^.+\\.(\\d+)";

// Create a Pattern object
Pattern r = Pattern.compile(pattern);

// Now create the Matcher object
Matcher m = r.matcher(line);
if (m.find()) {
    System.out.println("Found value: " + m.group(1)); // prints 951792 for this line
} else {
    System.out.println("NO MATCH");
}

See the regex reference here: https://regex101.com/r/8F0D4w/1
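The single-line check above could be wrapped in a loop over all lines, along these lines. This is only a sketch: the class name LineByLineDemo and the helper extractPerLine are illustrative assumptions, and the input is split on "\n". Note that the pattern keeps the digits after the last dot on each line that is followed by a digit, so it relies on the message text containing no later "dot plus digits" sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption, not part of the answer.
class LineByLineDemo {
    // Greedy .+ backtracks to the last dot that is followed by digits.
    private static final Pattern LAST_FRACTION = Pattern.compile("^.+\\.(\\d+)");

    // Applies the pattern to each line separately and keeps capture group 1.
    static List<String> extractPerLine(String log) {
        List<String> out = new ArrayList<>();
        for (String line : log.split("\n")) {
            Matcher m = LAST_FRACTION.matcher(line);
            if (m.find()) {
                out.add(m.group(1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String log =
            "2019 May 22 03:32:17.952387 france1v4 sh[4937]: 190522-03:32:17.951895 [mod=REC, lvl=INFO] [tid=26130] RecordingInfo = fffocap://0x401e\n"
          + "2019 May 22 03:32:17.952466 france1v4 sh[4937]: 190522-03:32:17.951934 [mod=REC, lvl=INFO] [tid=26130] recording_dvr_from_recording_info:physicalSegmentCount=10";
        System.out.println(extractPerLine(log)); // prints [951895, 951934]
    }
}
```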

Here, we might simply use the right boundary next to our desired digits, [mod, and collect the digits in our first capturing group, maybe similar to:

([0-9]+)\s\[m 
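As a rough Java sketch of this capturing-group approach (the class name RightBoundaryDemo and the helper extract are hypothetical, and the input is two of the log lines from the question), the digits are read from group 1 rather than from the whole match:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption, not part of the answer.
class RightBoundaryDemo {
    // Digits followed by one whitespace character and "[m"; backslashes doubled for Java.
    private static final Pattern BEFORE_MOD = Pattern.compile("([0-9]+)\\s\\[m");

    // Returns capture group 1 of every match, i.e. the digits without " [m".
    static List<String> extract(String log) {
        List<String> out = new ArrayList<>();
        Matcher m = BEFORE_MOD.matcher(log);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String log =
            "2019 May 22 03:32:17.952387 france1v4 sh[4937]: 190522-03:32:17.951895 [mod=REC, lvl=INFO] [tid=26130] RecordingInfo = fffocap://0x401e\n"
          + "2019 May 22 03:32:17.952466 france1v4 sh[4937]: 190522-03:32:17.951934 [mod=REC, lvl=INFO] [tid=26130] recording_dvr_from_recording_info:physicalSegmentCount=10";
        System.out.println(extract(log)); // prints [951895, 951934]
    }
}
```

Using find() with a group avoids the need for replaceAll, at the cost of depending on " [m" appearing only right after the wanted digits.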

If we wish, we can add more boundaries, such as:

(.+?)([0-9]+)\s\[m.+

DEMO

Test

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Regex backslashes are doubled inside the Java string literal
final String regex = "(.+?)([0-9]+)\\s\\[m.+";
final String string = "2019 May 22 03:32:17.952296 france1v4 sh[4937]: 190522-03:32:17.951792 [mod=REC, lvl=INFO] [tid=26130] Recording A8602096210405800406L200218680503121519 size is 4145956224 bytes\n"
     + "2019 May 22 03:32:17.952387 france1v4 sh[4937]: 190522-03:32:17.951895 [mod=REC, lvl=INFO] [tid=26130] RecordingInfo = fffocap://0x401e\n"
     + "2019 May 22 03:32:17.952466 france1v4 sh[4937]: 190522-03:32:17.951934 [mod=REC, lvl=INFO] [tid=26130] recording_dvr_from_recording_info:physicalSegmentCount=10   \n";
// Java's replaceAll refers to capture group 2 as $2
final String subst = "$2";

final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);

// The substituted value will be contained in the result variable
final String result = matcher.replaceAll(subst);

System.out.println("Substitution result: " + result);

Demo

const regex = /(.+?)([0-9]+)\s\[m.+/gm;
const str = `2019 May 22 03:32:17.952296 france1v4 sh[4937]: 190522-03:32:17.951792 [mod=REC, lvl=INFO] [tid=26130] Recording A8602096210405800406L200218680503121519 size is 4145956224 bytes
2019 May 22 03:32:17.952387 france1v4 sh[4937]: 190522-03:32:17.951895 [mod=REC, lvl=INFO] [tid=26130] RecordingInfo = fffocap://0x401e
2019 May 22 03:32:17.952466 france1v4 sh[4937]: 190522-03:32:17.951934 [mod=REC, lvl=INFO] [tid=26130] recording_dvr_from_recording_info:physicalSegmentCount=10   
`;
const subst = `$2`; // keep only capture group 2 (the wanted digits)

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);

RegEx

If this expression isn't what you need, it can be modified or changed at regex101.com.

RegEx Circuit

jex.im visualizes regular expressions.