Parsing a log4j log file using a regular expression

I wrote a Java application that parses log4j log files with a regular expression. It works for log lines like this one:

1999-11-27 15:49:37,459 [thread-x] ERROR mypackage - Catastrophic system failure

but not for this one:

2015-01-22 01:52:54,237 [http-bio-80-exec-5] FATAL   TestLog4jServlet - Show FATAL message

My log4j ConversionPattern is:

log4j.appender.Appender2.layout.ConversionPattern=%d [%t] %-7p %10c{1} - %m%n

Can anyone suggest a solution?

My code is below:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogParser {
    public static void main(String[] args) {
        // Backslashes must be doubled inside a Java string literal.
        String regex = "(\\d{4}-\\d{2}-\\d{2}) (\\d{2}:\\d{2}:\\d{2},\\d{3}) \\[(.*)\\] ([^ ]*) ([^ ]*) - (.*)$";

        Pattern p = Pattern.compile(regex);
        String[] samples = {
                "1999-11-27 15:49:37,459 [thread-x] ERROR mypackage - Catastrophic system failure",
                "2015-01-22 01:52:54,237 [http-bio-80-exec-5] FATAL   TestLog4jServlet - Show FATAL message"
            };

        // samples[1] is the line that fails to match.
        Matcher m = p.matcher(samples[1]);
        System.out.println(m.matches());
        if (m.matches() && m.groupCount() == 6) {
            String date = m.group(1);
            String time = m.group(2);
            String threadId = m.group(3);
            String priority = m.group(4);
            String category = m.group(5);
            String message = m.group(6);

            System.out.println("date: " + date);
            System.out.println("time: " + time);
            System.out.println("threadId: " + threadId);
            System.out.println("priority: " + priority);
            System.out.println("category: " + category);
            System.out.println("message: " + message);
        }
    }
}

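This happens because there is more than one space between FATAL and TestLog4jServlet (three in your sample line), while your regex allows exactly one. The extra spaces come from %-7p in your ConversionPattern, which pads the level to seven characters. Replace that single space in the regex with " +" (a space followed by +), which allows one or more spaces.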

(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2},\d{3}) \[(.*?)\] ([^ ]*) +([^ ]*) - (.*)$
                                                                ^
                                                                |


As a Java string literal (with the backslashes doubled), the regex is:

"(\\d{4}-\\d{2}-\\d{2}) (\\d{2}:\\d{2}:\\d{2},\\d{3}) \\[(.*?)\\] ([^ ]*) +([^ ]*) - (.*)$"

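For reference, here is a minimal sketch of the fix applied to both sample lines from the question. The class name Log4jLineParser and the helper parse are made up for illustration; only the regex itself comes from the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question's code.
class Log4jLineParser {
    // " +" between the level and the category tolerates the padding
    // that %-7p adds after short level names.
    private static final Pattern LINE = Pattern.compile(
        "(\\d{4}-\\d{2}-\\d{2}) (\\d{2}:\\d{2}:\\d{2},\\d{3}) \\[(.*?)\\] ([^ ]*) +([^ ]*) - (.*)$");

    // Returns {date, time, thread, priority, category, message},
    // or null when the line does not match the pattern.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1); // group 0 is the whole match
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] samples = {
            "1999-11-27 15:49:37,459 [thread-x] ERROR mypackage - Catastrophic system failure",
            "2015-01-22 01:52:54,237 [http-bio-80-exec-5] FATAL   TestLog4jServlet - Show FATAL message"
        };
        for (String sample : samples) {
            String[] fields = parse(sample);
            System.out.println(fields == null
                ? "no match: " + sample
                : String.join(" | ", fields));
        }
    }
}
```

Both sample lines should match with this pattern. If your messages can span several lines (stack traces, for example), this line-by-line approach will not capture the continuation lines, so treat it as a starting point only.
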
I think Logstash is better suited to parsing logs.