
Logstash will not process logs that span multiple lines

I am trying to parse some local log files with the ELK Stack, which I am running on my Windows machine. Here is a sample of the log I am trying to parse.

2015-12-10 13:50:25,487 [http-nio-8080-exec-26] INFO  a.b.c.v1.myTestClass [abcde-1234-12345-b425-12ad]- This Message is OK
2015-12-10 13:50:26,487 [http-nio-8080-exec-26] INFO  a.b.c.v1.myTestClass [abcde-1234-12345-b425-12ad]- Journe
y road update: <rows>
     <row adi="D" date="2015-12-10" garage="TOP">
          <codeNum order="1">TP</codeNum>
          <number order="1">1001</number>
          <journeystatus code="RT">OnRoute</journeystatus>
     </row>
</rows>

The first message passes through the filter fine, but the second one gets split into multiple events, each tagged with _grokparsefailure.

Logstash configuration file:

input {
    file {
        path => "C:/data/sampleLogs/temp.log"
        type => "testlog"
        start_position => "beginning"
    }
}

filter {
    grok {
        # Parse timestamp data. We need the "(?m)" so that grok (Oniguruma internally) correctly parses multi-line events
        match => [ "message", [
            "(?m)%{TIMESTAMP_ISO8601:logTimestamp}[ ;]\[%{DATA:threadId}\][ ;]%{LOGLEVEL:logLevel}[ ;]+%{JAVACLASS:JavaClass}[ ;]%{SYSLOG5424SD:TransactionID}[ ;]*%{GREEDYDATA:LogMessage}",
            "(?m)%{TIMESTAMP_ISO8601:logTimestamp}[ ;]\[%{DATA:threadId}\][ ;]%{LOGLEVEL:logLevel}[ ;]+%{JAVAFILE:JavaClass}[ ;]%{SYSLOG5424SD:TransactionID}[ ;]*%{GREEDYDATA:LogMessage}"
            ]
        ]
    }
    # The timestamp may have commas instead of dots. Convert so as to store everything in the same way
    mutate {
        gsub => [
            # replace all commas with dots
            "logTimestamp", ",", "."
            ]
    }

    mutate {
        gsub => [
            # make the logTimestamp sortable. With a space, it is not! This does not work that well, in the end
            # but somehow apparently makes things easier for the date filter
            "logTimestamp", " ", ";"
            ]
    }

    date {
        locale => "en"
        timezone => "UTC"
        match => [ "logTimestamp", "YYYY-MM-dd;HH:mm:ss.SSS" ]
        target => "@timestamp"
    }

    mutate {
        add_field => { "debug-timestamp" => "timestampMatched"}
    }
}

output {
    stdout {
        codec => rubydebug
    }   
}

When I run

bin\logstash agent -f \ELK-Stack\logstash\conf_input.conf

the CMD prompt returns the following:

io/console not supported; tty will not be manipulated
Default settings used: Filter workers: 4
Logstash startup completed
{
            "message" => "     <row adi=\"D\" date=\"2015-12-10\" garage=\"TOP\">\r",
           "@version" => "1",
         "@timestamp" => "2015-12-11T12:49:34.268Z",
               "host" => "GMAN",
               "path" => "C:/data/sampleLogs/temp.log",
               "type" => "testlog",
               "tags" => [
        [0] "_grokparsefailure"
    ],
    "debug-timestamp" => "timestampMatched"
}
{
            "message" => "          <codeNum order=\"1\">TP</codeNum>\r",
           "@version" => "1",
         "@timestamp" => "2015-12-11T12:49:34.268Z",
               "host" => "GMAN",
               "path" => "C:/data/sampleLogs/temp.log",
               "type" => "testlog",
               "tags" => [
        [0] "_grokparsefailure"
    ],
    "debug-timestamp" => "timestampMatched"
}
{
            "message" => "          <number order=\"1\">1001</number>\r",
           "@version" => "1",
         "@timestamp" => "2015-12-11T12:49:34.268Z",
               "host" => "GMAN",
               "path" => "C:/data/sampleLogs/temp.log",
               "type" => "testlog",
               "tags" => [
        [0] "_grokparsefailure"
    ],
    "debug-timestamp" => "timestampMatched"
}
{
            "message" => "          <journeystatus code=\"RT\">OnRoute</journeystatus>\r",
           "@version" => "1",
         "@timestamp" => "2015-12-11T12:49:34.278Z",
               "host" => "GMAN",
               "path" => "C:/data/sampleLogs/temp.log",
               "type" => "testlog",
               "tags" => [
        [0] "_grokparsefailure"
    ],
    "debug-timestamp" => "timestampMatched"
}
{
            "message" => "     </row>\r",
           "@version" => "1",
         "@timestamp" => "2015-12-11T12:49:34.278Z",
               "host" => "GMAN",
               "path" => "C:/data/sampleLogs/temp.log",
               "type" => "testlog",
               "tags" => [
        [0] "_grokparsefailure"
    ],
    "debug-timestamp" => "timestampMatched"
}
{
            "message" => "y road update: <rows>\r",
           "@version" => "1",
         "@timestamp" => "2015-12-11T12:49:34.268Z",
               "host" => "GMAN",
               "path" => "C:/data/sampleLogs/temp.log",
               "type" => "testlog",
               "tags" => [
        [0] "_grokparsefailure"
    ],
    "debug-timestamp" => "timestampMatched"
}
{
            "message" => "2015-12-10 13:50:25,487 [http-nio-8080-exec-26] INFO  a.b.c.v1.myTestClass [abcde-1234-12345-b425-12ad]- Journe\r",
           "@version" => "1",
         "@timestamp" => "2015-12-10T13:50:25.487Z",
               "host" => "GMAN",
               "path" => "C:/data/sampleLogs/temp.log",
               "type" => "testlog",
       "logTimestamp" => "2015-12-10;13:50:25.487",
           "threadId" => "http-nio-8080-exec-26",
           "logLevel" => "INFO",
          "JavaClass" => "a.b.c.v1.myTestClass",
      "TransactionID" => "[abcde-1234-12345-b425-12ad]",
         "LogMessage" => "- Journe\r",
    "debug-timestamp" => "timestampMatched"
}
{
            "message" => "</rows>2015-12-10 13:50:25,487 [http-nio-8080-exec-26] INFO  a.b.c.v1.myTestClass [abcde-1234-12345-b425-12ad]- This Message is OK\r",
           "@version" => "1",
         "@timestamp" => "2015-12-10T13:50:25.487Z",
               "host" => "GMAN",
               "path" => "C:/data/sampleLogs/temp.log",
               "type" => "testlog",
       "logTimestamp" => "2015-12-10;13:50:25.487",
           "threadId" => "http-nio-8080-exec-26",
           "logLevel" => "INFO",
          "JavaClass" => "a.b.c.v1.myTestClass",
      "TransactionID" => "[abcde-1234-12345-b425-12ad]",
         "LogMessage" => "- This Message is OK\r",
    "debug-timestamp" => "timestampMatched"
}

I did try adding a multiline filter at the top of my filter block, but it did not work; it just produced the error below after grok.

multiline {
        pattern => "^201*-**-**- **:**:"
        what => "previous"
        negate=> true
    }

This did not help; it just kept giving me an error message like:

Error: Cannot use more than 1 filter worker because the following plugins don't
work with more than one worker: multiline
You may be interested in the '--configtest' flag which you can
use to validate logstash's configuration before you choose
to restart a running system.
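As an aside, the pattern in that filter mixes shell-style wildcards with regex syntax; multiline patterns are regular expressions and may also use grok pattern names, so a regex form of the same anchor might look like this (a sketch only, not tested against the full config above):

    multiline {
        # anchor on lines that begin with an ISO8601 timestamp, e.g. "2015-12-10 13:50:25,487"
        pattern => "^%{TIMESTAMP_ISO8601}"
        what => "previous"
        negate => true
    }

Even with a valid pattern, though, the filter still triggers the single-filter-worker error shown above.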

So I tried running with the --configtest flag as suggested, but got the same error message again:

Error: Cannot use more than 1 filter worker because the following plugins don't
work with more than one worker: multiline

Can anyone help me fix this and get Logstash to handle multi-line logs?

Any help is greatly appreciated.

UPDATE

As @Alain Collins suggested, I am now using the multiline codec; this is what my input config looks like.

input {
    file {
        path => "C:/data/sampleLogs/mulline.log"
        codec => multiline {
            # Grok pattern names are valid! :)
            pattern => "^%{TIMESTAMP_ISO8601} "
            negate => true
            what => previous
        }
        type => "testlog"
        start_position => "beginning"
    }
}


You found the right solution: multiline. Those lines need to be combined into a single event.

As you discovered, the multiline filter is not thread-safe, so you can only run that Logstash instance with one filter worker.
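If you do stick with the multiline filter, you can force a single filter worker on the command line with the -w flag (sketch, using the same paths as above):

    bin\logstash agent -f \ELK-Stack\logstash\conf_input.conf -w 1

The codec approach below avoids this limitation entirely, so it is usually the better choice.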

There is a multiline codec that may work for you. It combines the lines as part of the input{} stage and passes a single event on to the filter{} stage.

Note that you can use Logstash patterns with multiline, so "^%{YEAR}" is better than "^201".

Finally, keep an eye on filebeat, the replacement for logstash-forwarder. They say client-side multiline support is planned, so messages would be sent from the client as a single event rather than having to be reassembled by Logstash.
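For reference, once that support landed, the client-side configuration in filebeat.yml looked roughly like this (a sketch based on later filebeat releases; option names may differ in your version, and the path is illustrative):

    filebeat.prospectors:
      - paths:
          - C:\data\sampleLogs\*.log
        # join any line that does NOT start with a date onto the previous line
        multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2} '
        multiline.negate: true
        multiline.match: after

This mirrors the multiline codec settings above, just applied at the shipper instead of at Logstash.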