"_grokparsefailure" even though the grok pattern works

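I am trying to parse different log lines from two different types of files: slave and master. I tested my patterns in the Grok Debugger and they work fine, but the tags field in Kibana contains _grokparsefailure.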

Here is my configuration file:

input {
    file { 
        type => "slave"
        path => "/home/mathis/Documents/**/intranet*.log"
        exclude =>"*8402.log"
        sincedb_path => '/dev/null'
        start_position => beginning
    }
    file { 
        type => "master"
        path => "/home/mathis/Documents/**/intranet*8402.log"
        sincedb_path => '/dev/null'
    }
}
filter {
    if [type] == "slave" {
        grok {
            match => { "message" => ["\[%{DATESTAMP:eventtime}\] \- %{USERNAME:user} \- %{IPV4:clientip} \- %{NUMBER} \- %{WORD} %{NUMBER:exectime} %{WORD} %{NUMBER:time} %{GREEDYDATA:data} %{NUMBER:waittime}","\[%{DATESTAMP:eventtime}\] \- Process status database sync \- %{WORD}\.%{WORD}\.%{WORD}\:%{NUMBER:slavenumb}\(\#%{NUMBER}\) \(load %{NUMBER:nbutilisateur} grace period 5 minutes\) %{GREEDYDATA}"] }
            remove_field => "message"
        }
        date {
            match => [ "eventtime", "dd/MM/YYYY HH:mm:ss.SSS" ]
            target => "@timestamp"
        }
    }
    if [type] == "master" {
        grok {
            match => { "message" => ["%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}(?<starttime>((?!<[0-9])%{HOUR}:)?%{MINUTE}(?::%{SECOND})(?![0-9]))"] }
            remove_field => "message"
        }
        date {
            match => [ "starttime", "HH:mm:ss", "mm:ss" ]
        }
    }
        
}
output {
    elasticsearch {
        hosts => "127.0.0.1:9200"
        index => "logstash-local3-%{+YYYY.MM.dd}"
    }
}

Here are the 3 log lines I want to parse (they are listed in the same order as the groks in my conf file):

[24/06/2020 21:57:29.548] - Process status database sync - us1salx08167.corpnet2.com:8100(#53738) (load 0 grace period 5 minutes) : current date 2020/06/24 21:57:29 update date 2020/06/24 21:55:44 old state OK new state OK

[29/05/2020 07:41:51.354] - ih912865 - 10.104.149.128 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms

   31730  31626  464 10970020     52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2

I don't know whether you have already solved this, but here is something you can use.

N.B. I added a few extra fields, but you can easily remove those with the mutate filter's remove_field option [https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-remove_field].
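As a minimal sketch of that removal: the fragment below assumes the "extra" fields are `zz`, `ctimeunits` and `waittimeunits`, as named in the patterns further down in this answer. Adjust the list to whichever fields you do not want to keep.

```
filter {
    mutate {
        # Assumed list of extra fields, taken from the patterns in this answer.
        # Removing a field that is absent from an event is harmless.
        remove_field => [ "zz", "ctimeunits", "waittimeunits" ]
    }
}
```

Placing this after the grok blocks keeps the captures available to any filter that still needs them before they are dropped.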

When I tried the expressions you provided, one of them actually failed in the grok debugger, so I rewrote them from scratch while keeping the variable names.

I noticed that there is a lot of data you are not capturing at all. Let me know if you want to capture more.

Line 1:

    [24/06/2020 21:57:29.548] - Process status database sync - us1salx08167.corpnet2.com:8100(#53738) (load 0 grace period 5 minutes) : current date 2020/06/24 21:57:29 update date 2020/06/24 21:55:44 old state OK new state OK

Pattern 1:

    \[(?<eventtime>%{DATESTAMP})\] - Process status database sync - (?<host>%{HOSTNAME}):(?<slavenumber>%{NUMBER})(?<zz>\(#[\d]+\)) \(load (?<nbutilisateur>%{NUMBER}) grace period 5 minutes\)%{GREEDYDATA}

Line 2:

    [29/05/2020 07:41:51.354] - ih912865 - 10.104.149.128 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms

Pattern 2:

    \[(?<eventtime>%{DATESTAMP})\] - (?<user>%{USER}) - (?<clientip>%{IPV4}) - %{NUMBER} - %{WORD} (?<exectime>%{NUMBER}) %{WORD} (?<ctime>%{NUMBER}) (?<ctimeunits>%{WORD}) wait time (?<waittime>%{NUMBER}) (?<waittimeunits>%{WORD})

Line 3:

       31730  31626  464 10970020     52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2

Pattern 3:

    %{GREEDYDATA}(?<starttime>(?<=[\s])([\d]+:[\d]+))%{GREEDYDATA}
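
For reference, here is a rough sketch of how the three patterns could be dropped into the filter section of the configuration from the question. It is untested against a live pipeline and makes a few assumptions: the `type` conditionals and the `eventtime` date filter are copied from the question, the `date` filter for `starttime` is left out, and the `remove_field => "message"` lines are omitted so the raw line stays visible while debugging. Note also that `host` is a field name Logstash inputs commonly populate themselves, so you may want to rename that capture.

```
filter {
    if [type] == "slave" {
        grok {
            # Patterns are tried in order; grok stops at the first one that matches.
            match => {
                "message" => [
                    "\[(?<eventtime>%{DATESTAMP})\] - Process status database sync - (?<host>%{HOSTNAME}):(?<slavenumber>%{NUMBER})(?<zz>\(#[\d]+\)) \(load (?<nbutilisateur>%{NUMBER}) grace period 5 minutes\)%{GREEDYDATA}",
                    "\[(?<eventtime>%{DATESTAMP})\] - (?<user>%{USER}) - (?<clientip>%{IPV4}) - %{NUMBER} - %{WORD} (?<exectime>%{NUMBER}) %{WORD} (?<ctime>%{NUMBER}) (?<ctimeunits>%{WORD}) wait time (?<waittime>%{NUMBER}) (?<waittimeunits>%{WORD})"
                ]
            }
        }
        date {
            # Same date format as in the question's configuration.
            match => [ "eventtime", "dd/MM/YYYY HH:mm:ss.SSS" ]
            target => "@timestamp"
        }
    }
    if [type] == "master" {
        grok {
            match => { "message" => "%{GREEDYDATA}(?<starttime>(?<=[\s])([\d]+:[\d]+))%{GREEDYDATA}" }
        }
    }
}
```

Any line in the slave files that matches neither of the two slave patterns will still be tagged _grokparsefailure, so it is worth checking which raw messages carry the tag before concluding that a pattern is wrong.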