grokdebugger validates log entries that Logstash eventually rejects

I used grokdebugger to adapt a pattern I found on the Internet, for my first attempt at parsing Spring Boot logs in the logback format.

Here is the log entry I submitted to grokdebugger:

2022-03-09 06:35:15,821 [http-nio-9090-exec-1] WARN  org.springdoc.core.OpenAPIService - found more than one OpenAPIDefinition class. springdoc-openapi will be using the first one found.

with this grok pattern:
(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[(?<thread>(.*?)+)\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)

and it returns the fields as expected:

{
  "timestamp": [
    [
      "2022-03-09 06:35:15,821"
    ]
  ],
  "YEAR": [
    [
      "2022"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "09"
    ]
  ],
  "TIME": [
    [
      "06:35:15,821"
    ]
  ],
  "HOUR": [
    [
      "06"
    ]
  ],
  "MINUTE": [
    [
      "35"
    ]
  ],
  "SECOND": [
    [
      "15,821"
    ]
  ],
  "thread": [
    [
      "http-nio-9090-exec-1"
    ]
  ],
  "level": [
    [
      "WARN"
    ]
  ],
  "class": [
    [
      "org.springdoc.core.OpenAPIService"
    ]
  ],
  "logmessage": [
    [
      "found more than one OpenAPIDefinition class. springdoc-openapi will be using the first one found."
    ]
  ]
}

But when I ask Logstash to do the same thing, with this input declaration in the configuration:

input {
    file {
        path => "/home/lebihan/dev/Java/comptes-france/metier-et-gestion/dev/ApplicationMetierEtGestion/sparkMetier.log"

        codec => multiline {
           pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
           negate => "true"
           what => "previous"
        }
    }
}

and this filter declaration:

filter {
  #If log line contains tab character followed by 'at' then we will tag that entry as stacktrace
  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }
 
  grok {
    match => [ "message",
               "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[(?<thread>(.*?)+)\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)"
             ]
  }
  
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS" ]
  }
}

parsing fails, and I don't know how to get more detail about the underlying error behind the _grokparsefailure tag.
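
As far as I can tell, grok does not say why a pattern failed to match; it only adds a tag to the event. One way to narrow the problem down is a throwaway pipeline that reads a pasted log line from stdin, runs only the one grok filter with its own failure tag, and prints the resulting event. This is a minimal sketch, not a definitive setup: the tag name _grok_logback_failure and the file name debug.conf are made up for illustration.

```
input {
  # Paste one log line on standard input and press Enter.
  stdin { }
}

filter {
  grok {
    match => { "message" => "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[(?<thread>(.*?)+)\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)" }
    # Hypothetical tag name, so that a failure of this filter is not confused
    # with the default _grokparsefailure tag that every other grok filter adds.
    tag_on_failure => ["_grok_logback_failure"]
  }
}

output {
  # Print every field and tag of the event to the console.
  stdout { codec => rubydebug }
}
```

Started with bin/logstash -f debug.conf, this shows for each pasted line either the extracted fields or the custom tag, which at least tells which filter rejected the line.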

The main cause of my trouble was writing:

grok {
      match => [ 

instead of:

grok {
      match => {

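But after that, I also had to:

  • change the timestamp definition to %{TIMESTAMP_ISO8601:timestamp}
  • change the date match pattern
  • add a target to the date filter

to avoid a _dateparsefailure. With these changes, an indexed event looks like this: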

@timestamp:
    Mar 16, 2022 @ 09:14:22.002
@version:
    1
class:
    f.e.service.AbstractSparkDataset
host:
    debian
level:
    INFO
logmessage:
    Un dataset a été sauvegardé dans le fichier parquet /data/tmp/balanceComptesCommunes_2019_2019.
thread:
    http-nio-9090-exec-10
timestamp:
    2022-03-16T06:34:09.394Z
_id:
    8R_KkX8BBIYNTaMw1Jfg
_index:
    ecoemploimetier-2022.03.16
_score:
    - 
_type:
    _doc 

I ended up correcting my Logstash configuration file like this:

input {
    file {
        path => "/home/[...]/myLog.log"

        sincedb_path => "/dev/null"
        start_position => "beginning"

        codec => multiline {
           pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
           negate => "true"
           what => "previous"
        }
    }
}

filter {
   #If log line contains tab character followed by 'at' then we will tag that entry as stacktrace
   if [message] =~ "\tat" {
      grok {
         match => ["message", "^(\tat)"]
         add_tag => ["stacktrace"]
      }
   }
 
   grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[(?<thread>(.*?)+)\] %{LOGLEVEL:level} %{GREEDYDATA:class} - (?<logmessage>.*)" }
   }
 
   date {
      # 2022-03-16 07:32:24,860
      match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
      target => "timestamp"
   }

   # If there is no parsing error, remove the original, unparsed message
   if "_grokparsefailure" not in [tags] {
      mutate {
         remove_field => [ "message", "path" ]
      }
   }
}

output {
    stdout { codec => rubydebug }

    elasticsearch {
        hosts => ["localhost:9200"]
        index => "ecoemploimetier-%{+YYYY.MM.dd}"
    }
}
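
Because the mutate above only removes message when grok succeeded, events that fail to parse still carry the raw log line. A possible addition, sketched under the assumption that a local file is an acceptable place to collect them (the path below is hypothetical), is a conditional output that sets the failed events aside for inspection:

```
output {
  # Keep events that failed grok or date parsing in a separate file.
  if "_grokparsefailure" in [tags] or "_dateparsefailure" in [tags] {
    file {
      # Hypothetical path; use any location the Logstash user can write to.
      path => "/tmp/logstash-parse-failures.log"
    }
  }
}
```

Each line of that file is then one failed event, including its message and tags, which makes it easier to compare the rejected lines against the pattern.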