grok:我在哪里定义自定义 grok 标签

grok: where do I define custom grok tags

我需要使用 Logstash 解析 nginx 日志,我发现了这个问题:

Nginx grok pattern for logstash

我想尝试问题中的模式,所以我创建了包含以下内容的配置文件:

input {
  file {
    path => "/var/log/nginx/access.log"
    type => "access"
  }
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)\"%{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NOTSPACE:querystring} (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})%{QS:agent}  %{IPORHOST:forwardedfor} %{IPORHOST:host} %{NUMBER:upstreamresponse} (?:-|%{NUMBER:cache})'
    }
  }
}

output {
  stdout {}
}

但我收到错误:

[2021-03-24T13:25:53,424][ERROR][logstash.javapipeline    ][main] Pipeline error {:pipeline_id=>"main", :exception=>#<Grok::PatternError: pattern %{NGUSER:ident} not defined>, :backtrace=>["/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:123:in `block in compile'", "org/jruby/RubyKernel.java:1442:in `loop'", "/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:93:in `compile'", "/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:282:in `block in register'", "org/jruby/RubyArray.java:1809:in `each'", "/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:276:in `block in register'", "org/jruby/RubyHash.java:1415:in `each'", "/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:271:in `register'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:75:in `register'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:228:in `block in register_plugins'", "org/jruby/RubyArray.java:1809:in `each'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:227:in `register_plugins'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:586:in `maybe_setup_out_plugins'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:240:in `start_workers'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:185:in `run'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:137:in `block in start'"], "pipeline.sources"=>["/home/ubuntu/logstash-7.12.0/config/pipeline.conf"], :thread=>"#<Thread:0x175b972c run>"}

在这种特殊情况下,我想让 NGUSER 等于 [a-zA-Z\.\@\-\+_%]+

所以我的问题来了:我在哪里定义自定义 grok 标签?

您可以使用 grok 过滤器的 pattern_definitions 选项

grok {
    pattern_definition => {
        "NGUSER" => "[a-zA-Z\.\@\-\+_%]+"
    }
    match => { "message" => '%{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)\"%{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NOTSPACE:querystring} (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})%{QS:agent}  %{IPORHOST:forwardedfor} %{IPORHOST:host} %{NUMBER:upstreamresponse} (?:-|%{NUMBER:cache})' }
}

或者,如果您认为您可能想要在多个实例之间共享模式,您可以使用 patterns_dir 选项

grok {
    patterns_dir => [ "/home/user/patterns/" ]
    match => { "message" => '%{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)\"%{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NOTSPACE:querystring} (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})%{QS:agent}  %{IPORHOST:forwardedfor} %{IPORHOST:host} %{NUMBER:upstreamresponse} (?:-|%{NUMBER:cache})' }
}

并在该目录中创建一个包含

的文件
NGUSER [a-zA-Z\.\@\-\+_%]+