Can't force GROK parser to enforce integer/float types on haproxy logs

It doesn't matter whether I use integer/long or float: fields like time_duration (all the time_* fields, really) are mapped as strings in the Kibana logstash index.

I tried using mutate (https://www.elastic.co/blog/little-logstash-lessons-part-using-grok-mutate-type-data) and that didn't work either.

How can I properly enforce numeric types instead of strings on these fields?

My /etc/logstash/conf.d/haproxy.conf:

input {
  syslog {
    type => haproxy
    port => 5515
  }
}
filter {
  if [type] == "haproxy" { 
    grok {
      patterns_dir => "/usr/local/etc/logstash/patterns"
      match => ["message", "%{HAPROXYHTTP}"]
      named_captures_only => true
    }
    geoip {
      source => "client_ip"
      target => "geoip"
      database => "/etc/logstash/GeoLiteCity.dat"
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
    }
    mutate {
      convert => [ "[geoip][coordinates]", "float"]
    }
  }
}

My HAPROXYHTTP pattern:

HAPROXYHTTP  %{IP:client_ip}:%{INT:client_port} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{INT:time_request:int}/%{INT:time_queue:int}/%{INT:time_backend_connect:int}/%{INT:time_backend_response:int}/%{NOTSPACE:time_duration:int} %{INT:http_status_code} %{NOTSPACE:bytes_read:int} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{INT:actconn:int}/%{INT:feconn:int}/%{INT:beconn:int}/%{INT:srvconn:int}/%{NOTSPACE:retries:int} %{INT:srv_queue:int}/%{INT:backend_queue:int} (\{%{HAPROXYCAPTUREDREQUESTHEADERS}\})?( )?(\{%{HAPROXYCAPTUREDRESPONSEHEADERS}\})?( )?"(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?"

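Most likely Logstash is doing the right thing here (your configuration looks correct), but how Elasticsearch maps the fields is another matter. If a field has already been dynamically mapped as a string by one document in an Elasticsearch index, subsequent documents added to the same index will also have that field mapped as a string, even if the value is an integer or a float in the source document. To change this you have to reindex, but with time-series-based Logstash indexes you can simply wait until the next day, when a new index is created.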
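To confirm that the existing mapping is the culprit, you can look at what the index's `_mapping` endpoint returns and list the fields that ended up as strings. The following is only a minimal sketch: the helper name `string_mapped_fields` and the sample mapping are made up for illustration, and the exact nesting of a real `_mapping` response depends on your Elasticsearch version, so you would pass in whichever `properties` object your response contains.

```python
import json


def string_mapped_fields(properties, prefix=""):
    """Return dotted names of fields whose mapping type is a string type."""
    found = []
    for name, spec in sorted(properties.items()):
        path = prefix + name
        # "string" is the type name in older Elasticsearch versions;
        # newer versions use "text" and "keyword" instead.
        if spec.get("type") in ("string", "text", "keyword"):
            found.append(path)
        # Object fields (such as geoip) nest their own "properties".
        found.extend(string_mapped_fields(spec.get("properties", {}), path + "."))
    return found


# Hypothetical excerpt of a mapping, for illustration only (not real output).
SAMPLE_MAPPING = json.loads("""
{
  "properties": {
    "time_duration": {"type": "string"},
    "time_request": {"type": "long"},
    "geoip": {
      "properties": {
        "coordinates": {"type": "double"},
        "city_name": {"type": "string"}
      }
    }
  }
}
""")

print(string_mapped_fields(SAMPLE_MAPPING["properties"]))
```

If a time_* field shows up in that list for today's index, documents indexed into it will keep the string mapping no matter what type Logstash sends, which matches the behavior described above.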