How to change “message” value in index

How can I change the following part of the CDN log in the “message” field, either in the logstash pipeline or in the index pattern, so that I can separate or extract some of the data and then aggregate it?

<40> 2022-01-17T08:31:22Z logserver-5 testcdn[1]: {"method":"GET","scheme":"https","domain":"www.123.com","uri":"/product/10809350","ip":"66.249.65.174","ua":"Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)","country":"US","asn":15169,"content_type":"text/html; charset=utf-8","status":200,"server_port":443,"bytes_sent":1892,"bytes_received":1371,"upstream_time":0.804,"cache":"MISS","request_id":"b017d78db4652036250148216b0a290c"}

Expected change:

{"method":"GET","scheme":"https","domain":"www.123.com","uri":"/product/10809350","ip":"66.249.65.174","ua":"Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)","country":"US","asn":15169,"content_type":"text/html; charset=utf-8","status":200,"server_port":443,"bytes_sent":1892,"bytes_received":1371,"upstream_time":0.804,"cache":"MISS","request_id":"b017d78db4652036250148216b0a290c"}

Because the “<40> 2022-01-17T08:31:22Z logserver-5 testcdn[1]:” part is not valid JSON and is not parsed, I can't create visualization dashboards based on fields such as country, ASN, and so on.

The raw log as indexed by logstash is:

{
  "_index": "logstash-2022.01.17-000001",
  "_type": "_doc",
  "_id": "Qx8pZ34BhloLEkDviGxe",
  "_version": 1,
  "_score": 1,
  "_source": {
    "message": "<40> 2022-01-17T08:31:22Z logserver-5 testcdn[1]: {\"method\":\"GET\",\"scheme\":\"https\",\"domain\":\"www.123.com\",\"uri\":\"/product/10809350\",\"ip\":\"66.249.65.174\",\"ua\":\"Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\",\"country\":\"US\",\"asn\":15169,\"content_type\":\"text/html; charset=utf-8\",\"status\":200,\"server_port\":443,\"bytes_sent\":1892,\"bytes_received\":1371,\"upstream_time\":0.804,\"cache\":\"MISS\",\"request_id\":\"b017d78db4652036250148216b0a290c\"}",
    "port": 39278,
    "@timestamp": "2022-01-17T08:31:22.100Z",
    "@version": "1",
    "host": "93.115.150.121"
  },
  "fields": {
    "@timestamp": [
      "2022-01-17T08:31:22.100Z"
    ],
    "port": [
      39278
    ],
    "@version": [
      "1"
    ],
    "host": [
      "93.115.150.121"
    ],
    "message": [
      "<40> 2022-01-17T08:31:22Z logserver-5 testcdn[1]: {\"method\":\"GET\",\"scheme\":\"https\",\"domain\":\"www.123.com\",\"uri\":\"/product/10809350\",\"ip\":\"66.249.65.174\",\"ua\":\"Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\",\"country\":\"US\",\"asn\":15169,\"content_type\":\"text/html; charset=utf-8\",\"status\":200,\"server_port\":443,\"bytes_sent\":1892,\"bytes_received\":1371,\"upstream_time\":0.804,\"cache\":\"MISS\",\"request_id\":\"b017d78db4652036250148216b0a290c\"}"
    ],
    "host.keyword": [
      "93.115.150.121"
    ]
  }
}

Thanks

Add the following to the filter section of your logstash configuration:

#To parse the message field
grok {
    match => { "message" => "<%{NONNEGINT:syslog_pri}>\s+%{TIMESTAMP_ISO8601:syslog_timestamp}\s+%{DATA:sys_host}\s+%{NOTSPACE:sys_module}\s+%{GREEDYDATA:syslog_message}"}
}
#To replace message field with syslog_message
mutate {
    replace => [ "message", "%{syslog_message}" ]
}

Once the message field has been replaced with syslog_message, you can add the json filter below to parse the JSON into separate fields:

json {
    source => "syslog_message"
}
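
Putting the snippets above together, a minimal sketch of the complete filter section could look like the following. The final mutate with remove_field is an optional cleanup step that is not part of the original suggestion; it simply drops the temporary syslog_* fields once the JSON has been expanded:

filter {
  # Split the syslog prefix from the JSON payload
  grok {
    match => { "message" => "<%{NONNEGINT:syslog_pri}>\s+%{TIMESTAMP_ISO8601:syslog_timestamp}\s+%{DATA:sys_host}\s+%{NOTSPACE:sys_module}\s+%{GREEDYDATA:syslog_message}" }
  }
  # Keep only the JSON payload in the message field
  mutate {
    replace => [ "message", "%{syslog_message}" ]
  }
  # Expand the JSON payload into top-level fields (method, country, asn, ...)
  json {
    source => "syslog_message"
  }
  # Optional cleanup (not in the original answer): drop the intermediate fields
  mutate {
    remove_field => [ "syslog_pri", "syslog_timestamp", "sys_host", "sys_module", "syslog_message" ]
  }
}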

Thanks, that was helpful. Your suggestion for this specific scenario gave me an idea, and the following edited logstash.conf solved the problem:

input {
  tcp {
    port => 5000
    codec => json
  }
}

filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:Junk}: %{GREEDYDATA:request}" }
  }
  json { source => "request" }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    manage_template => false
    ecs_compatibility => disabled
    index => "logs-%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}

But my main concern is editing the configuration file. I would prefer to make any changes in the Kibana web UI rather than in logstash.conf, because we use ELK for different scenarios in our organization, and changes like this turn the ELK server into a single-purpose setup instead of a multi-purpose one. How can I get the same result without changing the logstash configuration file?
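
One possible direction, offered only as a rough sketch rather than a confirmed solution: the same parsing can be moved out of logstash.conf and into an Elasticsearch ingest pipeline, which can be created and managed from Kibana (Stack Management → Ingest Pipelines, or the Dev Tools console). The pipeline name cdn-log-parse below is made up for illustration; the grok pattern is the one from the answer above:

PUT _ingest/pipeline/cdn-log-parse
{
  "description": "Sketch: strip the syslog prefix and parse the JSON payload of CDN logs",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["<%{NONNEGINT:syslog_pri}>\\s+%{TIMESTAMP_ISO8601:syslog_timestamp}\\s+%{DATA:sys_host}\\s+%{NOTSPACE:sys_module}\\s+%{GREEDYDATA:syslog_message}"]
      }
    },
    {
      "json": {
        "field": "syslog_message",
        "add_to_root": true
      }
    },
    {
      "remove": {
        "field": ["syslog_pri", "syslog_timestamp", "sys_host", "sys_module", "syslog_message"],
        "ignore_missing": true
      }
    }
  ]
}

If this pipeline is then set as index.default_pipeline on the target index, incoming documents are parsed at ingest time, and the Logstash configuration can stay generic and be reused for other scenarios.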