Format Elasticsearch result

I am using Elasticsearch to retrieve some logs:

http://localhost:9200/collection/_search?q=type:"log"

It returns hits like this one:

    {
        "_index": "collection",
        "_type": "doc",
        "_id": "UL878GMBYKUUOvfyQJWl",
        "_score": 6.487114,
        "_source": {
            "@version": "1",
            "type": "log",
            "message": "64.242.88.10;[07/Mar/2004:16:11:58 -0800];\"GET /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1\"; 200 7352\r",
            "@timestamp": "2018-06-11T19:03:23.163Z",
            "host": "logstash",
            "path": "/opt/access_log.log"
        }
    }

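Each hit has a "message" field that holds one line of the CSV file "access_log.log".

The problem is that all the useful information sits inside "message" as one big string. So I need some way to extract parts of it, for example to identify the server IP (64.242.88.10).

How can I split this "message" string on ";" (used as the delimiter pattern) so that I get only the data I need?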

You can use the grok filter plugin.

Grok is a great way to parse unstructured log data into something structured and queryable.

This tool is perfect for syslog logs, apache and other webserver logs, mysql logs, and in general, any log format that is generally written for humans and not computer consumption.

Logstash ships with about 120 patterns by default. You can find them here: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns. You can add your own trivially. (See the patterns_dir setting)
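To make the idea of named patterns concrete, here is a minimal Python sketch of what a grok match does conceptually: a regular expression with named groups turns the sample "message" into separate fields. The field names (`client_ip`, `timestamp`, `request`, `status`, `bytes`) and the function `parse_message` are my own choices for illustration, and the grok expression in the comment is an untested assumption about how the same line might be matched in Logstash.

```python
import re

# Named groups play the role of grok's %{PATTERN:field} captures.
# A roughly corresponding grok expression (untested assumption) might be:
#   %{IP:client_ip};\[%{HTTPDATE:timestamp}\];"%{DATA:request}"; %{NUMBER:status} %{NUMBER:bytes}
LOG_LINE = re.compile(
    r'(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3});'   # dotted IPv4 address
    r'\[(?P<timestamp>[^\]]+)\];'                # text between [ and ]
    r'"(?P<request>[^"]*)";\s*'                  # quoted request line
    r'(?P<status>\d{3})\s+(?P<bytes>\d+)'        # status code and size
)


def parse_message(message):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LOG_LINE.match(message.strip())
    return match.groupdict() if match else None


# The "message" value from the hit above, after JSON decoding.
sample = ('64.242.88.10;[07/Mar/2004:16:11:58 -0800];'
          '"GET /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1"; 200 7352\r')
fields = parse_message(sample)
print(fields["client_ip"])
```

In Logstash the equivalent match would run at ingest time, so the extracted fields end up indexed in Elasticsearch and can be queried directly instead of being parsed after every search.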

If you need help building patterns to match your logs, you will find the http://grokdebug.herokuapp.com and http://grokconstructor.appspot.com/ applications quite useful!
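If changing the Logstash pipeline is not an option, the literal split on ";" that the question asks about can also be done on the client after the search returns. The following is only a sketch under that assumption: `split_message` is a hypothetical helper, and the `hit` dict is a trimmed copy of the hit shown in the question rather than a live response.

```python
def split_message(hit):
    """Split a hit's message on ';' and strip whitespace and surrounding quotes."""
    raw = hit["_source"]["message"]
    return [part.strip().strip('"') for part in raw.strip().split(";")]


# Trimmed copy of the hit from the question (only the field used here).
hit = {
    "_source": {
        "message": '64.242.88.10;[07/Mar/2004:16:11:58 -0800];'
                   '"GET /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1"; 200 7352\r'
    }
}

parts = split_message(hit)
server_ip = parts[0]
print(server_ip)
```

Note that a plain split breaks if a request path ever contains ";" itself, and the extracted values are not searchable in Elasticsearch, which is why parsing at ingest time with grok is usually the more robust route.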