Logstash and nested JSON from Monolog: why are arrays converted to JSON strings?

I'm using PHP with Monolog. I write the logs out to a JSON file and also ship them via GELF to Logstash, which then sends them on to Elasticsearch.

The problem I'm having is that the extra object is missing in Kibana, and the tags field is interpreted as a string rather than as a nested object.

Any idea how to convince Logstash/Kibana to parse the inner JSON fields as fields/objects instead of JSON strings?

This is what it looks like in Kibana:

{
   "_index":"logstash-2018.08.30",
   "_type":"doc",
   "_id":"TtHbiWUBc7g5w1yM8X6f",
   "_version":1,
   "_score":null,
   "_source":{
      "ctxt_task":"taskName",
      "@version":"1",
      "http_method":"GET",
      "user_agent":"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0",
      "level":6,
      "message":"Finished task",
      "tags":"{\"hostname\":\"28571f0dc7e1\",\"region\":\"eu-west-1\",\"environment\":\"local\",\"processUniqueId\":\"5b87a4d843c20\"}",
      "url":"/assets/Logo.jpg",
      "ctxt_controller":"ControllerName",
      "memory_usage":"4 MB",
      "referrer":"https://local.project.net/account/login",
      "facility":"logger",
      "memory_peak_usage":"4 MB",
      "ctxt_timeElapsed":0.05187487602233887,
      "@timestamp":"2018-08-30T08:03:37.386Z",
      "ip":"172.18.0.1",
      "ctxt_start":1535616217.33417,
      "type":"gelf",
      "host":"18571f0dc7e9",
      "source_host":"172.18.0.8",
      "server":"local.project.net",
      "ctxt_end":1535616217.386045,
      "version":"1.0"
   },
   "fields":{
      "@timestamp":[
         "2018-08-30T08:03:37.386Z"
      ]
   },
   "sort":[
      1535616217386
   ]
}

My log entries look like this:

{
   "message":"Finished task",
   "context":{
      "controller":"ControllerName",
      "task":"taskName",
      "timeElapsed":0.02964186668395996,
      "start":1535614742.840069,
      "end":1535614742.869711,
      "content":""
   },
   "level":200,
   "level_name":"INFO",
   "channel":"logger",
   "datetime":{
      "date":"2018-08-30 08:39:02.869850",
      "timezone_type":3,
      "timezone":"Europe/London"
   },
   "extra":{
      "memory_usage":"14 MB",
      "memory_peak_usage":"14 MB",
      "tags":{
         "hostname":"28571f0dc7e1",
         "region":"eu-west-1",
         "environment":"local",
         "processUniqueId":"5b879f16be3f1"
      }
   }
}

My Logstash config:

input {
    tcp {
        port => 5000
    }
    gelf {
        port => 12201
        type => gelf
        codec => "json"
    }
}

output {
    elasticsearch {
        hosts => "172.17.0.1:9201"
    }
}
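One way to work around this on the Logstash side (an untested sketch using the stock json filter plugin) would be to decode the stringified field before the event reaches Elasticsearch:

filter {
    json {
        # parse the JSON string in the "tags" field ...
        source => "tags"
        # ... and put the resulting object back into "tags"
        target => "tags"
    }
}

With source and target both pointing at tags, the parsed object replaces the JSON string in the event.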

My Monolog config:

$gelfTransport = new \Gelf\Transport\UdpTransport(LOG_GELF_HOST, LOG_GELF_PORT);
$gelfPublisher = new \Gelf\Publisher($gelfTransport);
$gelfHandler = new \Monolog\Handler\GelfHandler($gelfPublisher, static::$logVerbosity);
$gelfHandler->setFormatter(new \Monolog\Formatter\GelfMessageFormatter());

// This is to prevent the application from failing if `GelfHandler` fails for some reason
$ignoreErrorHandlers = new \Monolog\Handler\WhatFailureGroupHandler([
    $gelfHandler
]);
$logger->pushHandler($ignoreErrorHandlers);

Edit: so far my finding is that this is caused by GelfMessageFormatter converting arrays to JSON:

$val = is_scalar($val) || null === $val ? $val : $this->toJson($val);
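Since GELF additional fields are essentially flat key/value pairs of scalars, one conceivable workaround (a hypothetical, untested sketch assuming Monolog 1.x signatures; FlatteningGelfMessageFormatter is my own name) would be to flatten nested arrays into dot-notation keys before the formatter sees them, so it never reaches the toJson() branch:

// Hypothetical sketch: flatten nested context/extra arrays into dot-notation
// keys so GelfMessageFormatter only ever sees scalar values.
class FlatteningGelfMessageFormatter extends \Monolog\Formatter\GelfMessageFormatter
{
    public function format(array $record)
    {
        $record['context'] = $this->flatten($record['context'] ?? []);
        $record['extra']   = $this->flatten($record['extra'] ?? []);
        return parent::format($record);
    }

    // Turns ['tags' => ['region' => 'eu-west-1']] into ['tags.region' => 'eu-west-1']
    private function flatten(array $data, $prefix = '')
    {
        $flat = [];
        foreach ($data as $key => $value) {
            $name = $prefix === '' ? $key : $prefix . '.' . $key;
            if (is_array($value)) {
                $flat += $this->flatten($value, $name);
            } else {
                $flat[$name] = $value;
            }
        }
        return $flat;
    }
}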

When nested JSON is sent in via netcat, e.g.:

echo -n '{
"field": 1,
"nestedField1": {"nf1": 1.1, "nf2": 1.2, "2nestedfield":{"2nf1":1.11, "2nf2":1.12}}
}' | gzip -c | nc -u -w1 bomcheck-logstash 12201

then the data looks fine in Kibana.

GELF does not seem to support nested data structures out of the box, so I decided to use the native Logstash UDP input plugin instead:

input {
    udp {
        port => 12514
        codec => "json"
    }
}

together with Monolog's LogstashFormatter:

$connectionString = sprintf("udp://%s:%s", LOG_UDP_LOGSTASH_HOST, LOG_UDP_LOGSTASH_PORT);
$handler = new \Monolog\Handler\SocketHandler($connectionString);
$handler->setFormatter(new \Monolog\Formatter\LogstashFormatter('project', null, null, 'ctxt_', \Monolog\Formatter\LogstashFormatter::V1));
$logger->pushHandler($handler);
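For reference, a quick way to verify the setup (a hypothetical sketch; the field names here are just examples) is to log a record with nested context and confirm it arrives as an object:

// Emit a test record with a nested array in the context
$logger->info('Finished task', [
    'controller' => 'ControllerName',
    'tags' => [
        'hostname' => gethostname(),
        'region'   => 'eu-west-1',
    ],
]);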

The nested data now ends up correctly structured in Kibana.