Process multi-level nested escaped JSON strings inside JSON with fluentd
I'm new to fluentd, and I want to parse multi-level nested escaped JSON strings inside JSON.
My messages look like this:
{"log":"HELLO WORLD\n","stream":"stdout","time":"2019-05-23T15:40:54.298531098Z"}
{"log":"{\"appName\":\"adapter\",\"time\":\"2019-05-23T15:40:54.299\",\"message\":\"{\\"level\\":\\"info\\",\\"message\\":\\"Awaiting Messages from queue...\\"}\"}\n","stream":"stdout","time":"2019-05-23T15:40:54.2996761Z"}
The first message is parsed correctly, but the second one is ignored; I suspect this is because of a parsing format error.
Here is my source:
<source>
  @id fluentd-containers.log
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/containers.log.pos
  tag raw.kubernetes.*
  read_from_head true
  <parse>
    @type multi_format
    <pattern>
      format json
      time_key time
      time_format %Y-%m-%dT%H:%M:%S.%NZ
    </pattern>
    <pattern>
      format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
      time_format %Y-%m-%dT%H:%M:%S.%N%:z
    </pattern>
  </parse>
</source>
Here is what I have tried:
<filter **>
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  hash_value_field parsed_log
  <parse>
    @type json
  </parse>
</filter>
What I actually want is just to parse this log message:
{
  "log":"{\"appName\":\"dedge-adapter\",\"time\":\"2019-05-24T02:39:12.242\",\"message\":\"{\\"level\\":\\"warn\\",\\"status\\":401,\\"method\\":\\"GET\\",\\"path\\":\\"/api/v1/bookings\\",\\"requestId\\":\\"782a470b-9d62-43d3-9865-1b67397717d4\\",\\"ip\\":\\"90.79.204.18\\",\\"latency\\":0.097897,\\"user-agent\\":\\"PostmanRuntime/7.11.0\\",\\"message\\":\\"Request\\"}\"}\n",
  "stream":"stdout",
  "time":"2019-05-24T02:39:12.242383376Z"
}
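To see why this message needs more than one parsing pass, here is a minimal Python sketch of the nesting (sample values taken from the second log line above): the container runtime wraps the app log in JSON, and the app wraps its own payload in JSON again, so each layer is an escaped JSON string that has to be decoded separately.

```python
import json

# Rebuild a docker-style log line whose "log" field is itself a JSON
# document, whose "message" field is in turn JSON-encoded (3 levels deep).
outer_line = json.dumps({
    "log": json.dumps({
        "appName": "adapter",
        "time": "2019-05-23T15:40:54.299",
        "message": json.dumps({"level": "info",
                               "message": "Awaiting Messages from queue..."}),
    }) + "\n",
    "stream": "stdout",
    "time": "2019-05-23T15:40:54.2996761Z",
})

outer = json.loads(outer_line)          # level 1: the container wrapper
inner = json.loads(outer["log"])        # level 2: the application record
payload = json.loads(inner["message"])  # level 3: the escaped payload
print(payload["message"])               # -> Awaiting Messages from queue...
```

A single `@type json` parse only peels off level 1, which is why the nested fields never appear as top-level keys.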
Do you have log fields in multiple formats? If so, you can use https://github.com/repeatedly/fluent-plugin-multi-format-parser:
<source>
  @type dummy
  tag dummy
  dummy [
    {"log":"HELLO WORLD\n","stream":"stdout","time":"2019-05-23T15:40:54.298531098Z"},
    {"log":"{\"appName\":\"adapter\",\"time\":\"2019-05-23T15:40:54.299\",\"message\":\"{\\"level\\":\\"info\\",\\"message\\":\\"Awaiting Messages from queue...\\"}\"}\n","stream":"stdout","time":"2019-05-23T15:40:54.2996761Z"}
  ]
</source>

<filter dummy>
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  <parse>
    @type multi_format
    <pattern>
      format json
    </pattern>
    <pattern>
      format none
    </pattern>
  </parse>
</filter>

<filter dummy>
  @type parser
  key_name message
  reserve_data true
  remove_key_name_field true
  <parse>
    @type multi_format
    <pattern>
      format json
    </pattern>
    <pattern>
      format none
    </pattern>
  </parse>
</filter>

<match dummy>
  @type stdout
</match>
Output:
2019-06-03 11:41:13.022468253 +0900 dummy: {"stream":"stdout","time":"2019-05-23T15:40:54.298531098Z","message":"HELLO WORLD\n"}
2019-06-03 11:41:14.024253824 +0900 dummy: {"stream":"stdout","time":"2019-05-23T15:40:54.2996761Z","appName":"adapter","level":"info","message":"Awaiting Messages from queue..."}
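The two chained filters each unwrap one level. A rough Python analogue (the `parse_filter` helper is hypothetical, just to illustrate the semantics of `reserve_data true` and `remove_key_name_field true` combined with the `json`/`none` pattern pair) might look like:

```python
import json

def parse_filter(record, key_name):
    """Hypothetical analogue of one parser <filter> pass: try the `json`
    pattern first, fall back to the `none` pattern on failure."""
    if key_name not in record:
        return record
    value = record.pop(key_name)        # remove_key_name_field true
    try:
        parsed = json.loads(value)
        if not isinstance(parsed, dict):
            raise ValueError(value)
    except ValueError:
        parsed = {"message": value}     # `format none` keeps the raw string
    record.update(parsed)               # reserve_data true keeps other keys
    return record

records = [
    {"log": "HELLO WORLD\n", "stream": "stdout"},
    {"log": '{"appName":"adapter","message":"{\\"level\\":\\"info\\",\\"message\\":\\"Awaiting Messages from queue...\\"}"}',
     "stream": "stdout"},
]
for record in records:
    record = parse_filter(record, "log")      # first <filter>: key_name log
    record = parse_filter(record, "message")  # second <filter>: key_name message
    print(record)
```

Each pass merges the parsed keys into the record, so after the second pass the innermost `level` and `message` fields sit at the top level, matching the stdout output above.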