grok filter fails for ISO8601 timestamps since 5.2
Since upgrading our ELK stack from 5.0.2 to 5.2, our grok filter fails and I have no idea why. Maybe I overlooked something in the changelogs?
Filter:
filter {
  if [type] == "nginx_access" {
    grok {
      match => { "message" => "%{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[%{TIMESTAMP_ISO8601:timestamp}\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}\" %{INT:status} %{INT:body_bytes_sent} %{QS:http_referer} %{QS:http_user_agent} \"%{DATA:host_uri}\" \"%{DATA:proxy}\" \"%{DATA:upstream_addr}\" \"%{WORD:cache_status}\" \[%{NUMBER:request_time}\] \[(?:%{NUMBER:proxy_response_time}|-)\]" }
      add_field => [ "received_at", "%{@timestamp}" ]
    }
    mutate {
      convert => {
        "proxy_response_time" => "float"
        "request_time" => "float"
        "body_bytes_sent" => "integer"
      }
    }
  }
}
Error:
Invalid format: \"2017-02-05T15:55:38+01:00\" is malformed at \"-02-05T15:55:38+01:00\"
Full error:
[2017-02-05T15:55:49,500][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-2017.02.05", :_type=>"nginx_access", :_routing=>nil}, 2017-02-05T14:55:38.000Z proxy2 4.3.2.1 - - [2017-02-05T15:55:38+01:00] "HEAD / HTTP/1.1" 200 0 "-" "Zabbix" "example.com" "host1:10040" "1.2.3.4:10040" "MISS" [0.095] [0.095]], :response=>{"index"=>{"_index"=>"filebeat-2017.02.05", "_type"=>"nginx_access", "_id"=>"AVoOxh7p5p68dsalXDFX", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2017-02-05T15:55:38+01:00\" is malformed at \"-02-05T15:55:38+01:00\""}}}}}
The whole thing works perfectly at http://grokconstructor.appspot.com, and TIMESTAMP_ISO8601 still seems the right choice (https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns).
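For reference, the mapping that Elasticsearch actually inferred for the timestamp field can be inspected directly (a sketch, assuming a local node on the default port and the index name from the error above):

curl -XGET 'localhost:9200/filebeat-2017.02.05/_mapping?pretty'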
Techstack
- Ubuntu 16.04
- Elasticsearch 5.2.0
- Logstash 5.2.0
- Filebeat 5.2.0
- Kibana 5.2.0
Any ideas?
Cheers,
Finn
Update
So this version works, for some reason:
filter {
  if [type] == "nginx_access" {
    grok {
      match => { "message" => "%{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[%{TIMESTAMP_ISO8601:timestamp}\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}\" %{INT:status} %{INT:body_bytes_sent} %{QS:http_referer} %{QS:http_user_agent} \"%{DATA:host_uri}\" \"%{DATA:proxy}\" \"%{DATA:upstream_addr}\" \"%{WORD:cache_status}\" \[%{NUMBER:request_time}\] \[(?:%{NUMBER:proxy_response_time}|-)\]" }
      add_field => [ "received_at", "%{@timestamp}" ]
    }
    date {
      match => [ "timestamp" , "yyyy-MM-dd'T'HH:mm:ssZ" ]
      target => "timestamp"
    }
    mutate {
      convert => {
        "proxy_response_time" => "float"
        "request_time" => "float"
        "body_bytes_sent" => "integer"
      }
    }
  }
}
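In case it helps anyone reproduce this, the date filter can be tested in isolation with a stdin pipeline; a minimal sketch (test-date.conf is a made-up file name, paths assume a default Logstash install):

# test-date.conf -- parse one raw timestamp per line and print the result
input { stdin { } }
filter {
  date {
    match => [ "message", "yyyy-MM-dd'T'HH:mm:ssZ" ]
    target => "timestamp"
  }
}
output { stdout { codec => rubydebug } }

Run it with:

echo "2017-02-05T15:55:38+01:00" | bin/logstash -f test-date.conf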
If anyone could shed some light on why I have to redefine a perfectly valid ISO8601 date, I'd be glad to know.
Make sure to specify the format of the timestamp in your documents; the mapping could look like this:
PUT index
{
  "mappings": {
    "your_index_type": {
      "properties": {
        "timestamp": {
          "type": "date",
          "format": "yyyy-MM-dd'T'HH:mm:ssZZ" <-- make sure to give the correct one
        }
      }
    }
  }
}
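Note that Filebeat creates a new index every day (filebeat-2017.02.05, filebeat-2017.02.06, ...), so a mapping PUT against a single index won't apply to tomorrow's index. An index template covers the whole pattern instead; a sketch using the 5.x template syntax (the template name is made up):

PUT _template/nginx_access_timestamp
{
  "template": "filebeat-*",
  "mappings": {
    "nginx_access": {
      "properties": {
        "timestamp": {
          "type": "date",
          "format": "yyyy-MM-dd'T'HH:mm:ssZZ"
        }
      }
    }
  }
}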
If you don't specify it correctly, Elasticsearch will expect the timestamp value in ISO format. Alternatively, you can do a date match for your timestamp field, which would look like this in your filter:
date {
  match => [ "timestamp" , "yyyy-MM-dd'T'HH:mm:ssZZ" ] <-- must match your actual timestamp format
  target => "timestamp"
  locale => "en"
  timezone => "UTC"
}
Or, if you prefer, you can add a new field, match it against the timestamp, and then remove the original field if you don't actually use it, since you'll have the timestamp in the new field. Hope this helps.
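A sketch of that last variant (the field name parsed_timestamp is made up):

date {
  match => [ "timestamp", "yyyy-MM-dd'T'HH:mm:ssZZ" ]
  target => "parsed_timestamp"    # parsed date lands in a new field
}
mutate {
  remove_field => [ "timestamp" ]    # drop the raw string once you no longer need it
}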