Logstash Grok custom URIPATHPARAM
How can I split URIPATHPARAM in a grok filter?
Here is my grok pattern:
filter {
grok {
match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int} (?:%{IP:backend_ip}:%{NUMBER:backend_port:int}|-) %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} (?:%{NUMBER:elb_status_code:int}|-) (?:%{NUMBER:backend_status_code:int}|-) %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int} \"(?:%{WORD:verb}|-) (?:%{GREEDYDATA:request}|-) (?:HTTP/%{NUMBER:httpversion}|-( )?)\" \"%{DATA:userAgent}\"( %{NOTSPACE:ssl_cipher} %{NOTSPACE:ssl_protocol})?"]
}
grok {
match => [ "request", "%{URIPROTO:http_protocol}://(?:%{USER:user}(?::[^@]*)?@)?(?:%{URIHOST:refhost})?(?:%{URIPATHPARAM:uri_param})?" ]
}
}
The values captured in uri_param are:
/a1/post/abcxyz/data/adfs/
/partner/uc/article/adafdf?adfaf
I want to capture the first three path segments of the URLs above in a separate field, for example:
/a1/post/abcxyz
/partner/uc/article
Use the following grok pattern on the uri_param field:
%{THREESTRINGS:newField}
The custom pattern for THREESTRINGS is:
THREESTRINGS \/\b\w+\b\/\b\w+\b\/\b\w+\b
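One way to wire this up, as a minimal sketch: it assumes a version of the grok filter that supports the pattern_definitions option. On older versions, the THREESTRINGS line can instead go into a file in a directory referenced by patterns_dir. The leading ^ anchor is an addition here so that only the start of the path is matched. Note that \w does not match hyphens or dots, so a segment such as my-post would not be captured by this pattern.

```
filter {
  grok {
    # Define the custom pattern inline (same regex as in the answer above).
    pattern_definitions => { "THREESTRINGS" => "\/\b\w+\b\/\b\w+\b\/\b\w+\b" }
    # Apply it to the field produced by the earlier grok, not to "message".
    match => { "uri_param" => "^%{THREESTRINGS:newField}" }
  }
}
```

For /a1/post/abcxyz/data/adfs/ this should leave /a1/post/abcxyz in newField, and the original uri_param is left unchanged.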
grok {
match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int} (?:%{IP:backend_ip}:%{NUMBER:backend_port:int}|-) %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} (?:%{NUMBER:elb_status_code:int}|-) (?:%{NUMBER:backend_status_code:int}|-) %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int} \"(?:%{WORD:verb}|-) (?:%{GREEDYDATA:request}|-) (?:HTTP/%{NUMBER:httpversion}|-( )?)\" \"%{DATA:userAgent}\"( %{NOTSPACE:ssl_cipher} %{NOTSPACE:ssl_protocol})?"]
}
grok {
match => [ "request", "%{URIPROTO:http_protocol}://(?:%{USER:user}(?::[^@]*)?@)?(?:%{URIHOST:refhost})?(?:%{URIPATHPARAM:uri_param})?" ]
}
if [uri_param] {
mutate {
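# Splitting "/a1/post/abcxyz/..." on "/" leaves an empty string at index 0,
# so the path segments start at index 1. uri_param becomes an array.
split => { "uri_param" => "/" }
# add_field is applied after the split has succeeded.
add_field => {
"uri_param_1" => "%{[uri_param][1]}"
"uri_param_2" => "%{[uri_param][2]}"
"uri_param_3" => "%{[uri_param][3]}"
}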
}
}
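If the original uri_param string should be kept and the first three segments are wanted in a single field (as in the question), a variation along these lines might work. This is only a sketch: the field names uri_parts and first_three are made up for illustration, and it assumes a mutate filter version that has the copy option. Separate mutate blocks are used because the operations inside one mutate run in a fixed order, not in the order they are written.

```
filter {
  if [uri_param] {
    # Work on a copy so uri_param itself stays a string.
    mutate { copy => { "uri_param" => "uri_parts" } }
    # "/a1/post/abcxyz/data/adfs/" becomes ["", "a1", "post", "abcxyz", "data", "adfs"].
    mutate { split => { "uri_parts" => "/" } }
    # Rebuild only the first three segments into one field.
    mutate {
      add_field => { "first_three" => "/%{[uri_parts][1]}/%{[uri_parts][2]}/%{[uri_parts][3]}" }
      remove_field => [ "uri_parts" ]
    }
  }
}
```

If a path has fewer than three segments, the unresolved %{...} references are left in the output as literal text, so a check on the array contents may be needed before relying on first_three.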
Alternatively, you can capture these three parameters in grok itself, for example:
grok {
match => [ "request", "%{URIPROTO:http_protocol}://(?:%{USER:user}(?::[^@]*)?@)?(?:%{URIHOST:refhost})?(?:/%{WORD:uri_param_1}/%{WORD:uri_param_2}/%{WORD:uri_param_3}/%{GREEDYDATA:other_params})?" ]
}
To join them back together as you require, you can simply use the mutate filter:
mutate {
add_field => { "uri_param" => "/%{[uri_param_1]}/%{[uri_param_2]}/%{[uri_param_3]}/%{[other_params]}"}
}
I hope this works; test it and let me know whether it works for you.