Logstash grok multiple match
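I'm trying to use grok to extract some fields from the msgbody field, but only the first field in the grok is extracted.
Fields of interest - corId, controller, httpStatusText and uri (these fields may not be present in every log event).
Sample data -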
2020-01-03 10:44:17,025 [93] ERROR MedServFileLogger corId=cf25b00d-1e37-4eb7-ab75-82ceeec7fdab - Exception controller= Loan action= Getmethod= GET uri= http://xxxxxxxxxx/v2/media/instance/xxxx/loans/cdb79433-32fa-4df8-b73a-e87aa89f2007/files/images-178ee8d0-fa48-4b9f-a8df-abcc9cfb1ac7.zip/entries/0b3e99f8-8af8-49a5-95b1-1537c715eb43.png?tokencreator=Encompass&tokenexpires=1578076775&token=pLCvT%2F1pBPhuFXiHKDIlB5F9feocqeq7Wxx%2FyhAz7B6DCcKeOP3YjO%2FnalfjTgXdieAmyFHEiW72Soym14oBuw%3D%3D
System.UnauthorizedAccessException: MediaTokenInvalid - A valid Token must be provided for accessing Media
2020-01-03 03:58:12,822 [37] ERROR MedServFileLogger corId=5aa9b90b-9fe6-4700-aa8f-c08be2d3f0ea - Returning controller= Health action= Getmethod= GET uri= http://localhost/v2/media/healthhttpStatusCode=503 httpStatusText=ServiceUnavailable
2020-01-03 14:13:33,987 [62] INFO MedServFileLogger corId=2aee7503-d01e-4251-a9c3-f5f22057744b - Entering MediaTokenUtility.ValidateToken
2020-01-03 14:13:33,987 [62] INFO EDMMediaServiceFileLogger corId=2aee7503-d01e-4251-a9c3-f5f22057744b - Entering controller= Vault action= GetFilemethod= GET uri= http://xxxxxxx.com/v2/media/vault/files/42793a9e-5123-42ae-811d-fa57b488af27?tokencreator=EDel&tokenexpires=1578352413&token=KZIc14KV0gSDlwxXhekUbMEPks3eBcc8hAHAXd7gpm8%2bDlDayNf3Vqu%2fA%2broswY3O5aOLkpq5u7fErOqtQccMA%3d%3d
Logstash filter -
filter {
  if [project] == "media_server"
  {
    grok {
      match => [ "message", "(?m)%{TIMESTAMP_ISO8601:logtime} \[(?<threadid>[\d.]+)\] +%{LOGLEVEL:loglevel} %{GREEDYDATA:msgbody}" ]
    }
    grok {
      match => {
        break_on_match => "false"
        "msgbody" => [ "corId=%{UUID:corId}", "controller=%{SPACE}%{WORD:controller}", "httpStatusText=%{WORD:httpStatusText}", "uri=%{SPACE}%{URI:uri}" ]
      }
    }
    date {
      locale => "en"
      match => ["logtime", "YYYY-MM-dd HH:mm:ss,SSS"]
      timezone => "America/Los_Angeles"
      target => "@timestamp"
    }
    mutate
    {
      remove_field => [ "msgbody" ]
    }
  }
}
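With the above configuration, only the corId field is extracted; all the other fields are dropped. I don't see any parse errors/failures in the logstash log.
I was following this -
https://www.elastic.co/guide/en/logstash/5.5/plugins-filters-grok.html#plugins-filters-grok-match
Any help or guidance is appreciated.
Thanks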
Got it working by including the following "patterns_dir" and "break_on_match" settings inside the grok filter, before the "match" section.
patterns_dir => "/etc/logstash/patterns"
break_on_match => "false"
Working filter -
filter {
  if [project] == "media_server"
  {
    grok {
      match => [ "message", "(?m)%{TIMESTAMP_ISO8601:logtime} \[(?<threadid>[\d.]+)\] +%{LOGLEVEL:loglevel} %{GREEDYDATA:msgbody}" ]
    }
    grok {
      patterns_dir => "/etc/logstash/patterns"
      break_on_match => "false"
      match => {
        "msgbody" => [ "corId=%{UUID:corId}", "controller=%{SPACE}%{WORD:controller}", "httpStatusText=%{WORD:httpStatusText}", "uri=%{SPACE}%{URI:uri}" ]
      }
    }
    date {
      locale => "en"
      match => ["logtime", "YYYY-MM-dd HH:mm:ss,SSS"]
      timezone => "America/Los_Angeles"
      target => "@timestamp"
    }
    mutate
    {
      remove_field => [ "msgbody" ]
    }
  }
}
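For anyone hitting the same thing: break_on_match is an option of the grok filter itself, not a key of the match hash. Placed inside match, it is most likely read as just another field/pattern pair (a field named break_on_match that no event has), so the real option stays at its default of true and grok stops after the first pattern that matches, which is corId here. The sketch below is a stripped-down pipeline for checking this in isolation. It is a test harness under assumptions, not the production config: it reads lines from stdin, applies the four patterns directly to message instead of msgbody, leaves out patterns_dir on the assumption that only stock patterns (UUID, WORD, SPACE, URI) are involved, and the file name test.conf is made up.

```
# test.conf (hypothetical file name) - minimal pipeline for trying the patterns by hand
input {
  # Paste sample log lines into the terminal
  stdin { }
}

filter {
  grok {
    # Filter-level option: keep trying the remaining patterns after one matches
    break_on_match => false
    match => {
      # Applied to "message" here only because this test has no earlier grok
      # stage that would populate "msgbody"
      "message" => [
        "corId=%{UUID:corId}",
        "controller=%{SPACE}%{WORD:controller}",
        "httpStatusText=%{WORD:httpStatusText}",
        "uri=%{SPACE}%{URI:uri}"
      ]
    }
  }
}

output {
  # Print every event with all of its extracted fields
  stdout { codec => rubydebug }
}
```

Run it with bin/logstash -f test.conf, paste one of the sample lines, and check which of the four fields show up in the rubydebug output. Flipping break_on_match back to true should reproduce the original behaviour, with only corId being extracted.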