grok filter (regex) to extract string within square brackets
My application log entries look like this:
2015-06-24 14:03:16.7288 Sent request message [649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74] <Request>sometext</Request>
2015-06-24 14:38:05.2460 Received response message [649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74] <Response>sometext</Response>
I am using the Logstash grok filter to extract the XML content and the client token that appears in square brackets.
grok {
  match => ["message", "(?<content>(<Request(.)*?</Request>))"]
  match => ["message", "(?<clienttoken>(Sent request message \[(.)*?\]))"]
  add_tag => "Request"
  break_on_match => false
  tag_on_failure => [ ]
}
grok {
  match => ["message", "(?<content>(<Response(.)*?</Response>))"]
  match => ["message", "(?<clienttoken>(Received response message \[(.)*?\]))"]
  add_tag => "Response"
  break_on_match => false
  tag_on_failure => [ ]
}
The results currently look like this.
For the first log line:
Content = <Request>sometext</Request>
clienttoken = Sent request message [649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74]
For the second log line:
Content = <Response>sometext</Response>
clienttoken = Received response message [649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74]
But I want the result to look like this:
Content = <Request>sometext</Request>
clienttoken = 649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74
How can I extract only the string inside the square brackets, rather than the whole string matched by the pattern?
You can use lookbehind and lookahead assertions, so that the surrounding text must be present but is not included in the match:
(?<=Sent request message \[).*?(?=\])
The same applies to the response message: use Received response message \[ as the lookbehind prefix.
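As a rough sketch of how this might fit into the request filter from the question, the lookaround expression can be wrapped in the named capture so that only the token ends up in clienttoken. This assumes a Logstash version that accepts the hash form of match with an array of patterns, and it has not been run against the original pipeline. The second, commented-out pattern is an alternative that avoids lookarounds by placing the capture group inside the brackets.

```
filter {
  grok {
    # First pattern captures the XML body, second captures the token.
    # The lookbehind and lookahead require the prefix and the closing
    # bracket to be present without making them part of clienttoken.
    match => {
      "message" => [
        "(?<content><Request.*?</Request>)",
        "(?<clienttoken>(?<=Sent request message \[).*?(?=\]))"
      ]
    }
    # Alternative without lookarounds (assumption: equivalent for these logs):
    #   "Sent request message \[(?<clienttoken>[^\]]+)\]"
    add_tag => ["Request"]
    break_on_match => false
    tag_on_failure => []
  }
}
```

For the first sample log line, this should yield clienttoken = 649b85fa-bfa0-4cb4-8c38-1aeacd1cbf74. The response filter would follow the same shape with the Response tag names and the Received response message prefix.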