Differentiate between number and alphabet in grok filter

I have two log lines like these:

[2020-04-01][14:57:31]E: Step 8/13: Main workflow (Python) (8m:48s)
[2020-04-01][15:14:02]W: Cannot find Latest build with tag: 'ArtifactSizeBaseline' to calculate metric 'total artifacts size'.

and a match string like this:

%{DATE:EventDate}\]\[%{TIME:EventTime}\](\s+)?%{WORD:Loglevel}:(\s+)?%{DATA:Step}:(\s+)%{GREEDYDATA:EventMessage}

The output for the first line should look like this:

{'EventDate':'2020-04-01', 'EventTime':'14:57:31', 'LogLevel':'E', 'Step':'Step 8/13', 'EventMessage':'Main workflow (Python) (8m:48s)'}

The second log line does not contain a step, so ideally its output should look like this:

{'EventDate':'2020-04-01', 'EventTime':'15:14:02', 'LogLevel':'W', 'Step':'', 'EventMessage':'Cannot find Latest build with tag: 'ArtifactSizeBaseline' to calculate metric 'total artifacts size'.'}

But this is what I get instead:

{'EventDate':'2020-04-01', 'EventTime':'15:14:02', 'LogLevel':'W', 'Step':'Cannot find Latest build with tag: ', 'EventMessage':''ArtifactSizeBaseline' to calculate metric 'total artifacts size'.'}

Is there a way to make the match string distinguish between these two log lines?

This regular expression matches both lines:

%{DATE:EventDate}\]\[%{TIME:EventTime}\](\s+)?%{WORD:Loglevel}:\s+((?=Step\s\b)%{DATA:Step}:)?\s?%{GREEDYDATA:EventMessage}

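It wraps the Step extraction in an optional group guarded by a positive lookahead, (?=Step\s\b). The group is only attempted when the text at that position is the word "Step" followed by a whitespace character and then a word character, such as the digit in "Step 8/13". When the lookahead fails, the group is skipped and the rest of the line goes to EventMessage. Note that \b is a word boundary, so a letter after the space would also satisfy it; to require a digit specifically, \d can be used in place of \b.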
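To see the lookahead at work outside Logstash, here is a minimal Python sketch of the same idea. It is only an approximation: the DATE and TIME sub-patterns below are simplified stand-ins I made up for this example, not the real grok definitions, and the `parse` helper is hypothetical. The Step and EventMessage parts mirror the grok expression above.

```python
import re

# Simplified stand-ins for the grok patterns. The date and time pieces are
# assumptions for this sketch; DATA is modelled as .*? and GREEDYDATA as .*
LINE_RE = re.compile(
    r"(?P<EventDate>\d{4}-\d{2}-\d{2})\]\["
    r"(?P<EventTime>\d{2}:\d{2}:\d{2})\]"
    r"(?:\s+)?(?P<Loglevel>\w+):\s+"
    # Optional group: only attempted when "Step", whitespace and a word
    # character come next (the positive lookahead from the answer).
    r"(?:(?=Step\s\b)(?P<Step>.*?):)?"
    r"\s?(?P<EventMessage>.*)"
)


def parse(line):
    """Return the named groups for one log line, or None if nothing matches."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    lines = [
        "[2020-04-01][14:57:31]E: Step 8/13: Main workflow (Python) (8m:48s)",
        "[2020-04-01][15:14:02]W: Cannot find Latest build with tag: "
        "'ArtifactSizeBaseline' to calculate metric 'total artifacts size'.",
    ]
    for line in lines:
        print(parse(line))
```

In this sketch the Step group is left unset (None in Python) for the second line rather than being an empty string, because the optional group never takes part in the match.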

I tested it against both log lines on this site:

https://grokconstructor.appspot.com/do/match

Hope this helps.