Regex excluding a character from the end of a group

Question

I am trying to parse a syslog line:

pam_vas: Authentication <succeeded> for <active directory> user: <bobtheperson> account: <bobtheperson@com.com> reason: <N/A> Access cont(upn): <bob>

My goal is to split this data into key/value pairs. It needs to be a Perl regex (this happens to be going into Splunk for Solaris logs, in case anyone is curious what it is for).

So far, I have this:

[\>\:]*\s+(.*?)\<(.+?)\>

It does a good job of extracting my data, but any word that ends with a colon keeps that colon in the first group.

Expected results:

Authentication = succeeded
for = active directory
user = bobtheperson
account = bobtheperson@com.com
reason = N/A
Access cont(upn) = bob

Actual results (note the colons):

Authentication = succeeded
for = active directory
user: = bobtheperson
account: = bobtheperson@com.com
reason: = N/A
Access cont(upn): = bob

Link to the code on http://regexr.com/: http://regexr.com/3fasr

A lot of trial and error got me this far; I just can't figure out how to drop the trailing punctuation.

Answer 1

This regex seems to do what you want:

[\>\:]*\s+(.*?)\:?\s\<(.+?)\>
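The change from the question's pattern is the added `\:?\s` between the two groups: the optional colon and the following space are consumed outside the first capture group, so the lazy `(.*?)` stops before them. As a rough way to check this outside Splunk, here is a minimal sketch in Python. It assumes Python's `re` module treats this particular pattern the same way a Perl-compatible engine does; the names `PATTERN`, `LINE`, and `parse_pairs` are made up for illustration.

```python
import re

# Pattern from the answer, unchanged. The optional \:? sits outside the
# first capture group, so a trailing colon is matched but not captured.
PATTERN = re.compile(r"[\>\:]*\s+(.*?)\:?\s\<(.+?)\>")

# Sample log line from the question.
LINE = ("pam_vas: Authentication <succeeded> for <active directory> "
        "user: <bobtheperson> account: <bobtheperson@com.com> "
        "reason: <N/A> Access cont(upn): <bob>")


def parse_pairs(line):
    """Return a list of (key, value) tuples found in one log line."""
    return PATTERN.findall(line)


# Print each pair in the "key = value" form used in the question.
for key, value in parse_pairs(LINE):
    print(f"{key} = {value}")
```

On the sample line this should print the six pairs listed under the expected results, with no colons on the keys.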

