awk: match multiple regular expression strings and digits from a line

Question

I am trying to match several items in each line of an httpd log file. The lines look like this:

192.168.0.1 - - [06/Apr/2016:16:35:42 +0100] "-"  "100" "GET /breacher/gibborum.do?firstnumber=1238100121135&simple=1238100121135&protocol=http&_super=telco1 HTTP/1.1" 200 161 "-" "NING/1.0"
192.168.0.1 - - [06/Apr/2016:16:35:44 +0100] "-"  "00" "GET /breacher/gibborum.do?firstnumber=1237037630256&simple=1237037630256&protocol=http&_super=telco1 HTTP/1.1" 200 136 "-" "NING/1.0"
192.168.0.1 - - [06/Apr/2016:16:35:44 +0100] "-"  "00" "GET /breacher/gibborum.do?firstnumber=1238064400578&simple=1238064400578&protocol=http&_super=telco1 HTTP/1.1" 200 136 "-" "NING/1.0"

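I am trying to extract the number, the timestamp, and the value of the _super variable. So far I can extract the number and the timestamp with this:

 awk '{match($0, /123([0-9]+)/, arr); print $4, arr[0]}'

How can I extract the value at the end of the _super= variable?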

Answer 1

You can change your script like this (add gsub and $9, the field that holds the request URL in the sample lines):

 awk '{match($0, /123([0-9]+)/, arr); gsub(/.*_super=/, "", $9); print $4, arr[0], $9}'
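The three-argument form of match() is a GNU awk extension, so the approach above needs gawk. As a rough sketch only, the same extraction can be written with POSIX awk features (RSTART, RLENGTH, and sub). It assumes the default whitespace field splitting, under which the timestamp is $4 and the request URL is $9 in the sample lines; the variable names `line`, `result`, `num`, and `val` are illustrative, and the second sub() assumes that any parameter following _super is separated by `&`.

```shell
#!/bin/sh
# Sample line copied from the question.
line='192.168.0.1 - - [06/Apr/2016:16:35:42 +0100] "-"  "100" "GET /breacher/gibborum.do?firstnumber=1238100121135&simple=1238100121135&protocol=http&_super=telco1 HTTP/1.1" 200 161 "-" "NING/1.0"'

result=$(printf '%s\n' "$line" | awk '{
    # POSIX match() sets RSTART and RLENGTH; no array argument is needed.
    num = ""
    if (match($0, /123[0-9]+/))
        num = substr($0, RSTART, RLENGTH)

    # Work on a copy of the URL field so the record itself is not modified.
    val = $9
    sub(/.*_super=/, "", val)   # drop everything up to and including "_super="
    sub(/&.*/, "", val)         # drop later parameters, if any (assumed "&"-separated)

    print $4, num, val
}')

printf '%s\n' "$result"
```

For the sample line this prints `[06/Apr/2016:16:35:42 1238100121135 telco1`. To process a whole log, the awk program can be run directly against the file instead of a single line.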

linux

awk

gnu