Print section(s) of a file when the section's content matches a pattern
I have been using GoReplay to capture HTTP traffic.
I am now left with a text file in which each request is separated by a line containing '🐵🙈🙉':
1 10ef8cc77b962b383557265f5eb1922e5affa88e 1518086364760738000
HEAD /xyz/
Host: d.e.f
User-Agent: ...
...
Connection: Keep-Alive
1 3534a2e1d670c596a673a706c3031a6bec9d6b06 1518086364994132000
HEAD /abc/
Host: a.b.c
User-Agent: ...
...
Connection: Keep-Alive
1 06891fdbebd48cb23ffe6ed5964c3fadcceb9199 1518086366027862000
HEAD /abc/
Host: a.b.c
User-Agent: ...
...
Connection: Keep-Alive
From this file, I want to extract (print) only the requests that match a given header, Host: a.b.c:
1 3534a2e1d670c596a673a706c3031a6bec9d6b06 1518086364994132000
HEAD /abc/
Host: a.b.c
User-Agent: ...
...
Connection: Keep-Alive
1 06891fdbebd48cb23ffe6ed5964c3fadcceb9199 1518086366027862000
HEAD /abc/
Host: a.b.c
User-Agent: ...
...
Connection: Keep-Alive
Note: the input file may also contain binary data from POST requests (for example, Content-Type: image/png):
POST /...
Content-Length: 26892
-----------------------------19579713013480936471158807818
Content-Disposition: form-data; name="upload"; filename="__fileCreatedFromDataURI__.png"
Content-Type: image/png
<89>PNG
^Z
^@^@^@^MIHDR^@
...
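This may break processing...
Can this be done in a single pass with tools such as awk or sed? Or does it need a small script (in Python, for example)? I suppose I could split the input into multiple files, but that would produce far too many files.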
GNU awk approach:
awk 'BEGIN{ RS=ORS="🐵🙈🙉" } /Host: a\.b\.c/; END{ ORS=""; print }' file
Output:
1 3534a2e1d670c596a673a706c3031a6bec9d6b06 1518086364994132000
HEAD /abc/
Host: a.b.c
User-Agent: ...
...
Connection: Keep-Alive
1 06891fdbebd48cb23ffe6ed5964c3fadcceb9199 1518086366027862000
HEAD /abc/
Host: a.b.c
User-Agent: ...
...
Connection: Keep-Alive
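If the binary POST bodies turn out to be a problem for a text-oriented tool, the same filter could be sketched in Python working on raw bytes. This is only a minimal sketch under a few assumptions: the separator is GoReplay's '🐵🙈🙉' encoded as UTF-8, the header block of each record ends at the first blank line, and the header is matched as an exact, case-sensitive line. The function name filter_requests is made up for illustration.

```python
# Assumption: GoReplay's record separator, encoded as UTF-8 bytes.
SEPARATOR = "🐵🙈🙉".encode("utf-8")


def filter_requests(data: bytes, header: bytes, sep: bytes = SEPARATOR) -> bytes:
    """Return only the records whose header block contains the line `header`."""
    kept = []
    for record in data.split(sep):
        if not record.strip():
            continue  # skip the empty chunk left after a trailing separator
        # Inspect only the part before the first blank line, so that a
        # binary body which happens to contain the same text cannot match.
        normalized = record.replace(b"\r\n", b"\n").lstrip(b"\n")
        head = normalized.split(b"\n\n", 1)[0]
        lines = [line.strip() for line in head.split(b"\n")]
        if header in lines:
            kept.append(record)  # keep the original bytes untouched
    # Re-emit each kept record followed by the separator.
    return b"".join(chunk + sep for chunk in kept)
```

It could be used by reading the capture with open(path, "rb"), passing the bytes and b"Host: a.b.c" to filter_requests, and writing the result to sys.stdout.buffer. The sketch reads the whole file into memory, so a very large capture would need a streaming variant.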