Search text with line breaks recursively in a directory?
I have a lot of large log files that look like this:
DATETIME ["2015-03-03 21:52"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
DATETIME ["2015-03-03 21:55"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","EEE","TEST4"]
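I want to search for two keywords that are separated by a line break: one specific word in the GET line and one specific word in the POST line.
I need something like this: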
grep "GET(.*)AAA(.*)POST(.*)BBB"
What I am searching for: AAA (in the GET line) && BBB (in the POST line)
Expected result:
POST ["POST_JSON","BBB","TEST1"]
POST ["POST_JSON","BBB","TEST3"]
What is a simple way to do this?
grep is the command you are looking for:
grep -rHn "GET.*KEYWORD_A" -A1 /path/to/files | grep "POST.*KEYWORD_B"
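I would first grep for the lines that contain KEYWORD_A and print one extra line after each match (-A1), because POST comes directly after GET in the log files. Then search that output for KEYWORD_B.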
-r greps recursively in a directory
-H prints the file name
-n prints the line number
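The pipeline above can be tried on a copy of the sample log from the question. This is only a minimal sketch, assuming a grep that supports -r and -A (GNU or BSD grep). The temporary directory, the file name sample.log and the variable names are made up for the demo, and -h is used instead of -Hn so that the output contains just the matching lines.

```shell
#!/bin/sh
# Throwaway directory with one sample log (hypothetical names, demo only).
dir=$(mktemp -d)
cat > "$dir/sample.log" <<'EOF'
DATETIME ["2015-03-03 21:52"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
DATETIME ["2015-03-03 21:55"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","EEE","TEST4"]
EOF

# Step 1: every GET line containing AAA plus the line that follows it (-A1).
# Step 2: keep only the POST lines containing BBB from that output.
# -h drops the file-name prefix so the second grep sees the raw log lines.
result=$(grep -rh -A1 'GET.*AAA' "$dir" | grep 'POST.*BBB')
printf '%s\n' "$result"

rm -rf "$dir"
```

Swapping -h back to -Hn adds the file name and line number to each hit, which is usually what you want when searching many files.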
Use GNU awk for the third argument to match():
$ find . -type f |
xargs gawk -v RS= 'match($0,/\nGET.*AAA.*\n(POST.*BBB.*)/,a){print a[1]}'
POST ["POST_JSON","BBB","TEST1"]
POST ["POST_JSON","BBB","TEST3"]
If you really want a blank line between the output lines, add -v ORS='\n\n'.
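Note that an empty RS puts awk into paragraph mode, where records are split on blank lines, so the command above assumes the log entries are separated by empty lines. If they are not, a line-by-line variant might work instead. The following is a rough sketch in plain POSIX awk, not the gawk match() approach above; the flag name prev_get_hit, the temporary directory and the file name are invented for the example.

```shell
#!/bin/sh
# Throwaway directory with a shortened sample log (hypothetical names).
dir=$(mktemp -d)
cat > "$dir/sample.log" <<'EOF'
DATETIME ["2015-03-03 21:52"]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
DATETIME ["2015-03-03 21:55"]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","EEE","TEST4"]
EOF

# Print a POST line containing BBB only when the line directly before it
# was a GET line containing AAA. The flag is reset at the start of each file
# so a hit on the last line of one file cannot leak into the next one.
result=$(find "$dir" -type f -exec awk '
    FNR == 1                         { prev_get_hit = 0 }
    /^POST/ && /BBB/ && prev_get_hit { print }
                                     { prev_get_hit = (/^GET/ && /AAA/) }
' {} +)
printf '%s\n' "$result"

rm -rf "$dir"
```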
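I solved this with grep -P, which gives me the regular expressions I know from PHP, and in particular with -A to get the next n lines. Then I piped the result with "|" into a second grep -P to filter it.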
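The exact command was not posted, so the following is only a guess at what such a pipeline could look like. It assumes GNU grep built with PCRE support (-P is not available in every grep), and the patterns, the temporary directory and the file name are illustrative, not the ones actually used.

```shell
#!/bin/sh
# Throwaway directory with a shortened sample log (hypothetical names).
dir=$(mktemp -d)
cat > "$dir/sample.log" <<'EOF'
DATETIME ["2015-03-03 21:52"]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
DATETIME ["2015-03-03 21:55"]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","EEE","TEST4"]
EOF

# First grep -P: GET lines with the quoted keyword, plus the next line (-A1).
# Second grep -P: keep only POST lines with the second quoted keyword.
# \b is the PCRE word boundary, the same syntax as in PHP's preg_* functions.
result=$(grep -rhP -A1 '^GET\b.*"AAA"' "$dir" | grep -P '^POST\b.*"BBB"')
printf '%s\n' "$result"

rm -rf "$dir"
```

Raising the number after -A would let the second grep look further than one line past each GET match, at the cost of possibly picking up a POST line from a later entry.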