Using Grep with Xargs and Process Substitution
I am trying to pass grep search terms through xargs while supplying the file to search through process substitution.
command1 | xargs -I{} grep {} <(command2)
Creating the dummy files:
for f in {1..50}; do echo $f >> test50.txt; done
for f in {25..30}; do echo $f >> test5.txt; done
xargs and process substitution with grep:
cat test5.txt | xargs -I{} grep {} <(cat test50.txt)
The output is:
25
The expected output is:
25
26
27
28
29
30
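I think the problem lies in how grep receives the input file: it stops after one line, whereas I want it to search the whole input file for every term.
Consider this: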
cat test5.txt | xargs -I{} echo {} <(cat test50.txt)
which produces
25 /dev/fd/63
26 /dev/fd/63
27 /dev/fd/63
28 /dev/fd/63
29 /dev/fd/63
30 /dev/fd/63
And so this
cat test5.txt | xargs -I{} cat {} <(cat test50.txt)
produces
cat: 25: No such file or directory
1
2
--cut for brevity--
49
50
cat: 26: No such file or directory
cat: 27: No such file or directory
cat: 28: No such file or directory
cat: 29: No such file or directory
cat: 30: No such file or directory
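Your problem is not with grep but with process substitution in bash. The shell expands <(cat test50.txt) exactly once, before xargs starts, into a pipe exposed under a path such as /dev/fd/63 (a named pipe on systems without /dev/fd). All of the data in that pipe is consumed by the first invocation of the command given to xargs (grep in your example, cat in mine above; echo only prints the path). Every later invocation finds the pipe already drained, which is why only the first argument, 25, produces a match.
This will work: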
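To make the draining visible, here is a minimal sketch, assuming Bash and a system that exposes the pipe under /dev/fd. Each xargs child counts the lines it can still read from the shared path; the "sees ... lines" wording is only an illustrative label, not output of any tool.

```shell
#!/usr/bin/env bash
# <(seq 50) is expanded once, by the calling shell, into a single path
# (for example /dev/fd/63). xargs then hands that same path to every child.
# $1 is the search term from xargs, $2 is the process-substitution path.
result=$(
  printf '%s\n' 25 26 27 |
    xargs -I{} sh -c 'printf "%s sees %s lines\n" "$1" "$(grep -c "" "$2")"' _ {} <(seq 50)
)
printf '%s\n' "$result"
```

On a typical Linux system the first child should report all 50 lines and the remaining children should report 0, which mirrors the single 25 in the question's output.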
cat test5.txt | xargs -I{} bash -c " grep {} <(cat test50.txt)"
because each grep invocation runs in its own bash, which creates a "fresh" process substitution.
xargs is not needed, because grep can already take its search terms from a file:
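A variant of the same idea, sketched under the assumption that the search terms may contain characters that are special to the shell: the term is passed to the inner bash as a positional parameter instead of being spliced into the command string. The temporary directory and the `_` placeholder for `$0` are choices made for this example only.

```shell
#!/usr/bin/env bash
# Recreate the question's files in a scratch directory (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 50 > test50.txt
seq 25 30 > test5.txt

# The inner bash re-evaluates <(cat test50.txt) for every term, so each grep
# gets a fresh pipe. The term arrives as "$1" rather than as script text.
result=$(xargs -I{} bash -c 'grep -- "$1" <(cat test50.txt)' _ {} < test5.txt)
printf '%s\n' "$result"
```

Because the term is never parsed as shell code, a term such as `$(something)` would be searched for literally by the inner shell rather than executed.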
$ seq 50 > f1
$ seq 25 30 > f2
$ grep -Fxf f2 f1
25
26
27
28
29
30
From man grep:
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings (instead of regular expressions), separated by newlines,
any of which is to be matched.
-x, --line-regexp
Select only those matches that exactly match the whole line. For a regular expression pattern, this is
like parenthesizing the pattern and then surrounding it with ^ and $.
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. If this option is used multiple times or is combined with the -e (--regexp) option, search for all patterns given. The empty file contains zero patterns, and therefore matches nothing.
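The same options also work when both inputs come from commands, as in the question's command1 and command2. The following is a small sketch assuming Bash, with `seq` standing in for the two commands. A single grep process reads each pipe exactly once, so nothing is drained prematurely. The second and third calls illustrate what -x changes.

```shell
#!/usr/bin/env bash
# Patterns come from the first process substitution, data from the second.
exact=$(grep -Fxf <(seq 25 30) <(seq 50))
printf '%s\n' "$exact"

# Without -x a fixed string matches anywhere in the line,
# so the pattern 5 also selects 15, 25, 35, 45 and 50.
loose=$(grep -Ff <(echo 5) <(seq 50))

# With -x only the line that is exactly "5" is selected.
strict=$(grep -Fxf <(echo 5) <(seq 50))
printf 'loose: %s\nstrict: %s\n' "$(echo $loose)" "$strict"
```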
With GNU Parallel it looks like this:
cat test5.txt | parallel 'grep {} <(cat test50.txt)'