GNU Parallel: Run bash code that reads (seq number) from pipe?

Question

I would like parallel to read the (seq numbers) pipe, so I want to run something like this:

seq 2000 | parallel --max-args 0 --jobs 10 "{ read test; echo $test; }"

which should be equivalent to running:

echo 1
echo 2
echo 3
echo 4
...
echo 2000

Unfortunately, the jobs started by parallel do not read the pipe, so it actually runs like:

echo
echo
echo
...
echo

and the output is empty.

Does anyone know how to make parallel read the (seq numbers) pipe? Thanks.

Answer 1

An alternative with GNU xargs that does not need GNU parallel:

seq 2000 | xargs -P 10 -I {} "echo" "hello world {}"

Output (line order may vary, since up to 10 processes run at once):

hello world 1
hello world 2
hello world 3
hello world 4
hello world 5
.
.
.

From man xargs:

-P max-procs: Run up to max-procs processes at a time; the default is 1. If max-procs is 0, xargs will run as many processes as possible at a time.

-I replace-str: Replace occurrences of replace-str in the initial-arguments with names read from standard input.
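As a small illustration of the -I behavior quoted above, here is a minimal sketch. The two-line input and the `[{}]` wrapper are made up for the example; the point is that with -I each input line is treated as one item, so a line containing a space is not split.

```shell
# With -I, xargs runs the command once per input line and substitutes
# the whole line for {}. The sample input below is illustrative only.
result=$(printf 'a b\nc\n' | xargs -I {} echo "[{}]")
printf '%s\n' "$result"
# prints:
# [a b]
# [c]
```

No -P is used here, so the commands run one at a time and the output order matches the input order.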

Answer 2

Using xargs instead of parallel, while still using a shell (rather than starting a new copy of the /bin/echo executable for each line), would look like:

seq 2000 | xargs -P 10 \
  sh -c 'for arg in "$@"; do echo "hello world $arg"; done' _

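This is likely to be faster than Cyrus's existing answer (the -I {} version above), because starting executables takes time. Starting a new copy of /bin/sh does take longer than starting a copy of /bin/echo, but because this version does not use -I {}, xargs can pass many arguments to each copy of /bin/sh, amortizing that startup cost over many numbers. It also lets us use the echo built into sh instead of the separate echo executable.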
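To see how the batching plays out, here is a minimal sketch that prints the number of arguments each sh invocation receives. The `-n 200` batch size is an assumption added for illustration and is not part of the answer above. Without some `-n` limit, xargs packs as many arguments as fit on one command line, and 2000 short numbers will typically fit in a single invocation, leaving nothing for -P 10 to parallelize.

```shell
# Each sh invocation prints "$#", the count of arguments it was given,
# so there is one output line per invocation.
# -n 200 is an illustrative batch size, not taken from the original answer.
batch_sizes=$(seq 2000 | xargs -P 10 -n 200 sh -c 'echo "$#"' _)
batch_count=$(printf '%s\n' "$batch_sizes" | wc -l | tr -d ' ')
echo "invocations: $batch_count"
printf '%s\n' "$batch_sizes" | sort -u   # distinct batch sizes seen
```

A smaller -n gives more parallelism but more process startups; a larger one does the opposite, so the right value depends on how expensive each item is to process.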

Answer 3

You want the input to be piped into the command you run, so use --pipe:

seq 2000 |
   parallel --pipe -N1 --jobs 10 'read test; echo $test;'
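Note that the command above is in single quotes. The sketch below suggests why that matters, using plain `sh -c` as a stand-in for the shell that runs each job (an assumption made so the example runs without GNU parallel installed). Inside double quotes, the calling shell expands `$test` before the command string is handed over, so the child never sees the variable.

```shell
unset test

# Double quotes: the calling shell expands $test (unset here) first,
# so the child shell effectively runs "read test; echo ".
double=$(echo 1 | sh -c "read test; echo $test")

# Single quotes: the string is passed through untouched, and the child
# shell expands $test after read has assigned it.
single=$(echo 1 | sh -c 'read test; echo $test')

printf 'double=[%s] single=[%s]\n' "$double" "$single"
# prints: double=[] single=[1]
```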

But if you really just need it as a variable, I would do one of these:

seq 2000 | parallel --jobs 10 echo
seq 2000 | parallel --jobs 10 echo {}
seq 2000 | parallel --jobs 10 'test={}; echo $test'

I encourage you to spend 20 minutes reading chapters 1 and 2 of https://doi.org/10.5281/zenodo.1146014. Your command line will love you for it.
