How do I create a stack or LIFO for GNU Parallel in Bash?

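Although my original problem was solved a different way (see the comment thread under this question, and the edits to this question), I was able to create a stack/LIFO for GNU Parallel in Bash. So I am editing my background/question to reflect a situation where it might be needed.

Background

I am using GNU Parallel to process files from a Bash script. As files are processed, more files are created, and new commands need to be added to parallel's list. I cannot give parallel a complete list of commands up front, because the information is only generated while the initial files are being processed.

I need a way to add lines to parallel's list while it is running.

Parallel also needs to wait for new lines if nothing is in the queue, and exit once the queue is finished.

Solution

First I created a fifo: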

mkfifo /tmp/fifo
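A fixed path such as /tmp/fifo can collide with another run or with a leftover file of the same name. Below is a minimal sketch of one way around that, assuming a private temporary directory is acceptable. The names workdir, fifo and cleanup are illustrative and not part of the original setup.

```shell
#!/bin/bash
# Assumption: a per-run temporary directory replaces the fixed /tmp/fifo path.
workdir=$(mktemp -d)
fifo="$workdir/queue.fifo"   # hypothetical file name

# Create the named pipe inside the private directory.
mkfifo "$fifo"

# Remove the fifo and its directory when the script exits.
cleanup() { rm -rf "$workdir"; }
trap cleanup EXIT

# -p is true only if the path exists and is a named pipe.
if [ -p "$fifo" ]; then
    echo "fifo ready: $fifo"
fi
```

The rest of the setup would then use "$fifo" wherever /tmp/fifo appears.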

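Next I created a bash file that cats the fifo and pipes the output to parallel, which checks for an end_of_file line. (I wrote this with help from the accepted answer as well as from here.)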

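#!/bin/bash
# cat exits each time a writer closes the fifo, so restart it in a loop
# to keep parallel's stdin open between writes.
while true; do
    cat /tmp/fifo
# Each line read is run as a command ("{}"); a line reading
# "end_of_file" tells parallel to stop reading input.
done | parallel --ungroup --gnu --eof "end_of_file" "{}"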
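The loop is needed because a single cat sees end-of-file as soon as the last writer closes the fifo. A different technique, not the one used in this post, is to hold the fifo open read-write on a spare file descriptor so the reader never sees end-of-file between writers. The sketch below illustrates that idea together with the end_of_file sentinel. It assumes Linux-style fifo behaviour and uses a plain read loop as a stand-in for parallel so that it runs without GNU Parallel installed. The variable names and file descriptor 3 are arbitrary choices.

```shell
#!/bin/bash
# Sketch: keep a fifo open across several writers and stop at a sentinel.
# Assumption: opening a fifo read-write does not block (true on Linux).
workdir=$(mktemp -d)
fifo="$workdir/queue.fifo"   # hypothetical path
mkfifo "$fifo"

# Hold the fifo open for reading and writing on fd 3.
exec 3<>"$fifo"

# Three independent writers; each opens, writes one line, and closes.
echo "job one" > "$fifo"
echo "job two" > "$fifo"
echo "end_of_file" > "$fifo"

# Stand-in consumer: collect lines until the sentinel arrives.
received=()
while IFS= read -r line <&3; do
    [ "$line" = "end_of_file" ] && break
    received+=("$line")
done

# Close the descriptor and clean up.
exec 3<&-
rm -rf "$workdir"

printf 'received: %s\n' "${received[@]}"
```

With the real tool, the read loop would be replaced by something like parallel reading from fd 3 (for example with `<&3`), which is untested here.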

Then I write to the pipe with this command to add lines to parallel's queue:

echo "command here" > /tmp/fifo
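When jobs are submitted from several places in a script, it can help to wrap the write in small functions. The following is a minimal sketch, not part of the original post: submit_job, finish_queue and queue_fifo are hypothetical names, and the sentinel matches the --eof value used above. printf is used instead of echo so that a command line starting with -n or containing backslashes is written unchanged.

```shell
#!/bin/bash
# Path of the queue fifo; assumed to be the one created earlier.
queue_fifo=/tmp/fifo

# Hypothetical helper: append one command line to the queue.
# Pass the whole command as a single string.
submit_job() {
    printf '%s\n' "$1" > "$queue_fifo"
}

# Hypothetical helper: send the sentinel that parallel's --eof looks for.
finish_queue() {
    printf '%s\n' "end_of_file" > "$queue_fifo"
}

# Usage (needs a reader on the fifo, otherwise the write blocks):
#   submit_job "gzip -9 /path/to/new/file"
#   finish_queue
```

Note that a write to a fifo blocks until something has it open for reading, so these helpers should only be called once the reader script is running.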

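With this setup, all new commands are added to the queue. Parallel begins processing once the queue is full. This means that if you have slots for 32 jobs (32 processors), you need to add 32 jobs before the queue starts.

If all of parallel's processors are occupied, it holds further jobs until a processor becomes available.

By using the --ungroup argument, parallel processes jobs and prints their output as they are added to the queue, once the queue is full.

Without the --ungroup argument, parallel holds back a finished job's output until more jobs have been started. From the accepted answer: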

Output from the running or completed jobs are held back and will only be printed when JobSlots more jobs has been started (unless you use --ungroup or -u, in which case the output from the jobs are printed immediately). E.g. if you have 10 jobslots then the output from the first completed job will only be printed when job 11 has started, and the output of second completed job will only be printed when job 12 has started.

From http://www.gnu.org/software/parallel/man.html#EXAMPLE:-GNU-Parallel-as-queue-system-batch-manager

There is a a small issue when using GNU parallel as queue system/batch manager: You have to submit JobSlot number of jobs before they will start, and after that you can submit one at a time, and job will start immediately if free slots are available. Output from the running or completed jobs are held back and will only be printed when JobSlots more jobs has been started (unless you use --ungroup or -u, in which case the output from the jobs are printed immediately). E.g. if you have 10 jobslots then the output from the first completed job will only be printed when job 11 has started, and the output of second completed job will only be printed when job 12 has started.