Exclude some patterns in GNU parallel

I want to use parallel to process some of the files in a directory.

I have two tasks:

1. I want to skip some files. For example, I run

parallel -j 16 'zcat {} > {.}.unpacked' ::: *.gz

but I want to exclude files that match a certain pattern when running this command. How can I do that?

2. When a job exits with an error while processing a file, how can I skip that failure and carry on with the other files?

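You are a bit vague about what you want to exclude, but suppose you want to process all gzip files except those whose names start with the letter a:
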
find -maxdepth 1 -iname "*.gz" ! -iname "a*" -print0 | parallel -0 'zcat {} > {.}.unpacked'
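
If find is not available or its options differ on your system, the same filtering can be sketched in plain shell. This is only an illustration: list_gz_except is a hypothetical helper name, not part of GNU Parallel, and unlike -iname it matches case-sensitively.

```shell
# list_gz_except DIR GLOB
# Hypothetical helper: print the *.gz files in DIR, NUL-terminated,
# skipping every file whose base name matches the shell glob GLOB.
list_gz_except() (
    dir=$1
    exclude=$2
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue        # no *.gz at all: the glob stayed literal
        case ${f##*/} in
            $exclude) continue ;;      # base name matches the exclude glob
        esac
        printf '%s\0' "$f"             # NUL-terminated, safe for odd file names
    done
)

# Usage (assumes GNU Parallel is installed):
#   list_gz_except . 'a*' | parallel -0 -j 16 'zcat {} > {.}.unpacked'
```

The NUL-terminated output pairs with parallel -0 in the same way as find -print0 does above, so file names containing spaces or newlines are passed through intact.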

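Regarding your second question, GNU Parallel's default behaviour is to continue after an error, so you do not need to do anything explicitly. If you want to change that, look at the --halt option: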

--halt now,fail=1 exit when the first job fails. Kill running jobs.

--halt soon,fail=3 exit when 3 jobs fail, but wait for running jobs to complete.

--halt soon,fail=3% exit when 3% of the jobs have failed, but wait for running jobs to complete.

--halt now,success=1 exit when a job succeeds. Kill running jobs.

--halt soon,success=3 exit when 3 jobs succeed, but wait for running jobs to complete.

--halt now,success=3% exit when 3% of the jobs have succeeded. Kill running jobs.

--halt now,done=1 exit when one of the jobs finishes. Kill running jobs.

--halt soon,done=3 exit when 3 jobs finish, but wait for running jobs to complete.

--halt now,done=3% exit when 3% of the jobs have finished. Kill running jobs.
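
Because the default is to keep going after a failure, it can be handy to see afterwards which jobs failed. A minimal sketch, assuming you run parallel with its --joblog option (which writes a tab-separated log with a header line, exit value in column 7, signal in column 8 and the command in column 9). failed_jobs and unpack.log are hypothetical names chosen for this example.

```shell
# failed_jobs LOGFILE
# Hypothetical helper: print the command of every job in a GNU Parallel
# joblog that exited non-zero or was killed by a signal.
failed_jobs() {
    # NR > 1 skips the header line; $7 = Exitval, $8 = Signal, $9 = Command
    awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $9 }' "$1"
}

# Usage (assumes GNU Parallel is installed):
#   parallel -j 16 --joblog unpack.log 'zcat {} > {.}.unpacked' ::: *.gz
#   failed_jobs unpack.log
```

The exit status of parallel itself also reports failures (it is the number of failed jobs, capped at 101), so a wrapper script can test $? before consulting the log.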

If you do not want to use find, you can use skip():

parallel -j 16 'zcat {= /mypattern/ and skip() =} > {.}.unpacked' ::: *.gz

/mypattern/ is a Perl regular expression matched against the current argument; any Perl code can be used in its place.