How to use substitution in xargs?

What I want to do is:

  1. Find all files with the .txt extension
  2. Copy them to .dat files

It can be done like this:

for f in `find . -type f -name "*.txt"`; do cp $f ${f%.txt}.dat; done

I want to do this with xargs, and I tried this:

find . -type f -name "*.txt" | xargs -i cp {} ${{}%.txt}.dat

This gave me the following error:

bad substitution

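I have the following questions about this:

  1. How do I do the substitution correctly?
  2. Does xargs run things in parallel, whereas the for loop does things one by one?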

You can use:

find . -type f -name "*.txt" -print0 |
xargs -0 -i bash -c 'echo cp "$1" "${1%.txt}.dat"' - '{}'

  1. How do I do the substitution correctly?

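You can't use substitution the way you tried, because {} is not a bash variable (it is only part of the xargs syntax), so bash cannot perform parameter expansion on it.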

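A better way is to build a complete bash command and pass it as an argument to xargs (e.g. xargs -0 -i bash -c 'echo cp "$1" "${1%.txt}.dat"' - '{}'). xargs substitutes the file name for '{}', bash receives it as $1 (the - before it only fills $0), and that way you can use bash substitution. The echo only prints each cp command; remove it to actually copy the files.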
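To see the bash -c wrapper working end to end, here is a minimal sketch that runs it against a scratch directory. The directory layout and file names are made up for the demonstration, and it uses -I {} and _ where the command above uses -i and -, since GNU xargs documents -i as deprecated in favor of -I.

```shell
# Scratch directory with hypothetical sample files (one path contains spaces).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub dir"
printf 'one\n' > "$demo_dir/a.txt"
printf 'two\n' > "$demo_dir/sub dir/b c.txt"

# -print0 and -0 pass the names NUL-delimited, so whitespace in names is safe.
# xargs replaces {} with one file name; bash sees it as $1 ("_" fills $0),
# and ${1%.txt} strips the .txt suffix inside bash, not inside xargs.
find "$demo_dir" -type f -name '*.txt' -print0 |
  xargs -0 -I {} bash -c 'cp "$1" "${1%.txt}.dat"' _ {}

# List the copies that were created.
find "$demo_dir" -type f -name '*.dat' | sort
```

This starts one bash per file. If that turns out to be too slow, a batched form such as xargs -0 bash -c 'for f; do cp "$f" "${f%.txt}.dat"; done' _ should give the same result with far fewer processes.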

  2. Does xargs run things in parallel, whereas the for loop does things one by one?

A for loop runs its commands sequentially, and by default so does xargs. However, you can use the -P option of xargs to parallelize it. From the xargs man page:

   -P max-procs, --max-procs=max-procs
          Run up to max-procs processes at a time; the default is 1. If max-procs is 0, xargs will run as many processes as possible at a time. Use the -n option or the -L option with -P; otherwise chances are that only one exec will be done. While xargs is running, you can send its process a SIGUSR1 signal to increase the number of commands to run simultaneously, or a SIGUSR2 to decrease the number. You cannot increase it above an implementation-defined limit (which is shown with --show-limits). You cannot decrease it below 1. xargs never terminates its commands; when asked to decrease, it merely waits for more than one existing command to terminate before starting another.

          Please note that it is up to the called processes to properly manage parallel access to shared resources. For example, if more than one of them tries to print to stdout, the output will be produced in an indeterminate order (and very likely mixed up) unless the processes collaborate in some way to prevent this. Using some kind of locking scheme is one way to prevent such problems. In general, using a locking scheme will help ensure correct output but reduce performance. If you don't want to tolerate the performance difference, simply arrange for each process to produce a separate output file (or otherwise use separate resources).
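The effect of -P together with -n is easiest to see with jobs that just sleep. The following is a rough sketch, not a benchmark: the four numbered jobs and the one-second sleep are placeholders chosen for the demonstration.

```shell
# Four hypothetical jobs that each sleep for one second.
# -n 1 gives every input item its own command; -P 4 lets four run at once.
start=$(date +%s)
parallel_out=$(printf '%s\n' 1 2 3 4 |
  xargs -n 1 -P 4 bash -c 'sleep 1; echo "job $1 finished"' _)
end=$(date +%s)

# Completion order is not guaranteed, so sort before printing.
printf '%s\n' "$parallel_out" | sort
echo "elapsed: about $((end - start))s"
```

With -P 4 the whole run should take roughly as long as a single job; dropping -P (or using -P 1) should take roughly four times as long, because the jobs then run one after another.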

If you are not comfortable with the bash -c '...' - construct, you can use GNU Parallel instead:

find . -type f -name "*.txt" -print0 | parallel -0 cp {} {.}.dat

xargs and similar tools are not as flexible as Perl for this kind of problem.

~ ❱ find . | perl -lne '-f && ($old=$_) && s/\.txt$/.dat/ && print "$old => $_"'
./dir/00.file.txt => ./dir/00.file.dat
./dir/06.file.txt => ./dir/06.file.dat
./dir/05.file.txt => ./dir/05.file.dat
./dir/02.file.txt => ./dir/02.file.dat
./dir/08.file.txt => ./dir/08.file.dat
./dir/07.file.txt => ./dir/07.file.dat
./dir/01.file.txt => ./dir/01.file.dat
./dir/04.file.txt => ./dir/04.file.dat
./dir/03.file.txt => ./dir/03.file.dat
./dir/09.file.txt => ./dir/09.file.dat

Then, instead of the print function, use: rename $old, $_

With this one-liner you can rename anything you like. Note that rename moves each file rather than copying it.


To force xargs to run in parallel, use -P, for example:

ls *.mp4 | xargs -I xxx -P 0 ffmpeg -i xxx xxx.mp3

This converts all .mp4 files to .mp3 in parallel. So if you have 10 .mp4 files, 10 ffmpeg processes run at the same time.
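Two things to keep in mind with that one-liner: the output files end up named like name.mp4.mp3, and -P 0 starts as many ffmpeg processes as xargs can. A possible variation is sketched below as a dry run. It strips the .mp4 suffix with the bash -c technique from the earlier answer and caps concurrency at 2; the scratch directory, the placeholder files, and the limit of 2 are all assumptions made for the example.

```shell
# Scratch directory with empty placeholder files standing in for real videos.
media_dir=$(mktemp -d)
: > "$media_dir/clip one.mp4"
: > "$media_dir/clip2.mp4"

# find -print0 with xargs -0 keeps unusual file names intact; -P 2 caps concurrency.
# The echo makes this a dry run: each ffmpeg command is printed, not executed.
planned=$(find "$media_dir" -type f -name '*.mp4' -print0 |
  xargs -0 -P 2 -I {} bash -c 'echo ffmpeg -i "$1" "${1%.mp4}.mp3"' _ {} | sort)
printf '%s\n' "$planned"
```

Once the printed commands look right, removing the echo should run the real conversions.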