Run Serial inside Parallel Bash

Question

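I've added some detail to my explanation. Conceptually, I am running a script that loops over a file and calls shell scripts that take each line's content as an input parameter. (FYI: "a" kicks off an execution, and "b" monitors that execution.)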

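I first need 1a and 1b to run, in parallel, for the first two $param values.
Next, when step 1 is complete, 2a and 2b need to run in serial for both $param values.
3a and 3b will start once 2a and 2b are complete (serial or parallel, it doesn't matter).
The loop then continues with the next two lines of the input .txt.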

I can't get it to process the second step in serial, only in parallel. What I need is the following:

cat filename | while read -r line; do
export param=$line
./script1a.sh "$param" > process.lg && ./script2b.sh > monitor.log &&
## wait for processes to finish, running 2 in parallel in script1.sh
./script2a.sh "$param" > process2.log && ./script2b.sh > minitor2.log &&
## run each of the 2 in serial for script2.sh
./script3a.sh && ./script3b.sh

I tried adding a wait, and I tried an if statement wrapping script2a.sh and script2b.sh so that they would run in serial, but to no avail.

if (( ++i % 2 == 0 )); then wait; fi
done
#only run two lines at a time, then cycle back through loop

Given that script1 runs in parallel, how on earth can I get script2.sh to run in serial?

Answer 1

Locking!

If you want to parallelize script1 and script3, but need all invocations of script2 to be serialized, continue to use:

./script1.sh && ./script2.sh && ./script3.sh &

...but modify script2 to grab a lock before it does anything else:

#!/bin/bash
exec 3>.lock2
flock -x 3
# ... continue with script2's business here.

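Note that you must not delete the .lock2 file used here. Doing so risks allowing multiple processes to think they hold the lock at the same time.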
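To see the effect of the lock, here is a minimal sketch. It assumes the flock utility from util-linux is installed; the function name serialized_step, the temporary directory, and the log file are made up for the demonstration. Three background jobs start together, but the guarded section of each runs strictly one at a time.

```shell
#!/bin/bash
# Hypothetical demo: three jobs run in parallel, but the part guarded by
# flock is entered by only one of them at a time.
workdir=$(mktemp -d)
lockfile="$workdir/.lock2"
logfile="$workdir/order.log"

serialized_step() {
    # Open the lock file on FD 3 and block until we hold an exclusive lock.
    exec 3>"$lockfile"
    flock -x 3
    echo "start $1" >> "$logfile"
    sleep 0.2
    echo "end $1" >> "$logfile"
    # The lock is released when FD 3 closes, here when the subshell exits.
}

for n in 1 2 3; do
    ( sleep 0.1; serialized_step "$n" ) &   # the sleep stands in for parallel work
done
wait
cat "$logfile"
```

Every "start N" line in the log should be followed directly by its own "end N" line. The order in which the three jobs acquire the lock is not fixed.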

Answer 2

I am not 100% certain what you mean by your question, but I now think you mean something like this in your inner loop:

(
   # run script1 and script2 in parallel
   script1 &
   s1pid=$!

   # start no more than one script2 using GNU Parallel as a mutex
   sem --fg script2

   # when they are both done...
   wait $s1pid

   # run script3
   script3

) &    # and do that lot in parallel with previous/next loop iteration

Answer 3

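You are not showing us how the lines you read from the file are being consumed.
If I understand your question correctly, you want to run script1 on two lines of filename, each in parallel, and then run script2 serially when both are done?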

while read -r first; do
    echo "$first" | ./script1.sh &
    read -r second
    echo "$second" | ./script1.sh &
    wait
    ./script2.sh &    # optionally don't background here?
    ./script3.sh
done <filename &

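The while loop contains two read statements, so each iteration reads two lines from filename and feeds each to a separate instance of script1. Then we wait until they are both done before we run script2. I background it so that script3 can start while it runs, and I also background the whole while loop. You probably don't need to background the entire job by default, though: development will be much easier if you write it as a regular foreground job and, once it works, background the whole thing when you start it if you need to.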
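The two-reads-per-iteration pattern can be tried out with stand-ins. In this sketch the functions script1 and script2, the sample input, and the log file are all hypothetical replacements for the real scripts. It also guards the second read, so an input with an odd number of lines does not launch a job with an empty argument.

```shell
#!/bin/bash
# Hypothetical stand-ins for script1.sh and script2.sh, plus sample input.
workdir=$(mktemp -d)
printf '%s\n' alpha beta gamma delta > "$workdir/filename"
log="$workdir/events.log"

script1() { read -r item; sleep 0.1; echo "script1 $item" >> "$log"; }
script2() { echo "script2 batch $1" >> "$log"; }

batch=0
while read -r first; do
    echo "$first" | script1 &
    # With an odd number of lines, the last batch holds a single job.
    if read -r second; then
        echo "$second" | script1 &
    fi
    wait                     # both script1 instances have finished here
    batch=$((batch + 1))
    script2 "$batch"         # runs alone, after the pair
done < "$workdir/filename"
cat "$log"
```

Within a batch the two script1 lines may appear in either order, but each "script2 batch N" line always comes after both of them.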

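I can think of many variations depending on how you actually want your data to flow; here is one in response to your recently updated question.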

export param  # is this really necessary?
while read param; do
    # First instance
    ./script1a.sh "$param" > process.lg  &&
    ./script2b.sh > monitor.log &

    # Second instance
    read param
    ./script2a.sh "$param" > process2.log && ./script2b.sh > minitor2.log &

    # Wait for both to finish
    wait

    ./script3a.sh && ./script3b.sh
done <filename

If this still doesn't help, maybe you should post a third question where you really explain what you want...

Answer 4

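I understand your question as follows:

You have a list of models. These models need to be run. After they have run, they must be transferred. The simple solution is: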

run_model model1
transfer_result model1
run_model model2
transfer_result model2

But to speed things up we want to parallelize parts of it. Unfortunately, transfer_result cannot be parallelized.

run_model model1
run_model model2
transfer_result model1
transfer_result model2

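model1 and model2 are read from a text file. run_model can be run in parallel, and you want two of them running at once. transfer_result can only run one at a time, and a result can only be transferred after it has been computed.

This can be done like this: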

cat models.txt | parallel -j2 'run_model {} && sem --id transfer transfer_model {}'

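run_model {} && sem --id transfer transfer_model {} will run one model and, if it succeeds, transfer it. The transfer will only start when no other transfer is running.

parallel -j2 will run two of these jobs in parallel.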

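If transferring takes less time than computing a model, you should see no surprises: a transfer will at most be swapped with the next transfer. If transferring takes longer than running a model, you may see the models transferred completely out of order (for example, job 10 may be transferred before job 2). But they will all be transferred eventually.

You can see the execution order illustrated with this: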

seq 10 | parallel -uj2 'echo ran model {} && sem --id transfer "sleep .{};echo transferred {}"'

This solution is better than the wait-based solution, because you can run model 3 while models 1+2 are being transferred.
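If GNU Parallel is not available, a similar shape can be approximated with other tools. The sketch below is a swapped-in technique, not the answer's method: xargs -P 2 takes the place of parallel -j2, and flock takes the place of sem as the mutex. It assumes an xargs that supports -P and the flock utility from util-linux. The sleep calls and echo lines are hypothetical stand-ins for run_model and transfer_result.

```shell
#!/bin/bash
# Sketch: two compute jobs at a time, transfers strictly one at a time.
workdir=$(mktemp -d)
lock="$workdir/transfer.lock"
log="$workdir/transfer.log"
export lock log

printf 'model%s\n' 1 2 3 4 |
    xargs -P 2 -I{} bash -c '
        sleep 0.1 &&                 # stand-in for: run_model "$1"
        (
            flock -x 9               # wait for the transfer lock on FD 9
            echo "begin $1" >> "$log"
            sleep 0.05               # stand-in for: transfer_result "$1"
            echo "done $1" >> "$log"
        ) 9>"$lock"
    ' _ {}
cat "$log"
```

As with the sem version, there is no barrier between batches: a new model can start computing while an earlier one is still being transferred, and the transfer order is not guaranteed to match the input order.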

Answer 5

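@tripleee In case you're interested, I put the following together. (Note: I changed some variables for the post, so apologies if there are inconsistencies anywhere. The exports also have their reasons. I think there's a better way than exporting, but for now it works.)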

cat input.txt | while read first; do
export step=${first//\"/}
export stepem=EM_${step//,/_}
export steptd=TD_${step//,/_}
export stepeg=EG_${step//,/_}
echo "$step" |  $directory"/ws_client.sh" processOptions  "$appName" "$step" "$layers" "$stages" "" "$stages" "$stages" FALSE > "$Folder""/""$stepem""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepem" > "$Folder""/""$stepem""_Monitor.log" &
read second
export step2=${second//\"/}
export stepem2=ExecuteModel_${step2//,/_}
export steptd2=TransferData_${step2//,/_}
export stepeg2=ExecuteGeneology_${step2//,/_}
echo "$step2" |  $directory"/ws_client.sh" processOptions  "$appName" "$step2" "$layers" "$stages" "" "$stages" "$stages" FALSE > "$Folder""/""$stepem2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepem2" > "$Folder""/""$stepem2""_Monitor.log" &
wait
$directory"/ws_client.sh" processOptions "$appName" "$step" "$layers" "" ""  "$stage_final" "" TRUE > "$appLogFolder""/""$steptd""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$steptd" > "$Folder""/""$steptd""_Monitor.log" &&
$directory"/ws_client.sh" processOptions "$appName" "$step2" "$layers" "" ""  "$stage_final" "" TRUE > "$appLogFolder""/""$steptd2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$steptd2" > "$Folder""/""$steptd2""_Monitor.log" &
wait
$directory"/ws_client.sh" processPaths "$appName" "$step" "$layers" "$genPath_01" > "$appLogFolder""/""$stepeg""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepeg" > "$Folder""/""$stepeg""_Monitor.log" &&
$directory"/ws_client.sh" processPaths "$appName" "$step2" "$layers" "$genPath_01" > "$appLogFolder""/""$stepeg2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepeg2" > "$Folder""/""$stepeg2""_Monitor.log" &
wait
if (( ++i % 2 == 0 ))
then
echo "Waiting..."
wait
fi
done

