运行 两个 pmap 在 Julia 中异步使用不同的 worker
Run two pmap with different workers asynchronously in Julia
我有一个与 pmap 一起使用的函数来并行化它。我想 运行 4 次这个函数异步使用 10 个工人,但我不能同时 运行 两个或更多 pmap。
我在 linux 上使用 Julia v1.1 和一台 40-CPU 机器。
using Distributed
addprocs(4)
@everywhere function TestParallel(x)
a = 0
while a < 4
println("Value = ",x, " in worker = ", myid())
sleep(1)
a += 1
end
end
a = WorkerPool([2,3])
b = WorkerPool([4,5])
c = [i for i = 1:10]
@sync @async for i in c
pmap(x-> TestParallel(x), a, c)
pmap(x-> TestParallel(x), b, c)
end
我希望有:
From worker 2: Value = 1 in worker = 2
From worker 3: Value = 2 in worker = 3
From worker 4: Value = 3 in worker = 4
From worker 5: Value = 4 in worker = 5
所以 c 的前两个元素转到第一个 pmap,接下来的两个元素转到第二个 pmap,然后谁先完成就得到接下来的两个元素。
现在我获得:
From worker 2: Value = 1 in worker = 2
From worker 3: Value = 2 in worker = 3
From worker 2: Value = 1 in worker = 2
From worker 3: Value = 2 in worker = 3
在第一个 pmap 完成 c 的所有元素后,第二个 pmap 重新开始解决所有元素。
From worker 2: Value = 9 in worker = 2
From worker 3: Value = 10 in worker = 3
From worker 5: Value = 2 in worker = 5
From worker 4: Value = 1 in worker = 4
您的问题存在一些问题:@sync
和 @async
使用绿色线程,您希望分配您的计算。语法 @sync @async [some code]
异步生成代码并等待它完成。因此实际上它与 [some code]
具有相同的含义。
虽然您的问题不清楚,但我假设您想使用单独的工作池并行启动 2 pmap
s(这似乎是您最有可能尝试做的事情)。
在这种情况下,代码如下:
using Distributed
addprocs(4)
@everywhere function testpar2(x)
for a in 0:3
println("Value = $x [$a] in worker = $(myid())")
sleep(0.2)
end
return 1000*myid()+x*x #I assume you want to return some value
end
a = WorkerPool([2,3])
b = WorkerPool([4,5])
c = collect(1:10)
@sync begin
@async begin
res1 = pmap(x-> testpar2(x), a, c)
println("Got res1=$res1")
end
@async begin
res2 = pmap(x-> testpar2(x), b, c)
println("Got res2=$res2")
end
end
当 运行 执行以上代码时,您将看到如下内容:
...
From worker 5: Value = 10 [3] in worker = 5
From worker 2: Value = 10 [3] in worker = 2
From worker 3: Value = 9 [3] in worker = 3
Got res2=[4001, 5004, 5009, 4016, 5025, 4036, 4049, 5064, 4081, 5100]
Got res1=[2001, 3004, 2009, 3016, 2025, 3036, 3049, 2064, 3081, 2100]
Task (done) @0x00000000134076b0
您可以清楚地看到两个 pmap 在不同的工作池上并行 运行。
我有一个与 pmap 一起使用的函数来并行化它。我想 运行 4 次这个函数异步使用 10 个工人,但我不能同时 运行 两个或更多 pmap。
我在 linux 上使用 Julia v1.1 和一台 40-CPU 机器。
using Distributed
addprocs(4)
@everywhere function TestParallel(x)
a = 0
while a < 4
println("Value = ",x, " in worker = ", myid())
sleep(1)
a += 1
end
end
a = WorkerPool([2,3])
b = WorkerPool([4,5])
c = [i for i = 1:10]
@sync @async for i in c
pmap(x-> TestParallel(x), a, c)
pmap(x-> TestParallel(x), b, c)
end
我希望有:
From worker 2: Value = 1 in worker = 2
From worker 3: Value = 2 in worker = 3
From worker 4: Value = 3 in worker = 4
From worker 5: Value = 4 in worker = 5
所以 c 的前两个元素转到第一个 pmap,接下来的两个元素转到第二个 pmap,然后谁先完成就得到接下来的两个元素。
现在我获得:
From worker 2: Value = 1 in worker = 2
From worker 3: Value = 2 in worker = 3
From worker 2: Value = 1 in worker = 2
From worker 3: Value = 2 in worker = 3
在第一个 pmap 完成 c 的所有元素后,第二个 pmap 重新开始解决所有元素。
From worker 2: Value = 9 in worker = 2
From worker 3: Value = 10 in worker = 3
From worker 5: Value = 2 in worker = 5
From worker 4: Value = 1 in worker = 4
您的问题存在一些问题:@sync
和 @async
使用绿色线程,您希望分配您的计算。语法 @sync @async [some code]
异步生成代码并等待它完成。因此实际上它与 [some code]
具有相同的含义。
虽然您的问题不清楚,但我假设您想使用单独的工作池并行启动 2 pmap
s(这似乎是您最有可能尝试做的事情)。
在这种情况下,代码如下:
using Distributed
addprocs(4)
@everywhere function testpar2(x)
for a in 0:3
println("Value = $x [$a] in worker = $(myid())")
sleep(0.2)
end
return 1000*myid()+x*x #I assume you want to return some value
end
a = WorkerPool([2,3])
b = WorkerPool([4,5])
c = collect(1:10)
@sync begin
@async begin
res1 = pmap(x-> testpar2(x), a, c)
println("Got res1=$res1")
end
@async begin
res2 = pmap(x-> testpar2(x), b, c)
println("Got res2=$res2")
end
end
当 运行 执行以上代码时,您将看到如下内容:
...
From worker 5: Value = 10 [3] in worker = 5
From worker 2: Value = 10 [3] in worker = 2
From worker 3: Value = 9 [3] in worker = 3
Got res2=[4001, 5004, 5009, 4016, 5025, 4036, 4049, 5064, 4081, 5100]
Got res1=[2001, 3004, 2009, 3016, 2025, 3036, 3049, 2064, 3081, 2100]
Task (done) @0x00000000134076b0
您可以清楚地看到两个 pmap 在不同的工作池上并行 运行。