Julia - @spawn 按顺序而不是并行计算作业
Julia - @spawn computing jobs sequentially instead of parallel
我正在尝试 运行 使用 @spawn
宏在 Julia(版本 1.1.0)中并行执行一个函数。
我注意到使用 @spawn
作业实际上是按顺序执行的(尽管来自不同的工人)。
使用并行计算作业的 [pmap][1] 函数时不会发生这种情况。
以下是 main.jl
程序的代码,该程序调用应执行的函数(在模块 hello_module
中):
#### MAIN START ####
# deploy the workers
addprocs(4)
# load modules with multi-core functions
@everywhere include(joinpath(dirname(@__FILE__), "hello_module.jl"))
# number of cores
cpus = nworkers()
# print hello world in parallel
hello_module.parallel_hello_world(cpus)
[1]: https://docs.julialang.org/en/v1/stdlib/Distributed/#Distributed.pmap
...这是模块的代码:
module hello_module
using Distributed
using Printf: @printf
using Base
"""Print Hello World on STDOUT"""
function hello_world()
println("Hello World!")
end
"""Print Hello World in Parallel."""
function parallel_hello_world(threads::Int)
# create array with as many elements as the threads
a = [x for x=1:threads]
#= This would perform the computation in parallel
wp = WorkerPool(workers())
c = pmap(hello_world, wp, a, distributed=true)
=#
# spawn the jobs
for t in a
r = @spawn hello_world()
# @show r
s = fetch(r)
end
end
end # module end
您需要使用绿色线程来管理并行度。
在 Julia 中,它是通过使用 @sync
和 @async
宏来实现的。
请参阅下面的最小工作示例:
using Distributed
addprocs(3)
@everywhere using Dates
@everywhere function f()
println("starting at $(myid()) time $(now()) ")
sleep(1)
println("finishing at $(myid()) time $(now()) ")
return myid()^3
end
function test()
fs = Dict{Int,Future}()
@sync for w in workers()
@async fs[w] = @spawnat w f()
end
res = Dict{Int,Int}()
@sync for w in workers()
@async res[w] = fetch(fs[w])
end
res
end
这里的输出清楚地表明函数正在 运行 并行:
julia> test()
From worker 3: starting at 3 time 2019-04-02T01:18:48.411
From worker 2: starting at 2 time 2019-04-02T01:18:48.411
From worker 4: starting at 4 time 2019-04-02T01:18:48.415
From worker 2: finishing at 2 time 2019-04-02T01:18:49.414
From worker 3: finishing at 3 time 2019-04-02T01:18:49.414
From worker 4: finishing at 4 time 2019-04-02T01:18:49.418
Dict{Int64,Int64} with 3 entries:
4 => 64
2 => 8
3 => 27
编辑:
我建议您管理计算的分配方式。但是,您也可以使用 @spawn
。请注意,在下面的场景中,作业同时分配给了工人。
function test(N::Int)
fs = Dict{Int,Future}()
@sync for task in 1:N
@async fs[task] = @spawn f()
end
res = Dict{Int,Int}()
@sync for task in 1:N
@async res[task] = fetch(fs[task])
end
res
end
这是输出:
julia> test(6)
From worker 2: starting at 2 time 2019-04-02T10:03:07.332
From worker 2: starting at 2 time 2019-04-02T10:03:07.34
From worker 3: starting at 3 time 2019-04-02T10:03:07.332
From worker 3: starting at 3 time 2019-04-02T10:03:07.34
From worker 4: starting at 4 time 2019-04-02T10:03:07.332
From worker 4: starting at 4 time 2019-04-02T10:03:07.34
From worker 4: finishin at 4 time 2019-04-02T10:03:08.348
From worker 2: finishin at 2 time 2019-04-02T10:03:08.348
From worker 3: finishin at 3 time 2019-04-02T10:03:08.348
From worker 3: finishin at 3 time 2019-04-02T10:03:08.348
From worker 4: finishin at 4 time 2019-04-02T10:03:08.348
From worker 2: finishin at 2 time 2019-04-02T10:03:08.348
Dict{Int64,Int64} with 6 entries:
4 => 8
2 => 27
3 => 64
5 => 27
6 => 64
1 => 8
我正在尝试 运行 使用 @spawn
宏在 Julia(版本 1.1.0)中并行执行一个函数。
我注意到使用 @spawn
作业实际上是按顺序执行的(尽管来自不同的工人)。
使用并行计算作业的 [pmap][1] 函数时不会发生这种情况。
以下是 main.jl
程序的代码,该程序调用应执行的函数(在模块 hello_module
中):
#### MAIN START ####
# deploy the workers
addprocs(4)
# load modules with multi-core functions
@everywhere include(joinpath(dirname(@__FILE__), "hello_module.jl"))
# number of cores
cpus = nworkers()
# print hello world in parallel
hello_module.parallel_hello_world(cpus)
[1]: https://docs.julialang.org/en/v1/stdlib/Distributed/#Distributed.pmap
...这是模块的代码:
module hello_module
using Distributed
using Printf: @printf
using Base
"""Print Hello World on STDOUT"""
function hello_world()
println("Hello World!")
end
"""Print Hello World in Parallel."""
function parallel_hello_world(threads::Int)
# create array with as many elements as the threads
a = [x for x=1:threads]
#= This would perform the computation in parallel
wp = WorkerPool(workers())
c = pmap(hello_world, wp, a, distributed=true)
=#
# spawn the jobs
for t in a
r = @spawn hello_world()
# @show r
s = fetch(r)
end
end
end # module end
您需要使用绿色线程来管理并行度。
在 Julia 中,它是通过使用 @sync
和 @async
宏来实现的。
请参阅下面的最小工作示例:
using Distributed
addprocs(3)
@everywhere using Dates
@everywhere function f()
println("starting at $(myid()) time $(now()) ")
sleep(1)
println("finishing at $(myid()) time $(now()) ")
return myid()^3
end
function test()
fs = Dict{Int,Future}()
@sync for w in workers()
@async fs[w] = @spawnat w f()
end
res = Dict{Int,Int}()
@sync for w in workers()
@async res[w] = fetch(fs[w])
end
res
end
这里的输出清楚地表明函数正在 运行 并行:
julia> test()
From worker 3: starting at 3 time 2019-04-02T01:18:48.411
From worker 2: starting at 2 time 2019-04-02T01:18:48.411
From worker 4: starting at 4 time 2019-04-02T01:18:48.415
From worker 2: finishing at 2 time 2019-04-02T01:18:49.414
From worker 3: finishing at 3 time 2019-04-02T01:18:49.414
From worker 4: finishing at 4 time 2019-04-02T01:18:49.418
Dict{Int64,Int64} with 3 entries:
4 => 64
2 => 8
3 => 27
编辑:
我建议您管理计算的分配方式。但是,您也可以使用 @spawn
。请注意,在下面的场景中,作业同时分配给了工人。
function test(N::Int)
fs = Dict{Int,Future}()
@sync for task in 1:N
@async fs[task] = @spawn f()
end
res = Dict{Int,Int}()
@sync for task in 1:N
@async res[task] = fetch(fs[task])
end
res
end
这是输出:
julia> test(6)
From worker 2: starting at 2 time 2019-04-02T10:03:07.332
From worker 2: starting at 2 time 2019-04-02T10:03:07.34
From worker 3: starting at 3 time 2019-04-02T10:03:07.332
From worker 3: starting at 3 time 2019-04-02T10:03:07.34
From worker 4: starting at 4 time 2019-04-02T10:03:07.332
From worker 4: starting at 4 time 2019-04-02T10:03:07.34
From worker 4: finishin at 4 time 2019-04-02T10:03:08.348
From worker 2: finishin at 2 time 2019-04-02T10:03:08.348
From worker 3: finishin at 3 time 2019-04-02T10:03:08.348
From worker 3: finishin at 3 time 2019-04-02T10:03:08.348
From worker 4: finishin at 4 time 2019-04-02T10:03:08.348
From worker 2: finishin at 2 time 2019-04-02T10:03:08.348
Dict{Int64,Int64} with 6 entries:
4 => 8
2 => 27
3 => 64
5 => 27
6 => 64
1 => 8