如何使用 Dask 并行化循环?
How to parallelize a loop with Dask?
我觉得 Dask documentation 很混乱。假设我有一个函数:
import random
import dask
def my_function(arg1, arg2, arg3):
val = random.uniform(arg1, arg2)
va2 = random.uniform(arg2, arg3)
return val1 + val2
some_list = []
for i in range(100):
some_num = dask.delayed(my_function)(arg1, arg2, arg3)
some_list += [some_num]
computed_list = dask.compute(*some_list)
这件事会失败,因为my_function()
没有得到所有 3 个参数。
如何在 dask
中并行化这段代码?
编辑:
如果你在函数 def
之上放置一个 @dask.delayed
装饰器并正常调用它似乎可以工作,但现在 .compute()
-方法行抛出:
KilledWorker: ('my_function-ac3c88f1-53f8-4d36-a520-ff8c40c6ee61', <Worker 'tcp://127.0.0.1:35925', name: 1, memory: 0, processing: 10>)
我先构建一个图,然后对其调用计算:
import random
import dask
@dask.delayed
def my_function(arg1, arg2, arg3):
val1 = random.uniform(arg1, arg2)
val2 = random.uniform(arg2, arg3)
return val1 + val2
arg1 = 1
arg2 = 2
arg3 = 3
some_list = []
for i in range(10):
some_num = my_function(arg1, arg2, arg3)
some_list.append(some_num)
graph = dask.delayed()(some_list)
# graph.visualize()
computed_list = graph.compute()
我觉得 Dask documentation 很混乱。假设我有一个函数:
import random
import dask
def my_function(arg1, arg2, arg3):
val = random.uniform(arg1, arg2)
va2 = random.uniform(arg2, arg3)
return val1 + val2
some_list = []
for i in range(100):
some_num = dask.delayed(my_function)(arg1, arg2, arg3)
some_list += [some_num]
computed_list = dask.compute(*some_list)
这件事会失败,因为my_function()
没有得到所有 3 个参数。
如何在 dask
中并行化这段代码?
编辑:
如果你在函数 def
之上放置一个 @dask.delayed
装饰器并正常调用它似乎可以工作,但现在 .compute()
-方法行抛出:
KilledWorker: ('my_function-ac3c88f1-53f8-4d36-a520-ff8c40c6ee61', <Worker 'tcp://127.0.0.1:35925', name: 1, memory: 0, processing: 10>)
我先构建一个图,然后对其调用计算:
import random
import dask
@dask.delayed
def my_function(arg1, arg2, arg3):
val1 = random.uniform(arg1, arg2)
val2 = random.uniform(arg2, arg3)
return val1 + val2
arg1 = 1
arg2 = 2
arg3 = 3
some_list = []
for i in range(10):
some_num = my_function(arg1, arg2, arg3)
some_list.append(some_num)
graph = dask.delayed()(some_list)
# graph.visualize()
computed_list = graph.compute()