Can I create a dask array with a delayed shape?
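Is it possible to create a dask array from a delayed value by specifying its shape with other delayed values?
My algorithm doesn't give me the shape of the array until quite late in the computation.
Eventually, I will create some blocks whose shapes are determined by intermediate results of my computation, and then call da.concatenate on all the results (or da.block, if that is more flexible).
It wouldn't be a big problem if I can't do this, but it would be cool if I could.
Example code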
from dask import delayed
from dask import array as da
import numpy as np
n_shape = (3, 3)
shape = delayed(n_shape, nout=2)
d_shape = (delayed(n_shape[0]), delayed(n_shape[1]))
# dtype=float replaces the np.float alias, which was removed in NumPy 1.24
n = delayed(np.zeros)(n_shape, dtype=float)
# this doesn't work
# da.from_delayed(n, shape=shape, dtype=float)
# this doesn't work either, but I think goes a little deeper
# into the function call
da.from_delayed(n, shape=d_shape, dtype=float)
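You can't provide a delayed shape, but you can declare that the shape is unknown by using np.nan as the value for any dimension whose size you don't know.
Example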
import random
import numpy as np
import dask
import dask.array as da
@dask.delayed
def f():
    return np.ones((5, random.randint(10, 20)))  # a 5 x ? array
values = [f() for _ in range(5)]
arrays = [da.from_delayed(v, shape=(5, np.nan), dtype=float) for v in values]
x = da.concatenate(arrays, axis=1)
>>> x
dask.array<concatenate, shape=(5, nan), dtype=float64, chunksize=(5, nan)>
>>> x.shape
(5, nan)
>>> x.compute().shape
(5, 88)
Documentation
See http://dask.pydata.org/en/latest/array-chunks.html#unknown-chunks
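If the real sizes are needed later on (for example, to slice the result), one option is to have dask work them out with Array.compute_chunk_sizes(). The following is only a minimal sketch: make_block and the widths list are hypothetical stand-ins for whatever produces your blocks, and fixed widths are used instead of random ones so that the blocks come out the same each time they are evaluated. Note that compute_chunk_sizes() has to run the delayed computation to find the sizes.

```python
import numpy as np
import dask
import dask.array as da


@dask.delayed
def make_block(width):
    # Hypothetical stand-in for a step whose output width dask cannot know up front.
    return np.ones((5, width))


widths = [10, 13, 17]  # illustrative values only

# Declare the second dimension as unknown with np.nan.
blocks = [
    da.from_delayed(make_block(w), shape=(5, np.nan), dtype=float)
    for w in widths
]
x = da.concatenate(blocks, axis=1)

# Evaluate the blocks to fill in the missing chunk sizes.
x = x.compute_chunk_sizes()
known_shape = x.shape

# With known chunk sizes, positional slicing along that axis is possible.
last_column = x[:, -1].compute()
```

Because this triggers the computation of every block, it is probably best called once, at the point where the shape is actually needed, rather than after every intermediate step.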