如何从 CuPy 数组创建一个 dask 数组?
How to create a dask-array from CuPy array?
我正在尝试使用大量数据启动 dask.cluster.Kmeans
。
使用 CPU 是可以的,因为我用 dask.array
包装了 numpy
数组。
由于 cupy
.
中未实现的功能,似乎无法使用 GPU
我尝试重现 Mattew Rocklin 示例 (https://blog.dask.org/2019/01/03/dask-array-gpus-first-steps) 从 CuPy 随机生成器生成随机 dask 数组 - 它有效,但我不想使用这种情况。
用 dask.array
包裹 cupy
- 不起作用。
>>> import dask.array as da
>>> import cupy as cp
>>> da.from_array(cp.arange(100000)).sum().compute()
我希望得到这个数组的总和,但出现以下错误:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/base.py", line 175, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/base.py", line 446, in compute
results = schedule(dsk, keys, **kwargs)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/threaded.py", line 82, in get
**kwargs
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/local.py", line 491, in get_async
raise_exception(exc, tb)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/compatibility.py", line 130, in reraise
raise exc
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/local.py", line 233, in execute_task
result = _execute_task(task, data)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/core.py", line 119, in _execute_task
return func(*args2)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/array/core.py", line 100, in getter
c = np.asarray(c)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/numpy/core/numeric.py", line 538, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: object __array__ method not producing an array
那么我如何通过 dask 数组管理 CuPy 的工作?
从 CuPy 数组创建 Dask 数组时,您需要提供 da.from_array
关键字参数 asarray=False
。所以您的代码将如下所示。
>>> import dask.array as da
>>> import cupy as cp
>>> da.from_array(cp.arange(100000), asarray=False).sum().compute()
我正在尝试使用大量数据启动 dask.cluster.Kmeans
。
使用 CPU 是可以的,因为我用 dask.array
包装了 numpy
数组。
由于 cupy
.
我尝试重现 Mattew Rocklin 示例 (https://blog.dask.org/2019/01/03/dask-array-gpus-first-steps) 从 CuPy 随机生成器生成随机 dask 数组 - 它有效,但我不想使用这种情况。
用 dask.array
包裹 cupy
- 不起作用。
>>> import dask.array as da
>>> import cupy as cp
>>> da.from_array(cp.arange(100000)).sum().compute()
我希望得到这个数组的总和,但出现以下错误:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/base.py", line 175, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/base.py", line 446, in compute
results = schedule(dsk, keys, **kwargs)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/threaded.py", line 82, in get
**kwargs
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/local.py", line 491, in get_async
raise_exception(exc, tb)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/compatibility.py", line 130, in reraise
raise exc
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/local.py", line 233, in execute_task
result = _execute_task(task, data)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/core.py", line 119, in _execute_task
return func(*args2)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/dask/array/core.py", line 100, in getter
c = np.asarray(c)
File "/home/ubuntu/miniconda3/envs/cupy/lib/python3.6/site-packages/numpy/core/numeric.py", line 538, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: object __array__ method not producing an array
那么我如何通过 dask 数组管理 CuPy 的工作?
从 CuPy 数组创建 Dask 数组时,您需要提供 da.from_array
关键字参数 asarray=False
。所以您的代码将如下所示。
>>> import dask.array as da
>>> import cupy as cp
>>> da.from_array(cp.arange(100000), asarray=False).sum().compute()