cuML fit functions throwing cp.full TypeError

I have been trying to install RAPIDS on Google Colab Pro. The cuml and cudf packages installed successfully, but I cannot run even the example scripts.

TL;DR

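Whenever I try to run a cuml fit function on Google Colab, I get the error below. I get it even when I use the demo examples for both the installation and cuml. It happens across a range of cuml examples (the first one I tried to run was UMAP).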

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-c06fc2c31ca3> in <module>()
     13 knn.fit(X_train, y_train)
     14 
---> 15 knn.predict(X_test)

5 frames
cuml/neighbors/kneighbors_regressor.pyx in cuml.neighbors.kneighbors_regressor.KNeighborsRegressor.predict()

cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors.kneighbors()

cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors._kneighbors()

cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors._kneighbors_dense()

/usr/local/lib/python3.7/site-packages/cuml/common/array.py in full(cls, shape, value, dtype, order)
    326         """
    327 
--> 328         return CumlArray(cp.full(shape, value, dtype, order))
    329 
    330     @classmethod

TypeError: full() takes from 2 to 3 positional arguments but 4 were given

Steps taken on Google Colab Pro (to reproduce the error)

As an example, I install the relevant packages using the example notebook from RAPIDS (https://colab.research.google.com/drive/1rY7Ln6rEE1pOlfSHCYOVaqt8OvDO35J0#forceEdit=true&offline=true&sandboxMode=true):

# Install RAPIDS
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!bash rapidsai-csp-utils/colab/rapids-colab.sh stable

import sys, os, shutil

sys.path.append('/usr/local/lib/python3.7/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
os.environ["CONDA_PREFIX"] = "/usr/local"
for so in ['cudf', 'rmm', 'nccl', 'cuml', 'cugraph', 'xgboost', 'cuspatial']:
  fn = 'lib'+so+'.so'
  source_fn = '/usr/local/lib/'+fn
  dest_fn = '/usr/lib/'+fn
  if os.path.exists(source_fn):
    print(f'Copying {source_fn} to {dest_fn}')
    shutil.copyfile(source_fn, dest_fn)
# fix for BlazingSQL import issue
# ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /usr/local/lib/python3.7/site-packages/../../libblazingsql-engine.so)
if not os.path.exists('/usr/lib64'):
    os.makedirs('/usr/lib64')
for so_file in os.listdir('/usr/local/lib'):
  if 'libstdc' in so_file:
    shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib64/'+so_file)
    shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib/x86_64-linux-gnu/'+so_file)

Then I try to run the following example from the cuML documentation (https://docs.rapids.ai/api/cuml/stable/api.html#k-means-clustering):

from cuml.neighbors import KNeighborsRegressor

from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

X, y = make_blobs(n_samples=100, centers=5,
                  n_features=10)

knn = KNeighborsRegressor(n_neighbors=10)

X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.80)

knn.fit(X_train, y_train)

knn.predict(X_test)

This produces the error shown at the start of the question.

Colab retains cupy==7.4.0 even though conda installs cupy==8.6.0 during the RAPIDS install, because Colab's CuPy is a custom install. I just had success installing cupy-cuda110==8.6.0 with pip before installing RAPIDS:

!pip install cupy-cuda110==8.6.0
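
After the pip install, it may be worth confirming which CuPy version the runtime actually imports before re-running the cuML example. The following is a minimal sketch of such a check, not part of the RAPIDS scripts. The helper names `version_tuple` and `is_at_least` are hypothetical, and the 8.6.0 threshold is simply the version mentioned above.

```python
def version_tuple(version):
    """Turn a string such as '8.6.0' into (8, 6, 0).

    Parsing stops at the first component that is not purely numeric,
    so pre-release suffixes are ignored rather than guessed at.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(installed, required):
    """Return True if the installed version is >= the required version."""
    return version_tuple(installed) >= version_tuple(required)


REQUIRED = "8.6.0"  # version named in this answer; adjust if yours differs

try:
    import cupy
    installed = cupy.__version__
except ImportError:
    installed = None

if installed is None:
    print("cupy could not be imported in this runtime")
elif is_at_least(installed, REQUIRED):
    print(f"cupy {installed} is new enough")
else:
    print(f"cupy {installed} is older than {REQUIRED}; the old build is still being picked up")
```

If the check still reports 7.4.0, restarting the Colab runtime after the pip install is a reasonable next thing to try, since an already-imported module is not replaced in a running session.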

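I will update the script soon so that you won't have to do this manually, but I want to test a few more things first. Thanks again for letting us know!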

EDIT: the script has been updated.
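
For readers wondering why a version mismatch surfaces as this particular TypeError: the traceback shows cuml calling `cp.full(shape, value, dtype, order)` with four positional arguments, while the installed `full()` accepts at most three. The sketch below reproduces that pattern with two stand-in functions. `full_old` and `full_new` are hypothetical, modeled only on the error message, and are not CuPy's actual implementations.

```python
def full_old(shape, fill_value, dtype=None):
    """Stand-in for an older full() that has no `order` parameter."""
    return ("old", shape, fill_value, dtype)


def full_new(shape, fill_value, dtype=None, order="C"):
    """Stand-in for a newer full() that also accepts `order`."""
    return ("new", shape, fill_value, dtype, order)


def call_like_cuml(full_fn):
    """Pass four positional arguments, mirroring the call in the traceback."""
    try:
        return full_fn((2, 3), 0.0, "float32", "C")
    except TypeError as exc:
        return f"TypeError: {exc}"


old_result = call_like_cuml(full_old)
new_result = call_like_cuml(full_new)

print(old_result)  # the three-parameter stand-in rejects the fourth argument
print(new_result)  # the four-parameter stand-in accepts it
```

The same calling code works or fails depending only on which signature it finds at import time, which is consistent with the fix being a newer CuPy rather than a change to the cuML example.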