Will scikit-learn utilize GPU?

Reading about the k-means implementation in TensorFlow: http://learningtensorflow.com/lesson6/ and in scikit-learn: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html I'm struggling to decide which implementation to use.

scikit-learn is installed as part of the tensorflow docker container, so either implementation can be used.

Reason to use scikit-learn:

scikit-learn contains less boilerplate than the tensorflow implementation.

Reason to use tensorflow:

If running on an Nvidia GPU the algorithm will be run in parallel; I'm not sure whether scikit-learn will utilize all available GPUs?

Reading https://www.quora.com/What-are-the-main-differences-between-TensorFlow-and-SciKit-Learn

TensorFlow is more low-level; basically, the Lego bricks that help you to implement machine learning algorithms whereas scikit-learn offers you off-the-shelf algorithms, e.g., algorithms for classification such as SVMs, Random Forests, Logistic Regression, and many, many more. TensorFlow shines if you want to implement deep learning algorithms, since it allows you to take advantage of GPUs for more efficient training.

This statement reinforces my assertion that "scikit-learn contains less boilerplate than the tensorflow implementation", but it also suggests that scikit-learn will not utilize all available GPUs?

Tensorflow only uses the GPU if it is built against CUDA and cuDNN. By default it does not use the GPU, especially if it is running inside Docker, unless you use nvidia-docker and an image with built-in GPU support.

Scikit-learn is not intended to be used as a deep-learning framework and it does not provide any GPU support.
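
A quick sanity check for whether the TensorFlow build in the container actually sees a GPU (a minimal snippet, assuming TensorFlow 2.x):

import tensorflow as tf

# An empty list means this TensorFlow build was not compiled with CUDA support,
# or no GPU is exposed to the container (e.g. plain docker without nvidia-docker).
print(tf.config.list_physical_devices('GPU'))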

Why is there no support for deep or reinforcement learning / Will there be support for deep or reinforcement learning in scikit-learn?

Deep learning and reinforcement learning both require a rich vocabulary to define an architecture, with deep learning additionally requiring GPUs for efficient computing. However, neither of these fit within the design constraints of scikit-learn; as a result, deep learning and reinforcement learning are currently out of scope for what scikit-learn seeks to achieve.

Taken from http://scikit-learn.org/stable/faq.html#why-is-there-no-support-for-deep-or-reinforcement-learning-will-there-be-support-for-deep-or-reinforcement-learning-in-scikit-learn

Will you add GPU support in scikit-learn?

No, or at least not in the near future. The main reason is that GPU support will introduce many software dependencies and introduce platform specific issues. scikit-learn is designed to be easy to install on a wide variety of platforms. Outside of neural networks, GPUs don’t play a large role in machine learning today, and much larger gains in speed can often be achieved by a careful choice of algorithms.

Taken from http://scikit-learn.org/stable/faq.html#will-you-add-gpu-support

I'm experimenting with a drop-in solution (h2o4gpu) to take advantage of GPU acceleration, in particular for KMeans:

Try this:

from h2o4gpu.solvers import KMeans
#from sklearn.cluster import KMeans

As of now (version 0.3.2) there is still no .inertia_, but I think it's on their TODO list.
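
For reference, a minimal sketch of how the drop-in usage would look, assuming h2o4gpu's sklearn-compatible KMeans interface (n_clusters, fit, predict, cluster_centers_) and avoiding .inertia_ because of the caveat above:

import numpy as np
from h2o4gpu.solvers import KMeans  # drop-in for sklearn.cluster.KMeans

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

model = KMeans(n_clusters=2).fit(X)  # runs on the GPU when one is available
print(model.cluster_centers_)
print(model.predict(X))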

Edit: not tested yet, but scikit-cuda seems to be gaining traction.

Edit: RAPIDS is really the way to go here.
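
As a rough sketch only (assuming RAPIDS cuML is installed; cuml.cluster.KMeans follows the sklearn interface):

import numpy as np
from cuml.cluster import KMeans  # RAPIDS GPU implementation with an sklearn-like API

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

km = KMeans(n_clusters=2).fit(X)  # data is moved to the GPU and clustered there
print(km.cluster_centers_)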

From my experience, I use this package to utilize the GPU for some sklearn algorithms listed here.

The code I use:

import numpy as np
import dpctl  # provides the SYCL device management used for GPU offload
from sklearnex import patch_sklearn, config_context
patch_sklearn()  # must be called before importing the sklearn estimator

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
with config_context(target_offload="gpu:0"):  # run the fit on the first GPU
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)

Source: oneAPI and GPU support in Intel(R) Extension for Scikit-learn