How to pass the Docker CLI `--gpus` option in Kubernetes, or enable GPU support without installing `nvidia-docker2` (Docker 19.03)

Question

I am currently using Docker 19.03 and Kubernetes 1.13.5 with Rancher 2.2.4. Since 19.03, Docker officially supports NVIDIA GPUs natively through the `--gpus` option. Example (from the NVIDIA/nvidia-docker GitHub repository):

 docker run --gpus all nvidia/cuda nvidia-smi

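In Kubernetes, however, there is no way to pass Docker CLI options. So if I need to run a GPU instance, I have to install `nvidia-docker2`, which is inconvenient.

Is there a way to pass Docker CLI options, or to use the NVIDIA runtime, without installing `nvidia-docker2`?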

Answer 1

In Kubernetes, GPUs are scheduled via device plugins.

The official NVIDIA GPU device plugin has the following requirements:

- Kubernetes nodes have to be pre-installed with NVIDIA drivers.
- Kubernetes nodes have to be pre-installed with nvidia-docker 2.0.
- nvidia-container-runtime must be configured as the default runtime for Docker instead of runc.
- NVIDIA drivers ~= 361.93
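
The default-runtime requirement is usually met by editing Docker's daemon configuration (commonly `/etc/docker/daemon.json`). Here is a minimal Python sketch that adds the `nvidia` runtime to an existing config without discarding other settings. The runtime path `/usr/bin/nvidia-container-runtime` is an assumption based on the usual install location, and the helper name `with_nvidia_default_runtime` is made up for illustration. Check both against your node.

```python
import json

# Assumption: the usual install location of the runtime binary; verify on your node.
NVIDIA_RUNTIME_PATH = "/usr/bin/nvidia-container-runtime"


def with_nvidia_default_runtime(config: dict) -> dict:
    """Return a copy of a Docker daemon config with nvidia as the default runtime."""
    updated = dict(config)
    # Keep any runtimes that are already registered.
    runtimes = dict(updated.get("runtimes", {}))
    runtimes["nvidia"] = {"path": NVIDIA_RUNTIME_PATH, "runtimeArgs": []}
    updated["runtimes"] = runtimes
    updated["default-runtime"] = "nvidia"
    return updated


if __name__ == "__main__":
    # Starts from an empty config; on a real node, load the existing file first.
    print(json.dumps(with_nvidia_default_runtime({}), indent=2))
```

The printed JSON would go into the daemon config file, and Docker has to be restarted before the device plugin can see the GPUs.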

Once the nodes are set up, GPUs become just another resource in your spec, like cpu or memory.

spec:
  containers:
  - name: gpu-thing
    image: whatever
    resources:
      limits:
        nvidia.com/gpu: 1
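
To try the spec fragment end to end, it can be expanded into a complete Pod that runs `nvidia-smi`, mirroring the `docker run` command from the question. This is a rough Python sketch that prints the manifest as JSON, which `kubectl` accepts alongside YAML. The pod name `gpu-smoke-test` and the helper `gpu_pod_manifest` are made up for illustration, and the image `nvidia/cuda` is simply the one from the question.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a one-shot Pod manifest that requests GPUs via the device plugin resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Run once and stop; nvidia-smi exits after printing its report.
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "command": ["nvidia-smi"],
                    # GPUs are requested under limits, like in the fragment above.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical name; the image is the one used in the question.
    print(json.dumps(gpu_pod_manifest("gpu-smoke-test", "nvidia/cuda"), indent=2))
```

Saved under a file name of your choice, the output could be piped to `kubectl apply -f -`. If the node is set up correctly, the pod's logs should show the same table that `docker run --gpus all nvidia/cuda nvidia-smi` prints.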


docker

kubernetes

rancher

nvidia-docker