How to pass the Docker CLI `--gpus` option in Kubernetes, or enable GPU support, without installing `nvidia-docker2` (Docker 19.03)

I am currently using Docker 19.03 with Kubernetes 1.13.5 and Rancher 2.2.4. Since 19.03, Docker officially supports NVIDIA GPUs natively through the --gpus option. Example (from the NVIDIA/nvidia-docker GitHub repository):

 docker run --gpus all nvidia/cuda nvidia-smi

In Kubernetes, however, there is no way to pass Docker CLI options. So to run a GPU workload I have to install nvidia-docker2, which is inconvenient.

Is there any way to pass Docker CLI options, or to use the NVIDIA runtime, without installing nvidia-docker2?

GPUs are scheduled via device plugins in Kubernetes.

The official NVIDIA GPU device plugin has the following requirements:

  • Kubernetes nodes have to be pre-installed with NVIDIA drivers.
  • Kubernetes nodes have to be pre-installed with nvidia-docker 2.0.
  • nvidia-container-runtime must be configured as the default runtime for docker instead of runc.
  • NVIDIA drivers ~= 361.93

Once the nodes are set up, the GPU becomes just another resource in your spec, like cpu or memory:

spec:
  containers:
  - name: gpu-thing
    image: whatever
    resources:
      limits:
        nvidia.com/gpu: 1
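
For comparison with the `docker run --gpus all nvidia/cuda nvidia-smi` example in the question, here is a minimal sketch of a complete Pod manifest built around the fragment above. The Pod name `gpu-smoke-test` and container name `cuda` are made up for illustration, and the untagged `nvidia/cuda` image is carried over from the question; you will probably need to pin a tag that actually exists in your registry. It assumes the device plugin is already running on the node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test        # hypothetical name, pick your own
spec:
  restartPolicy: OnFailure    # run nvidia-smi once instead of restarting forever
  containers:
  - name: cuda                # hypothetical container name
    image: nvidia/cuda        # image from the question; pin a concrete tag in practice
    command: ["nvidia-smi"]   # same command as the docker run example
    resources:
      limits:
        nvidia.com/gpu: 1     # request one GPU through the device plugin
```

If the node is set up as described, `kubectl logs gpu-smoke-test` should show the usual `nvidia-smi` table. A Pod that stays in Pending usually means no node is advertising `nvidia.com/gpu`.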