CUDA Version mismatch in Docker with WSL2 backend

I am trying to use Docker (Docker Desktop for Windows 10 Pro) with the WSL2 backend (Windows Subsystem for Linux (WSL), Ubuntu 20.04.4 LTS).

That part seems to work fine, except that I want to pass my GPU (an Nvidia RTX A5000) through to my Docker containers.

Before I got that far, I was still working on the setup. I found a very good tutorial; it targets 18.04, but I found the steps for 20.04 are all the same, with just some of the version numbers changed.
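
For context, the tutorial's steps amount to installing the NVIDIA Container Toolkit inside the WSL Ubuntu instance. This is only a sketch from memory of the commonly documented commands (the repository URL and package name are my recollection, not copied from the tutorial), with the 20.04 distribution string in place of 18.04:

    # Inside the WSL Ubuntu 20.04 shell (sketch of the usual NVIDIA Container Toolkit setup)
    distribution=$(. /etc/os-release; echo $ID$VERSION_ID)   # yields "ubuntu20.04" instead of "ubuntu18.04"
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
        sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update
    sudo apt-get install -y nvidia-docker2
    # Then restart Docker Desktop on the Windows side so the new runtime is picked up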

Eventually I found that my CUDA versions don't match. You can see that here.
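
To illustrate what I mean by the mismatch: the "CUDA Version" that nvidia-smi prints is the highest CUDA version the installed driver supports, and it can be compared on the Windows side and inside WSL (a rough sketch of how I compared them; the example numbers are the ones discussed below):

    # On the Windows side (PowerShell)
    nvidia-smi        # header shows e.g. "Driver Version: 472.84    CUDA Version: 11.4"

    # Inside the WSL Ubuntu shell
    nvidia-smi        # reports the same driver, since WSL reuses the Windows driver
    nvcc --version    # the CUDA toolkit version installed inside WSL (if any), which can differ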

The real problem is when I try to run the test command as shown on the Docker website:

 docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

I get this error:

 --> docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380:
starting container process caused: process_linux.go:545: container init caused: Running
hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli:
requirement error: unsatisfied condition: cuda>=11.6, please update your driver to a
newer version, or use an earlier cuda container: unknown.

...and I just don't know what to do about that, or how I can fix it.

Can anyone explain how to get the GPU successfully passed through to a Docker container?

Regarding "please update your driver to a newer version": when using WSL, the driver in your WSL setup is not something you install in WSL; it is provided by the driver on the Windows side. Your WSL driver is 472.84, and that is too old to work with CUDA 11.6 (it only supports up to CUDA 11.4). So you would need to update your Windows-side driver to the latest one available for your GPU if you want to run a CUDA 11.6 test case. Regarding the "mismatch" of CUDA versions, this provides general background material for interpretation.
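
Put differently, there are two ways forward (a sketch; the exact image tag below is my own example, and any container built against CUDA 11.4 or older should behave the same way):

    # Option 1: install a newer Windows NVIDIA driver for the RTX A5000, then restart WSL
    # from PowerShell so the updated driver becomes visible inside it:
    wsl --shutdown

    # Option 2: keep driver 472.84 and use an earlier CUDA container, e.g. one based on CUDA 11.4:
    docker run --rm --gpus=all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi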

I downloaded the latest Nvidia driver:

Version:             R510 U3 (511.79)  WHQL
Release Date:        2022.2.14
Operating System:    Windows 10 64-bit, Windows 11
Language:            English (US)
File Size:           640.19 MB
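
Before re-running the test, one quick way to confirm the new driver is actually visible inside WSL (a verification sketch, not part of the original steps):

    # From PowerShell: restart WSL so it picks up the freshly installed Windows driver
    wsl --shutdown
    # Back inside the WSL Ubuntu shell, the header should now report driver 511.79 / CUDA 11.6
    nvidia-smi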

Now CUDA 11.6 is supported, and the test from the Docker documentation runs:

--> docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Ampere" with compute capability 8.6

> Compute 8.6 CUDA device: [NVIDIA RTX A5000]
65536 bodies, total time for 10 iterations: 58.655 ms
= 732.246 billion interactions per second
= 14644.916 single-precision GFLOP/s at 20 flops per interaction

Thanks for the quick reply!