PyTorch says that CUDA is not available
I'm trying to run PyTorch on my laptop. It's an older model, but it does have an Nvidia graphics card. I realize it is probably not sufficient for real machine learning, but I'm trying this so that I can learn the process of installing CUDA.
I have followed the steps in the installation guide for Ubuntu 18.04 (my specific distribution is Xubuntu).
My graphics card is a GeForce 845M, verified via lspci | grep nvidia:
01:00.0 3D controller: NVIDIA Corporation GM107M [GeForce 845M] (rev a2)
01:00.1 Audio device: NVIDIA Corporation Device 0fbc (rev a1)
I also have gcc 7.5 installed, verified via gcc --version:
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
And I have the correct headers installed, verified by trying to install them with sudo apt-get install linux-headers-$(uname -r):
Reading package lists... Done
Building dependency tree
Reading state information... Done
linux-headers-4.15.0-106-generic is already the newest version (4.15.0-106.107).
I then followed the installation instructions using the local .deb for version 10.1.
Now, when I run nvidia-smi, I get:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce 845M On | 00000000:01:00.0 Off | N/A |
| N/A 40C P0 N/A / N/A | 88MiB / 2004MiB | 1% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 982 G /usr/lib/xorg/Xorg 87MiB |
+-----------------------------------------------------------------------------+
When I run nvcc -V, I get:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
I then performed the post-installation instructions from section 6.1, and as a result echo $PATH looks like this:
/home/isaek/anaconda3/envs/stylegan2_pytorch/bin:/home/isaek/anaconda3/bin:/home/isaek/anaconda3/condabin:/usr/local/cuda-10.1/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
echo $LD_LIBRARY_PATH looks like this:
/usr/local/cuda-10.1/lib64
and my /etc/udev/rules.d/40-vm-hotadd.rules file looks like this:
# On Hyper-V and Xen Virtual Machines we want to add memory and cpus as soon as they appear
ATTR{[dmi/id]sys_vendor}=="Microsoft Corporation", ATTR{[dmi/id]product_name}=="Virtual Machine", GOTO="vm_hotadd_apply"
ATTR{[dmi/id]sys_vendor}=="Xen", GOTO="vm_hotadd_apply"
GOTO="vm_hotadd_end"
LABEL="vm_hotadd_apply"
# Memory hotadd request
# CPU hotadd request
SUBSYSTEM=="cpu", ACTION=="add", DEVPATH=="/devices/system/cpu/cpu[0-9]*", TEST=="online", ATTR{online}="1"
LABEL="vm_hotadd_end"
After doing all of this, I even compiled and ran the samples. ./deviceQuery returns:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce 845M"
CUDA Driver Version / Runtime Version 10.1 / 10.1
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 2004 MBytes (2101870592 bytes)
( 4) Multiprocessors, (128) CUDA Cores/MP: 512 CUDA Cores
GPU Max Clock rate: 863 MHz (0.86 GHz)
Memory Clock rate: 1001 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 1048576 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS
and ./bandwidthTest returns:
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: GeForce 845M
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 11.7
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 11.8
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 14.5
Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
But after all of this, this Python snippet (in a conda environment with all dependencies installed):
import torch
torch.cuda.is_available()
returns False
Does anybody have any idea how to resolve this? I've tried adding /usr/local/cuda-10.1/bin to etc/environment like this:
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
PATH=$PATH:/usr/local/cuda-10.1/bin
and restarting the terminal, but that didn't fix the problem. I really don't know what else to try.
EDIT - collect_env results for @kHarshit:
Collecting environment information...
PyTorch version: 1.5.0
Is debug build: No
CUDA used to build PyTorch: 10.2
OS: Ubuntu 18.04.4 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: Could not collect
Python version: 3.6
Is CUDA available: No
CUDA runtime version: 10.1.243
GPU models and configuration: GPU 0: GeForce 845M
Nvidia driver version: 418.87.00
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy==1.18.5
[pip] pytorch-ranger==0.1.1
[pip] stylegan2-pytorch==0.12.0
[pip] torch==1.5.0
[pip] torch-optimizer==0.0.1a12
[pip] torchvision==0.6.0
[pip] vector-quantize-pytorch==0.0.2
[conda] numpy 1.18.5 pypi_0 pypi
[conda] pytorch-ranger 0.1.1 pypi_0 pypi
[conda] stylegan2-pytorch 0.12.0 pypi_0 pypi
[conda] torch 1.5.0 pypi_0 pypi
[conda] torch-optimizer 0.0.1a12 pypi_0 pypi
[conda] torchvision 0.6.0 pypi_0 pypi
[conda] vector-quantize-pytorch 0.0.2 pypi_0 pypi
PyTorch doesn't use the system's CUDA library. When you install PyTorch from the precompiled binaries using pip or conda, it ships with a locally installed copy of the CUDA libraries for the specified version. In fact, you don't even need CUDA installed on your system at all in order to use PyTorch with CUDA support.
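You can see this directly from Python (a minimal sketch, assuming torch is importable in the active environment): the version printed here is the CUDA version PyTorch was built against and bundles, which does not have to match the system toolkit that nvcc -V reports.
import torch
print(torch.__version__)          # version of the PyTorch package itself
print(torch.version.cuda)         # CUDA version bundled with the binaries, or None for a CPU-only build
print(torch.cuda.is_available())  # whether PyTorch can actually initialize CUDA with the installed driver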
There are two scenarios which could have caused your issue (the snippet after this list shows one way to tell them apart):
1. You installed the CPU-only version of PyTorch. In this case PyTorch wasn't compiled with CUDA support, so it can't use CUDA.
2. You installed the CUDA 10.2 version of PyTorch. In this case the problem is that your graphics card is currently using the 418.87 driver, which only supports up to CUDA 10.1. The two potential fixes here are to either install an updated driver (version >= 440.33 according to Table 2) or to install a version of PyTorch compiled against CUDA 10.1.
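A small diagnostic (a sketch, assuming torch is installed in the environment in question) to distinguish the two scenarios:
import torch
if torch.version.cuda is None:
    # Scenario 1: a CPU-only build; a CUDA-enabled build needs to be installed.
    print("CPU-only PyTorch build")
else:
    # Scenario 2: a CUDA build whose version may be newer than the driver supports.
    print("PyTorch built against CUDA", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())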
To determine the appropriate command to use when installing PyTorch, you can use the handy widget in the "Install PyTorch" section at pytorch.org. Just select the appropriate operating system, package manager, and CUDA version, then run the recommended command.
In your case, one solution would be to use
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
which explicitly tells conda that you want to install the version of PyTorch compiled against CUDA 10.1.
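After reinstalling, a quick sanity check could look like this (a sketch; the expected values assume the CUDA 10.1 build was installed as above):
import torch
print(torch.version.cuda)             # expected: 10.1
print(torch.cuda.is_available())      # expected: True with the 418.87 driver
print(torch.cuda.get_device_name(0))  # expected: GeForce 845M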
For more information about PyTorch's CUDA compatibility with drivers and hardware, see .
EDIT: After you added the output of collect_env, we can see that the problem was that you had the CUDA 10.2 version of PyTorch installed. Based on that, an alternative solution would be to update the graphics driver, as described in item 2 and the linked answer.
In my case, simply restarting my machine made the GPU active again. The initial message I got was that the GPU was currently in use by another application, but when I looked at nvidia-smi I saw nothing. So, without changing any dependencies, it started working again.
TL;DR
- Install the NVIDIA toolkit provided by Canonical or by the NVIDIA third-party PPA.
- Reboot your workstation.
- Create a clean Python virtual environment (or reinstall all CUDA-dependent packages).
Description
First install the NVIDIA CUDA Toolkit provided by Canonical:
sudo apt install -y nvidia-cuda-toolkit
or follow the NVIDIA developers instructions:
# ENVARS ADDED **ONLY FOR READABILITY**
NVIDIA_CUDA_PPA=https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/
NVIDIA_CUDA_PREFERENCES=https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
NVIDIA_CUDA_PUBKEY=https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
# Add NVIDIA Developers 3rd-Party PPA
sudo wget ${NVIDIA_CUDA_PREFERENCES} -O /etc/apt/preferences.d/nvidia-cuda
sudo apt-key adv --fetch-keys ${NVIDIA_CUDA_PUBKEY}
echo "deb ${NVIDIA_CUDA_PPA} /" | sudo tee /etc/apt/sources.list.d/nvidia-cuda.list
# Install development tools
sudo apt update
sudo apt install -y cuda
Then reboot the OS so the kernel loads with the NVIDIA drivers, and create a clean conda environment:
conda create -n stack-overflow pytorch torchvision
conda activate stack-overflow
or reinstall pytorch and torchvision into an existing environment:
conda activate stack-overflow
conda install --force-reinstall pytorch torchvision
Otherwise the NVIDIA CUDA C/C++ bindings may not be detected correctly.
Finally, make sure CUDA is detected correctly:
(stack-overflow)$ python3 -c 'import torch; print(torch.cuda.is_available())'
True
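As an extra check beyond is_available() (a sketch, assuming the stack-overflow environment created above), you can run a trivial operation on the GPU, which will raise an error if the driver/runtime combination is still broken:
import torch
x = torch.rand(3, 3, device="cuda")   # allocate a small tensor on the GPU
print((x @ x).sum().item())           # run a trivial matmul on the device
print(torch.cuda.get_device_name(0))  # name of the detected GPU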
Versions
- NVIDIA CUDA Toolkit v11.6
- Ubuntu LTS 20.04.x
- Ubuntu LTS 22.04 (prior to its official release)