Tensorflow 2.2 GPU - which cuDNN library to install?
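I have successfully installed the CUDA driver, the cuDNN library, and TensorFlow. However, when I run a simple test program that imports tensorflow, I get an error. The error seems to indicate that I installed the wrong version of the cuDNN library. I would appreciate help with this. If I need to downgrade cuDNN, how do I do that?
TensorFlow version: 2.2 GPU
OS: Ubuntu 16.04.6 LTS (GNU/Linux 4.4.0-184-generic x86_64)
nvcc -V shows the following: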
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
nvidia-smi shows the following:
Fri Jun 12 17:16:38 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 980 Ti  Off  | 00000000:02:00.0 Off |                  N/A |
| 22%   27C    P8    17W / 250W |     74MiB /  6083MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1489      G   /usr/lib/xorg/Xorg                 71MiB |
+-----------------------------------------------------------------------------+
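cuDNN was installed successfully following the instructions at https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#download, but I think I installed the 11.0 version.
Error messages when the program tries to import tensorflow (Python 3.6):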
2020-06-12 17:21:38.131160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: GeForce GTX 980 Ti computeCapability: 5.2
coreClock: 1.228GHz coreCount: 22 deviceMemorySize: 5.94GiB deviceMemoryBandwidth: 313.37GiB/s
2020-06-12 17:21:38.131384: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-06-12 17:21:38.131498: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory
2020-06-12 17:21:38.133367: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-12 17:21:38.133807: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-06-12 17:21:38.137813: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-06-12 17:21:38.137958: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory
2020-06-12 17:21:38.138063: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2020-06-12 17:21:38.138085: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-06-12 17:21:38.138114: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-12 17:21:38.138131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-06-12 17:21:38.138152: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
According to the tested build configurations linked below, for TensorFlow 2.2 you need CUDA 10.1 and cuDNN 7.4:
https://www.tensorflow.org/install/source_windows#tested_build_configurations
CUDA archive/legacy releases: https://developer.nvidia.com/cuda-toolkit-archive
cuDNN archive (you have to create an NVIDIA account to access it): https://developer.nvidia.com/rdp/cudnn-archive
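The log in the question already names the library versions this TensorFlow build looks for. As a rough sketch (the `summarize` helper and the trimmed `LOG` sample are illustrative, not part of TensorFlow), the dso_loader lines can be split into missing and loaded libraries like this:

```python
import re

# dso_loader lines copied from the log in the question (timestamps trimmed).
LOG = """\
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
"""

MISSING = re.compile(r"Could not load dynamic library '([^']+)'")
OPENED = re.compile(r"Successfully opened dynamic library (\S+)")


def summarize(log_text):
    """Return (missing, opened) lists of shared-library names found in the log."""
    return MISSING.findall(log_text), OPENED.findall(log_text)


if __name__ == "__main__":
    missing, opened = summarize(LOG)
    print("missing:", missing)
    print("opened: ", opened)
```

The missing names (libcudart.so.10.1, libcudnn.so.7, and so on) point at the CUDA 10.1 runtime and a cuDNN 7.x build, not at the CUDA 11.0 that nvidia-smi reports as the driver's supported version.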
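Note that there is no cuDNN 7.4 build compatible with CUDA 10.1, so I would try 7.5.0. Installing cuDNN is just a matter of copying the downloaded files into the folder where CUDA is installed (each into its corresponding subfolder).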
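After copying the files, one way to check the result without starting TensorFlow is to try loading the same library names yourself. This is a minimal sketch, assuming the names listed in the log above are the ones your TensorFlow build needs; `check_libraries` is a hypothetical helper, and `ctypes.CDLL` uses the system's normal library search (for example `LD_LIBRARY_PATH` and the ldconfig cache on Linux), which should be close to, but is not guaranteed identical to, what TensorFlow does.

```python
import ctypes

# Library names taken from the dso_loader lines in the log above.
REQUIRED = [
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
    "libcudnn.so.7",
]


def check_libraries(names):
    """Try to load each shared library; map its name to True if loading worked."""
    status = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            status[name] = True
        except OSError:
            # Raised when the library cannot be found or opened.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_libraries(REQUIRED).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

If any name is still reported as missing, the directory holding that file is probably not on the loader's search path yet.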