Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory


sudo apt install nvidia-driver-470
sudo apt install cuda-drivers-470

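I decided to install them this way because they were being held back when I tried sudo apt upgrade. I then mistakenly ran sudo apt autoremove to clean up old packages. After rebooting the computer so the new drivers would be set up properly, I can no longer use GPU acceleration with tensorflow.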

>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2021-12-07 16:52:01.771391: I tensorflow/core/platform/] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-07 16:52:01.807283: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 16:52:01.807973: W tensorflow/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.808017: W tensorflow/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.808048: W tensorflow/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.856391: W tensorflow/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.856466: W tensorflow/stream_executor/platform/default/] Could not load dynamic library ''; dlerror: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.857601: W tensorflow/core/common_runtime/gpu/] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
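
To see which libraries the dynamic loader can actually open, independent of tensorflow, something like the following may help. It is only a sketch: the library names in CANDIDATES are hypothetical examples (the names in the log above are not shown), so substitute the ones from your own warnings.

```python
import ctypes


def probe_libraries(names):
    """Return {name: bool}, True when the dynamic loader can open the library."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # same lookup path dlopen uses (LD_LIBRARY_PATH, ldconfig cache)
            results[name] = True
        except OSError:
            results[name] = False
    return results


# Hypothetical sonames for illustration only; replace with the names from your own log.
CANDIDATES = ["libcudart.so.11.0", "libcublas.so.11"]

if __name__ == "__main__":
    for name, ok in probe_libraries(CANDIDATES).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Any name reported as MISSING is one the loader cannot resolve, which is the same condition tensorflow is warning about.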

You can create symlinks in the /usr/lib/x86_64-linux-gnu directory. I found it with:

$ whereis libcudart
libcudart: /usr/lib/x86_64-linux-gnu/ /usr/share/man/man7/libcudart.7.gz

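In this folder you will find other versions of these cuda libraries. Then create the symlinks like this. The specific versions you link to may differ slightly.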

$ sudo ln -s
$ sudo ln -s
$ sudo ln -s
$ sudo ln -s
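
As a rough illustration of what each ln -s call does, here is a sketch that creates the same kind of alias inside a temporary directory. The file names libexample.so.10 and libexample.so.11 and the helper link_library_alias are made up for the example; they are not the real library names, and on the real system directory you would need root.

```python
import os
import tempfile
from pathlib import Path


def link_library_alias(directory, existing_name, wanted_name):
    """Create wanted_name as a relative symlink to existing_name inside directory."""
    directory = Path(directory)
    target = directory / existing_name
    alias = directory / wanted_name
    if not target.exists():
        raise FileNotFoundError(target)
    if alias.exists() or alias.is_symlink():
        return alias  # never overwrite something that is already there
    alias.symlink_to(existing_name)  # relative link, like `ln -s existing wanted` run in that directory
    return alias


if __name__ == "__main__":
    # Demonstration in a throwaway directory with a hypothetical library name.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "libexample.so.10").touch()
        alias = link_library_alias(tmp, "libexample.so.10", "libexample.so.11")
        print(alias.name, "->", os.readlink(alias))
```

The helper refuses to replace an existing file, so a library that is already installed under the wanted name is left untouched.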

Your GPU should now be detected.

>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2021-12-07 17:07:26.914296: I tensorflow/core/platform/] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-07 17:07:26.950731: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.029687: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.030421: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.325218: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.325642: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.326022: I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.326408: I tensorflow/core/common_runtime/gpu/] Created device /device:GPU:0 with 9280 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:06:00.0, compute capability: 8.6
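
The deprecation warning in the output above recommends tf.config.list_physical_devices('GPU') instead of is_gpu_available. A minimal sketch of that check follows; the wrapper list_gpus is just a name chosen for this example, and it returns None when tensorflow is not importable.

```python
def list_gpus():
    """Return the GPUs tensorflow can see, or None if tensorflow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.config.list_physical_devices("GPU")


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("tensorflow is not installed in this environment")
    elif gpus:
        for gpu in gpus:
            print("detected:", gpu.name)
    else:
        print("no GPU detected; check the library warnings in the log")
```

An empty list means tensorflow loaded but registered no GPU devices, which matches the "Skipping registering GPU devices..." case.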

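This approach works because these cuda libraries are very similar to each other, and even NVIDIA often uses symlinks when building them. If tensorflow is looking for a particular file name, you can create a file with that name that simply points to another installed version of libcublas.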

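Have you installed cuda-toolkit? The error indicates that the version 11 libraries cannot be found. The problem may be that your cudatoolkit and cudnn versions are not compatible with your tensorflow version.

If you already have the correct version of the toolkit installed, go straight to step 5. (You can check the version with the command nvcc --version.)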

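  1. Download the installer (this version is compatible with the driver you installed, nvidia-470). The next steps are specific to the runfile option.

  2. Because you already have nvidia-drivers installed, press Continue if this message appears.

  3. Accept the terms.

  4. Again, because you already have the drivers installed, just deselect the driver option and press Install.

  5. Now you need to configure the paths to the binaries and libraries. Use the find command to search for them: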

    sudo find / -name 'nvcc'  # Path to binaries
    sudo find / -name '*'  # Path to libraries
  6. Finally, based on the paths you found above, add the next lines to the end of the file ~/.profile. Cuda is installed at /usr/local/cuda-11.4 on my system.

    if [ -d "/usr/local/cuda-11.4" ]; then

If updating ~/.profile does not work, try updating .bashrc, or .zshrc if you use zsh instead of bash.

  7. Restart your computer.
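
After the restart, one way to confirm that the new paths took effect is a small check like the one below. It is a sketch under assumptions: the bin and lib64 subdirectories are the usual CUDA toolkit layout rather than something confirmed above, and check_cuda_paths is a helper name invented for this example. Adjust cuda_home to whatever the find commands reported.

```python
import os
import shutil


def _entries(value):
    """Split a PATH-style string into a set of normalized directories."""
    return {os.path.normpath(p) for p in value.split(os.pathsep) if p}


def check_cuda_paths(cuda_home, path=None, ld_library_path=None):
    """Report whether the CUDA bin and lib64 directories are on the search paths."""
    if path is None:
        path = os.environ.get("PATH", "")
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    bin_dir = os.path.normpath(os.path.join(cuda_home, "bin"))    # assumed layout
    lib_dir = os.path.normpath(os.path.join(cuda_home, "lib64"))  # assumed layout
    return {
        "bin_on_path": bin_dir in _entries(path),
        "lib_on_ld_library_path": lib_dir in _entries(ld_library_path),
        "nvcc": shutil.which("nvcc", path=path),  # None when nvcc cannot be found
    }


if __name__ == "__main__":
    # /usr/local/cuda-11.4 is the location mentioned in step 6; yours may differ.
    for key, value in check_cuda_paths("/usr/local/cuda-11.4").items():
        print(f"{key}: {value}")
```

If nvcc comes back as None or either flag is False, the profile file was probably not read by your shell, which is the case where editing .bashrc or .zshrc is worth trying.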