Cannot convert a symbolic Tensor to a NumPy array (using an RTX 30xx GPU)

Question

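I have googled every error and tried many solutions, but I just cannot get TensorFlow to run an LSTM/GRU network for me. I used to be able to do this.

I installed it the prescribed way using Anaconda: conda create -n tf-gpu tensorflow-gpu, then installed jupyterlab, spyder, matplotlib, scikit-learn and pandas, and nothing else. There were no compatibility errors or warnings.

I start a notebook and try this: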

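from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, GRU, Dense, Dropout

def make_model(X_train, y_train):
    # Sequence-to-vector regressor: one GRU layer followed by a small dense head.
    model = Sequential()
    model.add(InputLayer(input_shape = (X_train.shape[1], X_train.shape[2])))
    model.add(GRU(units = 100))
    model.add(Dense(units = 100, activation = 'relu'))
    model.add(Dropout(0.2))
    model.add(Dense(units = y_train.shape[1]))
    # Keras expects metrics as a list.
    model.compile(loss = 'mse', optimizer = 'adam', metrics = ['mae'])
    return model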

But no matter what I do, I get this error:

NotImplementedError: Cannot convert a symbolic Tensor
(gru_1/strided_slice:0) to a numpy array. This error may indicate that
you're trying to pass a Tensor to a NumPy call, which is not supported

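Everything I can find about this error says it is a numpy version problem. I tried downgrading to 1.18.5 with pip, but that completely broke my environment. I am now trying with Python 3.9, even though Anaconda tells me it is incompatible. This wild goose chase is getting out of hand.

As far as I know, I am not trying to do anything special. This should work out of the box, and if it doesn't, what is the point of Anaconda? The thing is, I am reusing code and data that I am sure worked at some point (about 9 months ago).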

Answer 1

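I started over in a fresh environment, this time installing tensorflow-gpu with conda install tensorflow-gpu instead of downloading a complete environment. After downgrading numpy to 1.18.5 with conda install numpy=1.18.5, it seems to work! But now TensorFlow does not detect my GPU...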

>>> import tensorflow as tf
>>> print(tf.config.list_physical_devices())
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
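
A minimal sketch of the same check, narrowed to GPUs only. The helper name visible_gpus is hypothetical; it relies on tf.config.list_physical_devices accepting a device type, and it imports TensorFlow lazily so that it also runs (and reports that fact) in an environment where TensorFlow is missing.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf  # lazy import: the check still runs without TensorFlow
    except ImportError:
        return None
    # Filtering by device type turns a CPU-only setup into an empty list.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif not gpus:
    print("TensorFlow is installed but sees no GPU")
else:
    print("GPUs:", gpus)
```

An empty list here corresponds to the CPU-only output shown above.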

I followed this guide and concluded that conda had not installed cudnn or cudatoolkit. Running nvcc -V at the command prompt produced this output:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Wed_Jun__2_19:25:35_Pacific_Daylight_Time_2021
Cuda compilation tools, release 11.4, V11.4.48
Build cuda_11.4.r11.4/compiler.30033411_0

The guide says to run conda search cudnn and match the build listed there against the release reported by nvcc -V, which in my case is release 11.4. Of course, when I run conda search cudnn I get this:

# Name                       Version           Build  Channel
cudnn                          7.1.4       cuda8.0_0  pkgs/main
cudnn                          7.1.4       cuda9.0_0  pkgs/main
cudnn                          7.3.1      cuda10.0_0  pkgs/main
cudnn                          7.3.1       cuda9.0_0  pkgs/main
cudnn                          7.6.0      cuda10.0_0  pkgs/main
cudnn                          7.6.0      cuda10.1_0  pkgs/main
cudnn                          7.6.0       cuda9.0_0  pkgs/main
cudnn                          7.6.4      cuda10.0_0  pkgs/main
cudnn                          7.6.4      cuda10.1_0  pkgs/main
cudnn                          7.6.4       cuda9.0_0  pkgs/main
cudnn                          7.6.5      cuda10.0_0  pkgs/main
cudnn                          7.6.5      cuda10.1_0  pkgs/main
cudnn                          7.6.5      cuda10.2_0  pkgs/main
cudnn                          7.6.5       cuda9.0_0  pkgs/main
cudnn                          7.6.5       cuda9.2_0  pkgs/main
cudnn                          8.2.1      cuda11.3_0  pkgs/main

Having no other choice, I decided to install 8.2.1 for build cuda11.3_0 in a new environment and then install tensorflow-gpu. Unsurprisingly, this did not work.

>>> import tensorflow as tf
>>> print(tf.config.list_physical_devices())
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
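
The matching step the guide asks for (compare the nvcc release with the build strings from conda search cudnn) can be sketched as follows. This is only an illustration: the function names are hypothetical, and the parsing assumes the two output formats look exactly like the listings quoted above.

```python
import re


def nvcc_release(nvcc_output):
    """Extract the CUDA release (for example '11.4') from `nvcc -V` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cudnn_builds_for(conda_search_output, cuda_release):
    """Return (version, build) pairs whose build string targets the given CUDA release."""
    wanted = "cuda" + cuda_release + "_"
    matches = []
    for line in conda_search_output.splitlines():
        fields = line.split()
        # Data rows look like: cudnn 8.2.1 cuda11.3_0 pkgs/main
        if len(fields) >= 3 and fields[0] == "cudnn" and fields[2].startswith(wanted):
            matches.append((fields[1], fields[2]))
    return matches


nvcc_text = "Cuda compilation tools, release 11.4, V11.4.48"
search_text = """\
# Name                       Version           Build  Channel
cudnn                          7.6.5      cuda10.2_0  pkgs/main
cudnn                          8.2.1      cuda11.3_0  pkgs/main
"""

release = nvcc_release(nvcc_text)
print(release, cudnn_builds_for(search_text, release))  # no build for 11.4
print(cudnn_builds_for(search_text, "11.3"))            # the closest available build
```

With the listing above, release 11.4 has no matching cudnn build, which is why 8.2.1 for cuda11.3_0 was the only candidate.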

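So I downloaded the CUDA 11.3 driver from here, but when I run nvcc -V the output stays the same. I am considering running Display Driver Uninstaller (DDU) and trying again. But to get tensorflow-gpu working, I have to stay 2 versions behind the latest release!

My hardware: Ryzen 9 5950X, NVIDIA RTX 3060 Ti, 64 GB DDR4 RAM

I am writing this before actually trying DDU because I cannot access the physical machine right now. If it changes anything, I will post an update tomorrow.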

Answer 2

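I found a completely different solution to this problem. I don't think it will be good enough for many people, but since my goal for today was simple, I am going to take the win.

Steps to reproduce:

Use Python 3.7
Install CUDA 10.1
Restart the PC (don't skip this!)
Run conda install tensorflow-gpu=2.1 in a new environment
Then run pip install tensorflow-gpu==2.3

Congratulations: if you had the same (still unknown) problem I had, it should now be solved. Keep in mind that many other libraries (or their updates) that do not work with Python <3.8 are now off the table, and that the TensorFlow version you will be using is a year old.

Also, the tensorflow library (non-gpu) is still at version 2.1 in my environment. But before I break my environment again, I will stop here and leave that experiment to someone else.

edit: It turns out it only works from the command prompt, and it crashes without an error message. I tried some things from Spyder's IPython console (honestly, I have no idea how it works), without success.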

Answer 3

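Final answer:

Hardware:

Ryzen 9 5950X
64 GB DDR4 RAM
RTX 3060 Ti

I really wanted to work with Anaconda because I am very familiar with it and everything I do runs in Anaconda. On top of that, I had this working in Anaconda without problems last year, so it must be possible!

The problem: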

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import LSTM
from numpy.random import rand


X, y = rand(8000, 50, 5), rand(8000, 10)

model = keras.Sequential()
model.add(keras.Input(shape = (X.shape[1], X.shape[2])))

Everything is fine so far.

The next line:

model.add(LSTM(units = 100))

produces the following error:

NotImplementedError: Cannot convert a symbolic Tensor
(lstm_1/strided_slice:0) to a numpy array. This error may indicate that
you're trying to pass a Tensor to a NumPy call, which is not supported

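Cause / workaround: for a definitive answer I have to refer you to the TensorFlow developers, but I was able to deduce the following:

Another poster had exactly the same problem as I did, and it was solved by downgrading numpy from 1.20.x to 1.19.x. The discussion on that post is interesting: basically, TensorFlow versions >2.3.x are compiled against numpy 1.19.5. Anaconda installs numpy 1.20.x by default when you use conda install tensorflow-gpu, and the two do not play well together. The downgrade itself is an easy fix.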
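
As a rough illustration of that version check, the sketch below reads the installed numpy version and flags anything from 1.20 upward. The cut-off of 1.20 is taken from this answer rather than from any official compatibility table, and the helper names are hypothetical.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '1.20.3' into (1, 20, 3); parsing stops at the first non-numeric part."""
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def numpy_is_too_new(numpy_version, first_bad=(1, 20)):
    """True if the numpy version is at or above the first release reported to break."""
    return parse_version(numpy_version)[:2] >= first_bad


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("numpy")
if found is None:
    print("numpy is not installed")
elif numpy_is_too_new(found):
    print("numpy", found, "is 1.20 or newer; consider downgrading to 1.19.x")
else:
    print("numpy", found, "is older than 1.20")
```

Running this inside the conda environment before building a model shows whether the downgrade is needed.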

If you have an NVIDIA RTX 30xx GPU, you are not done yet!

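Long story short, the RTX 30xx cards use the Ampere architecture, which needs a newer version of CUDA, which in turn needs a newer version of TensorFlow: version 2.4.0 or later, to be precise. As of writing, this version is not available on conda.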
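
The dependency chain above can be written down as a small check. This is a sketch under two assumptions: that Ampere cards report compute capability 8.x (the RTX 3060 Ti in the log further down reports 8.6), and that TensorFlow 2.4 is the minimum, as stated in this answer. Both function names are hypothetical.

```python
def needs_cuda_11(compute_capability):
    """True for cards that report compute capability 8.x or higher (assumed: Ampere and newer)."""
    major = int(compute_capability.split(".")[0])
    return major >= 8


def tensorflow_new_enough(tf_version, minimum=(2, 4)):
    """True if the major.minor part of the TensorFlow version is at least `minimum`."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return (major, minor) >= minimum


card = "8.6"  # value reported for the RTX 3060 Ti
for candidate in ("2.1.0", "2.3.0", "2.4.0"):
    suitable = (not needs_cuda_11(card)) or tensorflow_new_enough(candidate)
    print(candidate, "suitable" if suitable else "too old for this card")
```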

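As a result, all the convenience conda offers by installing cuDNN and cudatoolkit automatically is no longer available. Simply running pip install tensorflow==2.4.0 will not work. Worst of all, it may appear to work for a while, until training suddenly stops after more than an hour with a completely random error. (Sorry, by that point I was ready to flip out and it was late, so I did not write down the errors. There were many, and none of them were solved.)

This guide details how to build cuDNN and CUDA. Before you follow the guide: go to Control Panel > Programs and Features and uninstall everything from NVIDIA that is not one of: NVIDIA graphics driver, NVIDIA GeForce Experience, NVIDIA HD audio driver, NVIDIA PhysX.

Another important note:

There is a serious typo in the step Building CUDA/cuDNN: Set 3. The guide tells you to copy files

from: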

# 1. cuDNN
\...\cudnn-11.0-windows-x64-v8.0.4.30.zip\cuda\bin

to:

# 2. NVIDIA GPU Computing Toolkit
\...\NVIDIA GPU Computing Toolkit\CUDA\v11.0\include

This is incorrect!!

It should be from:

# 1. cuDNN
\...\cudnn-11.0-windows-x64-v8.0.4.30.zip\cuda\bin

to:

# 2. NVIDIA GPU Computing Toolkit
\...\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin
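
To avoid repeating that typo by hand, the copy can be scripted so that each cuDNN folder only ever lands in the CUDA folder of the same name. This is a minimal sketch, not the guide's method: copy_cudnn_files is a hypothetical helper, only the bin folder is confirmed by this answer (include and lib are assumed to follow the same pattern), and the demo uses a temporary directory instead of the real paths.

```python
import shutil
import tempfile
from pathlib import Path


def copy_cudnn_files(cudnn_root, cuda_root, folders=("bin", "include", "lib")):
    """Copy each cuDNN subfolder into the CUDA subfolder with the same name."""
    copied = []
    for name in folders:
        source = Path(cudnn_root) / name
        if not source.is_dir():
            continue  # this archive does not contain that folder
        target = Path(cuda_root) / name
        # dirs_exist_ok (Python 3.8+) merges into an existing CUDA folder.
        shutil.copytree(source, target, dirs_exist_ok=True)
        copied.append(name)
    return copied


# Demo on a scratch directory with a placeholder DLL.
with tempfile.TemporaryDirectory() as scratch:
    cudnn = Path(scratch) / "cudnn" / "cuda"
    cuda = Path(scratch) / "CUDA" / "v11.0"
    (cudnn / "bin").mkdir(parents=True)
    (cudnn / "bin" / "cudnn64_8.dll").write_text("placeholder")
    (cuda / "bin").mkdir(parents=True)

    copied = copy_cudnn_files(cudnn, cuda)
    dll_in_bin = (cuda / "bin" / "cudnn64_8.dll").exists()
    dll_in_include = (cuda / "include" / "cudnn64_8.dll").exists()

print(copied, dll_in_bin, dll_in_include)
```

On a real machine the two roots would be the extracted cuDNN cuda folder and the CUDA v11.0 folder shown above, and the script would need to run with administrator rights.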

After following the guide, I restarted my PC (don't skip this) and created a new environment with Python 3.8.11:

conda create -n tf python=3.8

I installed tensorflow 2.4.0 with pip, directly from the command prompt inside my new tf environment:

pip install tensorflow==2.4.0

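This also installs TensorFlow's GPU functionality, whereas the Anaconda version installs only the CPU build when you call conda install tensorflow. Of course, it still does not work, because you now have numpy 1.20.3 installed (you can check with conda list numpy). Simply downgrade it with conda install numpy=1.19. On top of that, on my system, the example provided in the guide: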

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

train_images, test_images = train_images / 255.0, test_images / 255.0

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

model.compile(optimizer='Adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

history = model.fit(train_images, train_labels, batch_size=10, epochs=100)

throws an error (at least for me):

NotFoundError:  No algorithm worked!
     [[node sequential/conv2d/Relu (defined at <ipython-input-1-bf665ec77ee4>:18) ]] [Op:__inference_train_function_580]

However, we are not interested in this example. We want to run an LSTM / GRU, not fix this example. So we will discard it and move on. Now we try:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import LSTM, Dense
from numpy.random import rand


X, y = rand(8000, 50, 5), rand(8000, 10)

model = keras.Sequential()
model.add(keras.Input(shape = (X.shape[1], X.shape[2])))           

model.add(LSTM(units = 100))
model.add(Dense(units = 10))

Lo and behold, no error!

model.compile(loss = 'mse', optimizer = 'adam')

Still no error!

history = model.fit(X, y, epochs = 10)

Still no error! But is it even using the GPU? The messages in the console do seem to suggest so:

2021-08-19 13:04:09.234795: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
Default GPU Device: /device:GPU:0
training model

2021-08-19 13:04:09.234795: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-08-19 13:04:10.645028: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-19 13:04:10.647857: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-08-19 13:04:10.662783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:0a:00.0 name: NVIDIA GeForce RTX 3060 Ti computeCapability: 8.6
coreClock: 1.755GHz coreCount: 38 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2021-08-19 13:04:10.662799: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-08-19 13:04:10.667119: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-08-19 13:04:10.667133: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-08-19 13:04:10.669347: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-08-19 13:04:10.670066: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-08-19 13:04:10.675548: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-08-19 13:04:10.677202: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-08-19 13:04:10.677612: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-08-19 13:04:10.677658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-08-19 13:04:10.979738: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-19 13:04:10.979763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2021-08-19 13:04:10.979770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2021-08-19 13:04:10.979886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/device:GPU:0 with 6617 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060 Ti, pci bus id: 0000:0a:00.0, compute capability: 8.6)
2021-08-19 13:04:10.980387: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-08-19 13:04:10.980542: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:0a:00.0 name: NVIDIA GeForce RTX 3060 Ti computeCapability: 8.6
coreClock: 1.755GHz coreCount: 38 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2021-08-19 13:04:10.980555: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-08-19 13:04:10.980563: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-08-19 13:04:10.980569: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-08-19 13:04:10.980575: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-08-19 13:04:10.980580: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-08-19 13:04:10.980586: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-08-19 13:04:10.980592: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-08-19 13:04:10.980646: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-08-19 13:04:10.980676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-08-19 13:04:10.980693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-19 13:04:10.980698: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2021-08-19 13:04:10.980703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2021-08-19 13:04:10.980744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/device:GPU:0 with 6617 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060 Ti, pci bus id: 0000:0a:00.0, compute capability: 8.6)
2021-08-19 13:04:10.980757: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-08-19 13:04:10.984016: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-08-19 13:04:10.984082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:0a:00.0 name: NVIDIA GeForce RTX 3060 Ti computeCapability: 8.6
coreClock: 1.755GHz coreCount: 38 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2021-08-19 13:04:10.984094: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-08-19 13:04:10.984100: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-08-19 13:04:10.984106: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-08-19 13:04:10.984112: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-08-19 13:04:10.984117: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-08-19 13:04:10.984122: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-08-19 13:04:10.984127: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-08-19 13:04:10.984132: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-08-19 13:04:10.984158: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-08-19 13:04:10.984332: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:0a:00.0 name: NVIDIA GeForce RTX 3060 Ti computeCapability: 8.6
coreClock: 1.755GHz coreCount: 38 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2021-08-19 13:04:10.984344: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-08-19 13:04:10.984350: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-08-19 13:04:10.984355: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-08-19 13:04:10.984360: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-08-19 13:04:10.984365: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-08-19 13:04:10.984369: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-08-19 13:04:10.984374: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-08-19 13:04:10.984420: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-08-19 13:04:10.984445: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-08-19 13:04:10.984470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-19 13:04:10.984475: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2021-08-19 13:04:10.984479: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2021-08-19 13:04:10.984533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6617 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060 Ti, pci bus id: 0000:0a:00.0, compute capability: 8.6)
2021-08-19 13:04:10.984546: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-08-19 13:04:11.334311: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
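
A log this long is easier to read when condensed. The sketch below pulls out the two things that matter here: which CUDA libraries were opened and which physical GPU TensorFlow created a device for. The function name is hypothetical, and the regular expressions assume the exact message wording shown in the log above.

```python
import re


def summarise_tf_log(log_text):
    """Collect the libraries TensorFlow loaded and the GPU devices it created."""
    libraries = sorted(set(
        re.findall(r"Successfully opened dynamic library (\S+)", log_text)))
    gpus = sorted(set(
        re.findall(r"physical GPU \(device: \d+, name: ([^,]+),", log_text)))
    return {"libraries": libraries, "gpus": gpus}


# Two lines copied from the log above.
sample = """\
2021-08-19 13:04:10.677612: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-08-19 13:04:10.979886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/device:GPU:0 with 6617 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060 Ti, pci bus id: 0000:0a:00.0, compute capability: 8.6)
"""

summary = summarise_tf_log(sample)
print(summary)
```

A non-empty gpus list together with cudnn64_8.dll among the libraries is the pattern to look for.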

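Looking at Task Manager, I can see that the memory is fully allocated and that 3D graphics shows 99% utilization! The required training time was cut by a quarter compared with using the CPU. All in all, a great success!

I really hope that running a Conv2D network of my own design will not produce the same error as the example, but only time will tell. For now, this is good enough for my purposes.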
