Tensorflow crashes when asked to fit a model

Question

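I'm new to TensorFlow on GPU, so a first naive question: am I correct in assuming that I can use a GPU (NVIDIA GTX 1660 Ti) to run TensorFlow ML operations while it simultaneously drives my monitor? I have only one GPU card in my PC. Can it do both at the same time, or do I need a dedicated GPU for TensorFlow only, with no monitor attached?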

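This is all on Ubuntu 21.10, with nvidia-toolkit, cudnn, tensorflow, and tensorflow-gpu set up in a conda env. Everything seems to work fine: 1 GPU visible, built with cudnn 11.6.r11.6, TF version 2.8.0, Python version 3.7.10, all in a conda env running in a Jupyter notebook. Everything seems to run fine until I attempt to train a model, and then I get this message: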

2022-03-19 04:42:48.005029: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8302

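Then the kernel just locks up and crashes. By the way, the code worked before I installed the GPU, when it simply used the CPU. Is this just a version mismatch somewhere between Python, tensorflow, tensorflow-gpu, and cudnn, or something more sinister? Thanks. J.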

Answer 1

am I correct in assuming that I can use a GPU (nv gtx 1660ti) to run tensorflow ml operations, while it simultaneously runs my monitor?

Yes, you can. On Ubuntu, use nvidia-smi to check how much memory is available and which processes are using the GPU.

Only have one GPU card in my pc, assume it can do both at the same time?

Yes, it can. Most people do this. A training process on the GPU is similar to running a game (but it needs more memory).

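About the problem:

Install based on this version table.

Check your driver version with nvidia-smi. For the actual CUDA version, however, check nvcc -V (the CUDA version shown by nvidia-smi is actually the maximum CUDA version the driver supports).

Just install with pip install tensorflow-gpu; this will also install keras for you.

Check whether TensorFlow can access the GPU as follows: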

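import tensorflow as tf

# Deprecated in TF 2.x but still present in 2.8; should return True
tf.test.is_gpu_available()
# Preferred check: lists the GPUs TensorFlow can see
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))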
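The nvidia-smi and nvcc -V checks mentioned above can also be run from the notebook itself, which helps confirm that the conda env sees the same tools as your shell. This is only a minimal sketch: `run_tool` is a hypothetical helper name, and it simply returns None when a tool is not on the PATH.

```python
import shutil
import subprocess


def run_tool(cmd):
    """Run a command and return its stdout, or None if the tool is missing or fails.

    `run_tool` is a hypothetical helper for this example, not part of any library.
    """
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


# Driver version and memory use, as reported by the driver.
print(run_tool(["nvidia-smi",
                "--query-gpu=driver_version,memory.used,memory.total",
                "--format=csv,noheader"]))

# Version of the CUDA toolkit actually installed (not the driver's maximum).
print(run_tool(["nvcc", "-V"]))
```

If either call prints None inside Jupyter but works in a terminal, the notebook kernel is probably running in a different environment than the one you set up.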

Answer 2

install based on this version table.

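This was key for me. I had the same issue: CPU worked fine, but the GPU run would dump during model fit with an exit code and no error. The matrix will show you that tensorflow 2.5 - 2.8 work with CUDA 11.2 and cudnn 8.1, while the 'latest' versions are 11.5 and 8.4 as of 05/2022. I rolled back both versions and everything works.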
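To compare an installation against that row of the matrix, it helps to decode the number in the log line from the question ("Loaded cuDNN version 8302"). The sketch below assumes the cuDNN 8.x encoding of major*1000 + minor*100 + patch. The function names and the two constants are hypothetical, and they hold only the one row quoted above rather than the full table.

```python
# Single row quoted in this answer: tensorflow 2.5 - 2.8 with CUDA 11.2, cudnn 8.1.
TESTED_CUDA = (11, 2)
TESTED_CUDNN = (8, 1)


def decode_cudnn_version(raw):
    """Split the integer cuDNN 8.x reports (e.g. 8302) into (major, minor, patch)."""
    return raw // 1000, (raw % 1000) // 100, raw % 100


def matches_tested_build(cuda, cudnn_raw):
    """Return True if CUDA (major, minor) and cuDNN (major, minor) match the row above."""
    major, minor, _patch = decode_cudnn_version(cudnn_raw)
    return tuple(cuda[:2]) == TESTED_CUDA and (major, minor) == TESTED_CUDNN


print(decode_cudnn_version(8302))            # (8, 3, 2)
print(matches_tested_build((11, 6), 8302))   # False: the setup from the question
print(matches_tested_build((11, 2), 8100))   # True: the combination from the matrix
```

A mismatch does not prove that the versions caused the crash, but it is a cheap first thing to rule out before looking further.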

Answer 3

The matrix will show you that tensorflow 2.5 - 2.8 work with CUDA 11.2 and cudnn 8.1

I think the problem is that CUDA 11.2 is not available for Windows 11.

python

conda

jupyter

tensorflow

cudnn