Keras, TensorFlow, cuDNN fails to initialize
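I have a very powerful Windows PC (running Windows 10) with 112 GB of memory, 16 cores, and 3 x GeForce RTX 2070 (no SLI support, etc.). It is running cuDNN 7.5 + TensorFlow 1.13 + Python 3.7.
My problem is that I get the error below whenever I try to run a Keras model for training or to make a prediction on a matrix. At first I thought it only happened when I ran more than one program at the same time, but that is not the case: I now also get the error when running only a single instance of Keras (often, but not always).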
2019-06-15 19:33:17.878911: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 6317 MB memory) -> physical GPU (device: 2, name: GeForce RTX 2070, pci bus id: 0000:44:00.0, compute capability: 7.5)
2019-06-15 19:33:23.423911: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library cublas64_100.dll locally
2019-06-15 19:33:23.744678: E tensorflow/stream_executor/cuda/cuda_blas.cc:510] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2019-06-15 19:33:23.748069: E tensorflow/stream_executor/cuda/cuda_blas.cc:510] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2019-06-15 19:33:23.751235: E tensorflow/stream_executor/cuda/cuda_blas.cc:510] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2019-06-15 19:33:25.267137: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2019-06-15 19:33:25.270582: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Exception: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv2d_1/convolution}}]]
[[{{node dense_3/Sigmoid}}]]
Add the following to your code (TensorFlow 1.x):
from keras.backend.tensorflow_backend import set_session
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True # dynamically grow the memory used on the GPU
config.log_device_placement = True # to log device placement (on which device the operation ran)
sess = tf.Session(config=config)
set_session(sess) # set this TensorFlow session as the default session for Keras
In TensorFlow 2.0 and above, this can be resolved with:
import os
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'  # set this before TensorFlow initializes the GPUs
or
import tensorflow as tf
physical_devices = tf.config.experimental.list_physical_devices('GPU')
# Enable memory growth on every GPU; the setting must be the same on all of them
for device in physical_devices:
    tf.config.experimental.set_memory_growth(device, True)
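Both TensorFlow 2 options only help if they take effect before TensorFlow initializes the GPUs. The sketch below combines them in one helper that can be called at the very top of a script. It assumes TensorFlow 2.x; the function name `enable_gpu_memory_growth` is hypothetical, and the helper simply returns 0 when TensorFlow is not installed or no GPU is visible.

```python
import os


def enable_gpu_memory_growth():
    """Request on-demand GPU memory allocation; return how many GPUs were configured."""
    # The environment variable must be set before TensorFlow initializes its GPU devices.
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    try:
        import tensorflow as tf
    except ImportError:
        return 0  # TensorFlow is not installed, so there is nothing to configure

    gpus = tf.config.experimental.list_physical_devices("GPU")
    configured = 0
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Raised when the GPUs were already initialized earlier in the process.
            print(f"Could not enable memory growth on {gpu.name}: {err}")
    return configured


if __name__ == "__main__":
    print("GPUs configured:", enable_gpu_memory_growth())
```

On the machine described in the question, the call would be expected to configure all three cards, provided it runs before any model or tensor is created.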