Non-OK-status: GpuLaunchKernel(...) status: Internal: no kernel image is available for execution on the device

I run my code on TensorFlow 2.1.0 in Anaconda with CUDA Toolkit 10.1 and cuDNN 7.6.0 (Windows 10), and it fails with this error:

F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device

My GPU: GeForce 940MX, compute capability 5.0

I have run nvcc -V and it returns:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105
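
For anyone comparing toolkit versions across machines, the release number can be pulled out of that output with a short script. This is only a sketch: the function name `parse_nvcc_release` is made up for illustration, and the sample text is the output pasted above.

```python
import re


def parse_nvcc_release(output):
    """Return the CUDA release string (e.g. '10.1') from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


# Sample text copied from the nvcc -V output in the question.
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105"""

print(parse_nvcc_release(sample))  # prints 10.1
```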

Here is the full output:

2020-08-05 10:05:48.368012: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:00.488544: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-08-05 10:06:48.153611: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 0.8605GHz coreCount: 4 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2020-08-05 10:06:48.164731: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:48.245826: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-05 10:06:48.296245: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-05 10:06:48.338860: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-05 10:06:48.439393: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-05 10:06:48.489830: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-05 10:06:48.941872: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-05 10:06:48.946651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-05 10:06:48.951881: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-05 10:06:48.979077: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x23d29b660d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-05 10:06:48.985680: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-08-05 10:06:48.990616: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 0.8605GHz coreCount: 4 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2020-08-05 10:06:49.003356: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:49.009869: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-05 10:06:49.014858: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-05 10:06:49.020699: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-05 10:06:49.028876: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-05 10:06:49.033607: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-05 10:06:49.039192: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-05 10:06:49.045288: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-05 10:06:49.218497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-05 10:06:49.223536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0
2020-08-05 10:06:49.226857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N
2020-08-05 10:06:49.230413: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1460 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0)
2020-08-05 10:06:49.244107: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x23d301b8fa0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-08-05 10:06:49.250377: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce 940MX, Compute Capability 5.0
2020-08-05 10:06:49.446601: F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device
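
To make the relevant detail easier to spot in a log like this, the device name and compute capability can be extracted from the "Found device" properties line. A minimal sketch, assuming the log format shown above; `parse_device_line` is a hypothetical helper, not part of TensorFlow.

```python
import re


def parse_device_line(line):
    """Return (device_name, compute_capability) from a TensorFlow device
    properties log line, or None if the line does not match."""
    match = re.search(r"name:\s*(.+?)\s+computeCapability:\s*(\d+\.\d+)", line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Line copied from the log in the question.
line = "pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0"
print(parse_device_line(line))  # prints ('GeForce 940MX', '5.0')
```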

What is the problem and how can I solve it?

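According to the screenshot below, TensorFlow versions 2.1, 2.2, and 2.3 use cuDNN version 7.4, whereas the cuDNN version installed on your machine is 7.6.

This is most likely the cause of the error.

The solution is to downgrade cuDNN.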

cuDNN Version

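The existing version of cuDNN can be uninstalled through the Windows Control Panel, using the Programs and Features widget.

A new version of cuDNN can be installed as shown in the NVIDIA Installation Guide.

Also, refer to this GitHub issue for more information on how to downgrade the cuDNN version.

The screenshot above is taken from this TensorFlow Documentation.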
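
Before and after swapping cuDNN, it can help to confirm which version is actually on disk. cuDNN 7.x records its version in `cudnn.h` through the `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` macros (cuDNN 8 moved them to `cudnn_version.h`). The sketch below parses those macros from header text; the function name is hypothetical, and the header location is an assumption that depends on where cuDNN was copied (often the `include` folder of the CUDA toolkit directory).

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the CUDNN_* macros in a cuDNN
    header, or None if any of the three macros is missing."""
    values = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + macro + r"\s+(\d+)", header_text)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


# Stand-in for the contents of cudnn.h; on a real system, read the file
# from wherever cuDNN was installed (the path is an assumption).
sample_header = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 0
"""

print(parse_cudnn_version(sample_header))  # prints (7, 6, 0)
```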

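It looks like this is a problem with Python 3.8 and TensorFlow 2.3. I tried TensorFlow 2.3.0 with Python 3.7, but it returned an error related to python38.dll (I don't remember the exact error, and I have already deleted the env). In any case, I used Python 3.7 in an Anaconda env, installed TensorFlow 2.1.0 with pip, and it works.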
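
Since the interpreter and TensorFlow versions matter here, a quick check of what the active env really uses can rule out a mixed-up environment. This is just a sketch; `describe_environment` is a name I made up, and it reports `None` for TensorFlow if the import fails.

```python
import platform
import sys


def describe_environment():
    """Report the interpreter version, its path, and the TensorFlow version
    (None if TensorFlow cannot be imported in this environment)."""
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    try:
        import tensorflow as tf
        info["tensorflow"] = tf.__version__
    except ImportError:
        info["tensorflow"] = None
    return info


for key, value in describe_environment().items():
    print(key, "=", value)
```

Running it inside the activated Anaconda env shows whether the interpreter path points at that env and not at another Python installation.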

I also posted this problem on GitHub, and it was answered there: https://github.com/tensorflow/tensorflow/issues/42052

I had the same problem, and my cuDNN was 8.0.2. As you said, there is no cuDNN 7.4 for CUDA 10.1. So I tried cuDNN 7.5 for CUDA 10.1 and it worked! I hope my experience helps others. :)
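
One hint about which cuDNN major version a TensorFlow build looks for is the DLL name in its startup log (`cudnn64_7.dll` in the question's log). The sketch below compares that with an installed version string. Both helper names are hypothetical, and it only checks the major number, so it cannot tell whether a particular 7.x release is suitable.

```python
import re


def cudnn_major_from_dll(dll_name):
    """Return the major version encoded in a name like 'cudnn64_7.dll', or None."""
    match = re.fullmatch(r"cudnn64_(\d+)\.dll", dll_name)
    return int(match.group(1)) if match else None


def major_matches(dll_name, installed_version):
    """True if the installed cuDNN major version equals the one in the DLL name."""
    expected = cudnn_major_from_dll(dll_name)
    installed_major = int(installed_version.split(".")[0])
    return expected is not None and expected == installed_major


print(major_matches("cudnn64_7.dll", "8.0.2"))  # prints False
print(major_matches("cudnn64_7.dll", "7.5"))    # prints True
```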

It seems that certain cuDNN versions are only supported by specific versions of TensorFlow.

As a Windows user, this is what I did:

  1. Check which TensorFlow and CUDA version combinations are compatible (you can select another OS on the left).
  2. As Rock Jefferson commented, you can use cuDNN 7.5 with CUDA 10.1. It worked for me. Download here

Give it a try. I hope it works for you.