CUDA failed to launch kernel: no kernel image available for execution

I am trying to run CUDA on a fairly old GPU. I tried the CUDA Samples vectorAdd, and it gave me the following error:

Failed to launch vectorAdd kernel (error code no kernel image is available for execution on the device)!

These are the outputs of:

  1. deviceQuery:
CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 580"
  CUDA Driver Version / Runtime Version          9.1 / 9.0
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 1467 MBytes (1538392064 bytes)
MapSMtoCores for SM 2.0 is undefined.  Default to use 64 Cores/SM
MapSMtoCores for SM 2.0 is undefined.  Default to use 64 Cores/SM
  (16) Multiprocessors, ( 64) CUDA Cores/MP:     1024 CUDA Cores
  GPU Max Clock rate:                            1630 MHz (1.63 GHz)
  Memory Clock rate:                             2050 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 786432 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 3 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.0, NumDevs = 1
Result = PASS
  2. nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.147                Driver Version: 390.147                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 580     Off  | 00000000:03:00.0 N/A |                  N/A |
| 42%   48C   P12    N/A /  N/A |    257MiB /  1467MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+
  3. nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

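Now, according to the CUDA compatibility PDF https://docs.nvidia.com/pdf/CUDA_Compatibility.pdf, I assume I have binary compatibility between CUDA 9.0.176 and the GPU driver. For compute capability support, the table does not list the 390 driver. Is it even possible to program CUDA on this GPU, or should I buy a newer one? If it is possible, what combination of driver and CUDA toolkit versions do I need?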

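The GPU you are using is a Fermi class (compute capability 2.0) device. Support for Fermi was officially removed from the CUDA toolkit when CUDA 9.0 was released in September 2017, so the vectorAdd binary built with CUDA 9.0 contains no kernel image that this device can run. The last release of the CUDA toolkit with Fermi support was CUDA 8.0. If you want to use that GPU with CUDA, you must use that release (or something older).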
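As a rough illustration of the reasoning above, the check can be written as a small lookup. This is only a sketch: the table `MIN_COMPUTE_CAPABILITY` and the helper `toolkit_supports` are hypothetical names made up for this example and are not part of any NVIDIA tool. The table holds just the two toolkit releases discussed here.

```python
# Oldest compute capability each CUDA toolkit release can still target.
# Illustrative table only: it lists just the two releases discussed above.
MIN_COMPUTE_CAPABILITY = {
    (8, 0): (2, 0),  # last toolkit release with Fermi (sm_2x) support
    (9, 0): (3, 0),  # Fermi removed; Kepler (sm_30) is the oldest target
}


def toolkit_supports(toolkit, capability):
    """Return True if `toolkit` (major, minor) can build for `capability`."""
    if toolkit not in MIN_COMPUTE_CAPABILITY:
        raise KeyError(f"toolkit {toolkit} is not in this illustrative table")
    # Tuples compare element by element, so (2, 0) < (3, 0).
    return capability >= MIN_COMPUTE_CAPABILITY[toolkit]


# "CUDA Capability Major/Minor version number" reported by deviceQuery.
gtx_580 = (2, 0)

print(toolkit_supports((9, 0), gtx_580))  # False: no kernel image can be built
print(toolkit_supports((8, 0), gtx_580))  # True
```

With CUDA 8.0 installed, the sample would be built for this device with something like `nvcc -arch=sm_20`. That release accepts the flag but prints a deprecation warning.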

[Answer assembled from comments and added as a community wiki entry to get this question off the unanswered list for the CUDA tag]