How do I get the current compute capability of a GPU from the host portion of the code?

Question

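I tried using __CUDA_ARCH__, but I read somewhere that it only works in the device portion of the code. After that, I came across this code on GitHub: link

Is there a better way to achieve this?

I am asking because I want to determine (in host code) whether the GPU supports unified memory. If it does, cudaMallocManaged would be used; otherwise the code would fall back to cudaMalloc and cudaMemcpy calls.

An example of what I would like to do: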

int main() {
  // IF CUDA >= 6.0 && COMPUTE CAPABILITY >= 3.0
      // USE cudaMallocManaged
  // ELSE
      // USE cudaMallocs && cudaMemcpys
  // END IF
  return 0;
}

Answer 1

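There seem to be two questions involved here:

How can I query (at compile time) the CUDA runtime API version that a particular piece of code is being compiled against, so that I can determine whether it is safe to use certain runtime API elements (such as those associated with managed memory) that may only appear in newer runtime API versions?

One approach has already been discussed. As a condensed version for this particular case, you could do something like the following: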
```
#include <cuda_runtime_api.h>
...
// test for CUDA version of 6.0 or higher
#if CUDART_VERSION >= 6000 
// safe to use e.g. cudaMallocManaged() here
#else
// e.g. do not use managed memory API here
#endif
```
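How can I determine at run time whether I can use managed memory?

As already mentioned in the comments, once you have established that the CUDA version being compiled against is CUDA 6.0 or higher (e.g. see above), you should still test for managed memory support before attempting to use, for example, cudaMallocManaged. The deviceQuery CUDA sample code shows a general methodology for testing capabilities at run time (for example, calling cudaGetDeviceProperties and testing the managedMemSupported property; recent toolkit headers name this field managedMemory). The same cudaDeviceProp structure also reports the compute capability to host code through its major and minor fields.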
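To make the two checks concrete, here is a minimal host-side sketch that combines the compile-time guard with the run-time query. It is not taken from deviceQuery. The helper names shouldUseManaged and allocateForDevice are hypothetical, and the code assumes the cudaDeviceProp field is called managedMemory, as in current toolkit headers (older headers may use a different name).

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime_api.h>

// Hypothetical helper: true when the toolkit compiled against is CUDA 6.0+
// and the device properties report managed-memory support.
// Assumption: the field is named managedMemory (current toolkit headers).
bool shouldUseManaged(const cudaDeviceProp &prop) {
#if CUDART_VERSION >= 6000
    return prop.managedMemory != 0;
#else
    (void)prop;  // managed memory API is not available in this toolkit
    return false;
#endif
}

// Hypothetical helper: allocate `bytes` for use on `device`, preferring
// managed memory. *usedManaged tells the caller which path was selected,
// i.e. whether explicit cudaMemcpy calls are still required.
cudaError_t allocateForDevice(int device, size_t bytes,
                              void **ptr, bool *usedManaged) {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) return err;  // e.g. no device, invalid ordinal

    // Compute capability as seen from host code.
    std::printf("device %d: compute capability %d.%d\n",
                device, prop.major, prop.minor);

    err = cudaSetDevice(device);
    if (err != cudaSuccess) return err;

    *usedManaged = shouldUseManaged(prop);
#if CUDART_VERSION >= 6000
    if (*usedManaged) {
        // Single pointer usable from both host and device.
        return cudaMallocManaged(ptr, bytes, cudaMemAttachGlobal);
    }
#endif
    // Fallback: device-only allocation; the caller copies with cudaMemcpy.
    return cudaMalloc(ptr, bytes);
}
```

From main, a call such as allocateForDevice(0, n * sizeof(float), &buf, &managed) would then be followed by a branch on managed: skip the explicit copies when it is true, and otherwise use cudaMemcpy in both directions. The preprocessor guard stays in place even around the run-time test, because cudaMallocManaged is not declared at all in pre-6.0 headers.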

cuda

compile-time

nvcc

preprocessor-directive

visual-studio-2017