Compiling .cu files with Dynamic Parallelism (CUDA)

Question

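I got a new GPU, a GeForce GTX 980 with cc 5.2, so it must support dynamic parallelism. However, I cannot even compile a simple piece of code (from the programming guide). I won't post it here (there is no need; it is just a global kernel that calls another global kernel).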

1) I use VS2013 for coding. In Property Pages -> CUDA C/C++ -> Device, I changed the Code Generation property to compute_35,sm_35, and this is the output:

1>------ Build started: Project: testCublas3, Configuration: Debug Win32 ------
1>  Compiling CUDA source file kernel.cu...
1>  
1>  C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3>"C:\Program      Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include"  -G   --keep-dir Debug -maxrregcount=0  --machine 32 --compile -cudart static  -g   -DWIN32 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o Debug\kernel.cu.obj "C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3\kernel.cu" 
1>C:/programs/misha/cuda/Projects/test projects/testCublas3/testCublas3/kernel.cu(13): error : kernel launch from __device__ or __global__ functions requires separate compilation mode
1>  kernel.cu
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 6.5.targets(593,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include"  -G   --keep-dir Debug -maxrregcount=0  --machine 32 --compile -cudart static  -g   -DWIN32 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o Debug\kernel.cu.obj "C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3\kernel.cu"" exited with code 2.

I think I need another option for this compilation, -rdc=true, but I could not find where to set it in VS2013.

2) When I set the Code Generation property to compute_52,sm_52, I get the error: Unsupported gpu architecture 'compute_52'. But my cc is 5.2. So can I only compile code for cc 3.5 at most?

Thanks

Answer 1

Regarding item 1, CUDA dynamic parallelism requires separate compilation and linking (-rdc=true), as well as linking in the device cudart library (-lcudadevrt). Dynamic parallelism that also uses CUBLAS additionally requires linking in the device CUBLAS library (-lcublas_device). Possibly the simplest way to see where all of these settings go in a Visual Studio project is to start by looking at the Visual Studio project for the device CUBLAS sample.
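
As a rough illustration of what those flags apply to, here is a minimal sketch of a kernel launching another kernel. The kernel names and the file name dynpar.cu are made up for this example, and the nvcc command in the comment is an assumed command-line equivalent of the project settings, not the exact line Visual Studio generates. In the CUDA build customization, -rdc=true should correspond to CUDA C/C++ -> Common -> Generate Relocatable Device Code, with cudadevrt.lib added to the linker's additional dependencies, but check this against your toolkit version.

```cuda
// Minimal dynamic parallelism sketch (hypothetical kernel and file names).
// Assumed command-line build using the flags named in the answer:
//   nvcc -arch=sm_35 -rdc=true dynpar.cu -o dynpar -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

// Child grid: launched from device code by parentKernel.
__global__ void childKernel(int parentThread)
{
    printf("parent thread %d -> child thread %d\n", parentThread, threadIdx.x);
}

// Parent grid: each thread launches its own small child grid.
__global__ void parentKernel()
{
    // A kernel launch from a __global__ function. Without -rdc=true this
    // line triggers the "requires separate compilation mode" error.
    childKernel<<<1, 2>>>(threadIdx.x);
}

int main()
{
    parentKernel<<<1, 2>>>();

    // Catch launch-time errors, then wait for the parent grid
    // (which only completes after its child grids have completed).
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

The target architecture must be sm_35 or higher, since dynamic parallelism is not available below compute capability 3.5.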

Regarding item 2, your GTX 980 with compute capability 5.2 is not recognized because you need the latest update of the CUDA 6.5 toolkit, which is available here.
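
If it helps to confirm what the device and the installed runtime report, a small query program along these lines can be used. This is only a sketch: it assumes device 0 is the GTX 980, and it shows the device's compute capability and the runtime version, not which architectures nvcc itself accepts (for that, check the values listed for the architecture option in nvcc --help).

```cuda
// Sketch: print the CUDA runtime version and the compute capability
// of device 0 (assumed here to be the GPU in question).
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int runtimeVersion = 0;
    cudaError_t err = cudaRuntimeGetVersion(&runtimeVersion);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaRuntimeGetVersion: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // The version is encoded as 1000 * major + 10 * minor (6050 means 6.5).
    printf("CUDA runtime version: %d.%d\n",
           runtimeVersion / 1000, (runtimeVersion % 1000) / 10);
    printf("Device 0: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);
    return 0;
}
```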

(Note that the cublas_device functionality has been removed from the newest versions of CUDA.)

cuda

dynamic-parallelism