OpenCL cannot find GPU device: NVIDIA GPU (Quadro K4000) + Visual Studio 2015

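I have just started learning OpenCL and set up a Visual Studio project with VS2015. For some reason, the code only finds 1 platform (I guess that should be the CPU) and cannot find the GPU device. Can anyone help? Details are below: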

  1. GPU: NVIDIA Quadro K4000
  2. CUDA installation

    CUDA is at: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5"

    OpenCL related files are located at "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL" and "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\lib\Win32" (assuming 32bit system)

    The installer created two environment variables, "CUDA_PATH" and "CUDA_PATH_V7_5". They both point to the above location.

  3. In Visual Studio, the project is set up as follows:

    "Project Properties" -> "C/C++" -> "Additional Include Directories" -> "$(CUDA_PATH)\include"

    "Project Properties" -> "Linker" -> "Additional Library Directories" -> "$(CUDA_PATH)\lib\Win32"

    "Project Properties" -> "Linker" -> "Input" -> "Additional Dependencies" -> "OpenCL.lib"

The code is simple:

#include "stdafx.h"
#include <iostream>
#include <CL/cl.h>
using namespace std;

int main()
{
    cl_int err;
    cl_uint numPlatforms;

    err = clGetPlatformIDs(0, NULL, &numPlatforms);

    if (CL_SUCCESS == err)
        cout << "Detected OpenCL platforms: " << numPlatforms  << endl;
    else
        cout << "Error calling clGetPlatformIDs. Error code:" << err << endl;


    cl_device_id device = NULL;
    err = clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    if (err == CL_SUCCESS)
        cout << device << endl;

    return 0;
}

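The code compiles and runs, but the GPU device is not found. Specifically, the variable device comes back as device = 0x00000000 <NULL>. What could be the problem? Thanks for your help.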

This is not how you use the OpenCL API.

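You need to obtain a valid cl_platform_id object, which is then used to retrieve the cl_device_id. Your code always passes NULL as the platform, which does not work: the OpenCL specification leaves the behavior implementation-defined in that case, so it cannot be relied on.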

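The first call to clGetPlatformIDs retrieves the number of platforms in the system. After that you need to call the function again to retrieve the actual cl_platform_id values: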

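cl_uint numPlatforms = 0;  // the API expects a cl_uint*, not a size_t*
err = clGetPlatformIDs(0, NULL, &numPlatforms);
assert(err == CL_SUCCESS && numPlatforms > 0);
// std::vector (needs <vector>) replaces the variable-length array,
// which is not standard C++ and is rejected by MSVC
std::vector<cl_platform_id> platform_ids(numPlatforms);
err = clGetPlatformIDs(numPlatforms, platform_ids.data(), NULL);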

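However, if you already know that there will be only one platform in the system, you can take a shortcut as follows, but be sure to check for errors: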

cl_platform_id platform_id;
err = clGetPlatformIDs(1, &platform_id, NULL);
assert(err == CL_SUCCESS);
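
One caveat on error checking: assert is compiled out when NDEBUG is defined (the usual setting for release builds), and the code in the question prints nothing at all when clGetDeviceIDs fails. A minimal sketch of a helper that turns a status code into a readable name is shown here. The name clErrorName is hypothetical and not part of OpenCL. The numeric values are assumed to mirror the definitions in CL/cl.h (and CL/cl_ext.h for -1001), and are written out as literals only so that the block compiles without the OpenCL headers.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of OpenCL): maps a few status codes that are
// relevant to platform/device discovery to their symbolic names.
// The literals mirror the #defines in CL/cl.h; -1001 comes from CL/cl_ext.h.
const char* clErrorName(int err)
{
    switch (err) {
    case 0:     return "CL_SUCCESS";
    case -1:    return "CL_DEVICE_NOT_FOUND";
    case -30:   return "CL_INVALID_VALUE";
    case -31:   return "CL_INVALID_DEVICE_TYPE";
    case -32:   return "CL_INVALID_PLATFORM";
    case -1001: return "CL_PLATFORM_NOT_FOUND_KHR";
    default:    return "unrecognized OpenCL error code";
    }
}
```

In real code the result of every clGet... call could be passed through such a helper and printed, so that a failure such as CL_INVALID_PLATFORM or CL_DEVICE_NOT_FOUND is visible instead of silently leaving the output variable untouched.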

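Once you have the platform, you follow the same steps: first get the number of devices, then retrieve the list of OpenCL devices (after which you need to build a cl_context, a command queue, ...):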

// Note: this has to be done for each `cl_platform_id`
// until you find the device you were looking for
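cl_uint numDevices = 0;  // the API expects a cl_uint*, not a size_t*
err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, 0, NULL, &numDevices);
assert(numDevices > 0);
// std::vector (needs <vector>) replaces the variable-length array
std::vector<cl_device_id> devices(numDevices);
err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, numDevices, devices.data(), NULL);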
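
Platform and device enumeration both follow the same count-then-fill pattern, so it can be written once. The following is only a sketch under stated assumptions: queryAll is a hypothetical helper, not part of OpenCL, and fakeQuery is a stand-in that merely imitates the calling convention of clGetPlatformIDs (count query with a null output pointer, then a fill call) so that the block runs without the OpenCL headers or a driver.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of OpenCL): performs the count-then-fill idiom.
// query(n, out, count) is expected to behave like clGetPlatformIDs:
//   - with n == 0 and out == nullptr it stores the item count in *count,
//   - otherwise it writes up to n items into out.
// It returns 0 (the value of CL_SUCCESS) on success.
template <typename T, typename Query>
int queryAll(Query query, std::vector<T>& items)
{
    unsigned int count = 0;  // same width as cl_uint on common platforms
    int err = query(0u, static_cast<T*>(nullptr), &count);
    if (err != 0 || count == 0) {
        items.clear();       // nothing to retrieve, or the count query failed
        return err;
    }
    items.resize(count);
    return query(count, items.data(), static_cast<unsigned int*>(nullptr));
}

// Stand-in for a real OpenCL entry point: pretends that three items with
// the IDs 10, 20 and 30 are available.
int fakeQuery(unsigned int n, int* out, unsigned int* count)
{
    static const int available[] = {10, 20, 30};
    if (count) *count = 3;
    for (unsigned int i = 0; out && i < n && i < 3; ++i)
        out[i] = available[i];
    return 0;
}
```

With the real API, the first argument would be a lambda that forwards to clGetPlatformIDs(n, out, count), or to clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, n, out, count) for each platform in turn, with a std::vector<cl_platform_id> or std::vector<cl_device_id> as the second argument.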

I think you get the idea by now. If, as above, you already know that there is only one GPU device in the system, you can get its cl_device_id directly as follows:

cl_device_id device;
err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
assert(err == CL_SUCCESS);