OpenCL reports half the expected compute units

I am using OpenCL (on Ubuntu) to query the available platforms, and I get a single platform:

CL_PLATFORM_PROFILE: FULL_PROFILE

CL_PLATFORM_VERSION: OpenCL 2.1 AMD-APP (3143.9)

CL_PLATFORM_NAME: AMD Accelerated Parallel Processing

CL_PLATFORM_VENDOR: Advanced Micro Devices, Inc.

It exposes one device, which I query with:

cl_device_id device = devices[ j ];
cl_uint units = -1;
cl_device_type type;
size_t lmem = -1;
cl_uint dims = -1;
size_t wisz[ 3 ];
size_t wgsz = -1;
size_t gmsz = -1;
/* name and vend are char buffers declared earlier; err is a cl_int. */
err = clGetDeviceInfo( device, CL_DEVICE_NAME, sizeof(name), name, NULL );
/* Vendor string. The run shown below queried CL_DEVICE_NAME here, so it prints the name twice. */
err = clGetDeviceInfo( device, CL_DEVICE_VENDOR, sizeof(vend), vend, NULL );
err = clGetDeviceInfo( device, CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(units), &units, NULL );
err = clGetDeviceInfo( device, CL_DEVICE_TYPE, sizeof(type), &type, NULL );
err = clGetDeviceInfo( device, CL_DEVICE_LOCAL_MEM_SIZE, sizeof(lmem), &lmem, NULL );
err = clGetDeviceInfo( device, CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS, sizeof(dims), &dims, NULL );
err = clGetDeviceInfo( device, CL_DEVICE_MAX_WORK_ITEM_SIZES, sizeof(wisz), wisz, NULL );
err = clGetDeviceInfo( device, CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(wgsz), &wgsz, NULL );
CHECK_CL
err = clGetDeviceInfo( device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(gmsz), &gmsz, NULL );
CHECK_CL
/* Remember the GPU device for later use. */
if ( type == CL_DEVICE_TYPE_GPU )
        device_id = device;
/* units and dims are cl_uint, so print them with %u. */
printf( "  %s %s with [%u units] localmem=%zu globalmem=%zu dims=%u(%zux%zux%zu) max workgrp sz %zu", name, vend, units, lmem, gmsz, dims, wisz[0], wisz[1], wisz[2], wgsz );

This gives me:

gfx1012 gfx1012 with [11 units] localmem=65536 globalmem=8573157376 dims=3(1024x1024x1024) max workgrp sz 256

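The CL_DEVICE_MAX_COMPUTE_UNITS value of 11 worries me.

My system has a Radeon RX 5500 XT, which according to the AMD website and Wikipedia should have 22 compute units.

Why does OpenCL report half the expected number, 11 compute units instead of 22?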

lspci reports:

19:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 14 [Radeon RX 5500/5500M / Pro 5500M] (rev c5) (prog-if 00 [VGA controller])
        Subsystem: XFX Pine Group Inc. Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
        Flags: bus master, fast devsel, latency 0, IRQ 83, NUMA node 0
        Memory at b0000000 (64-bit, prefetchable) [size=256M]
        Memory at c0000000 (64-bit, prefetchable) [size=2M]
        I/O ports at 7000 [size=256]
        Memory at c5d00000 (32-bit, non-prefetchable) [size=512K]
        Expansion ROM at c5d80000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu

The AMD GPU PRO driver is installed:

OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: Radeon RX 5500 XT
OpenGL core profile version string: 4.6.14752 Core Profile Context 20.30
OpenGL core profile shading language version string: 4.60

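On AMD RDNA GPUs, OpenCL's CL_DEVICE_MAX_COMPUTE_UNITS reports the number of dual compute units (see the RDNA whitepaper, pages 4-9). As the name suggests, each dual compute unit contains 2 compute units, so the 11 reported units correspond to 11 x 2 = 22 compute units, which matches the specification of the RX 5500 XT. Your hardware and driver installation are fine.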
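
If you want your program to print the figure that matches the spec sheet, the doubling can be applied in code. The following is only a minimal sketch: both function names are hypothetical, and the check for an RDNA part rests on the assumption that such devices report a name starting with "gfx10", as the "gfx1012" in the question does. It is not an official way to detect the architecture.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the OpenCL device name looks like an
 * RDNA part. Assumption: such devices report a CL_DEVICE_NAME starting
 * with "gfx10", like the "gfx1012" shown in the question. */
static int looks_like_rdna(const char *device_name)
{
    assert(device_name != NULL);
    return strncmp(device_name, "gfx10", 5) == 0;
}

/* Hypothetical helper: converts the CL_DEVICE_MAX_COMPUTE_UNITS value into
 * individual compute units. On devices that look like RDNA the reported
 * value counts dual compute units, so it is doubled; otherwise it is
 * returned unchanged. */
static unsigned physical_compute_units(const char *device_name,
                                       unsigned reported_units)
{
    return looks_like_rdna(device_name) ? reported_units * 2u
                                        : reported_units;
}
```

With the values from the question, physical_compute_units( name, units ) would turn the reported 11 into 22. A name that does not start with "gfx10" leaves the count untouched.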