OpenCL allocation flag CL_MEM_USE_HOST_PTR usage not referencing my pointer

Question

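I am trying to use the CL_MEM_USE_HOST_PTR flag with the OpenCL function clCreateBuffer() to avoid multiple memory allocations. After a little research (reverse engineering), I found that the framework calls the operating system's allocation function no matter which flag I use.

Maybe my understanding is wrong? According to the documentation, though, it should access the host memory via DMA instead of allocating new memory.

I am using OpenCL 1.2 on an Intel device (HD 5500).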

Answer 1

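On Intel GPUs, make sure the host pointer you allocate is page-aligned and that its length is a whole number of pages. Actually, I think the buffer size only needs to be a multiple of the cache line size, but I always round up to a full page.

Use something like: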

void *host_ptr = _aligned_malloc(align_to(size, 4096), 4096);
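For reference, here is a minimal sketch of what that one-liner relies on. Note that align_to is not a standard function; the version below is a hypothetical helper, and alloc_page_aligned / free_page_aligned are names I made up for illustration. It also assumes a 4096-byte page. _aligned_malloc is MSVC-specific, so the sketch falls back to C11 aligned_alloc elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h> /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical helper: round size up to the next multiple of alignment.
   Only valid when alignment is a nonzero power of two. */
static size_t align_to(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Allocate a block that starts on a page boundary and spans whole pages. */
static void *alloc_page_aligned(size_t size)
{
    const size_t page = 4096; /* assumed page size */
    size_t padded = align_to(size, page);
#ifdef _WIN32
    return _aligned_malloc(padded, page);
#else
    /* C11 aligned_alloc expects the size to be a multiple of the alignment,
       which the rounding above guarantees. */
    return aligned_alloc(page, padded);
#endif
}

/* Release a block obtained from alloc_page_aligned. */
static void free_page_aligned(void *ptr)
{
#ifdef _WIN32
    _aligned_free(ptr); /* memory from _aligned_malloc must not go to free() */
#else
    free(ptr);
#endif
}
```

Memory from _aligned_malloc has to be released with _aligned_free rather than free, which is why the sketch wraps both sides.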

Here's a good article on this; see the "Key Takeaways" section:

If you already have the data and want to load the data into an OpenCL buffer object, then use CL_MEM_USE_HOST_PTR with a buffer allocated at a 4096 byte boundary (aligned to a page and cache line boundary) and a total size that is a multiple of 64 bytes (cache line size).
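The rule quoted above can be checked before the buffer is created. This is only a sketch: is_zero_copy_candidate and the two constants are hypothetical names, and the 4096 / 64 values are taken from the quote rather than queried from the device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ZC_PAGE_BYTES 4096u     /* assumed page size, per the quoted rule */
#define ZC_CACHE_LINE_BYTES 64u /* assumed cache line size, per the quoted rule */

/* Returns true when ptr starts on a 4096-byte boundary and size is a
   nonzero multiple of 64 bytes, i.e. the conditions quoted above. */
static bool is_zero_copy_candidate(const void *ptr, size_t size)
{
    bool ptr_ok = ptr != NULL && ((uintptr_t)ptr % ZC_PAGE_BYTES) == 0;
    bool size_ok = size != 0 && (size % ZC_CACHE_LINE_BYTES) == 0;
    return ptr_ok && size_ok;
}
```

If the check passes, the pointer can be handed to clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, size, ptr, &err). Passing the check does not by itself prove the driver avoids a copy; it only confirms the preconditions from the article are met.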

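You can also use CL_MEM_ALLOC_HOST_PTR and let the driver handle the allocation. To get at the pointer, though, you have to map and unmap the buffer (clEnqueueMapBuffer / clEnqueueUnmapMemObject), but there is no copy cost.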

gpu

intel

opencl