OpenCL race condition with printf?
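I am currently trying to test whether I can get some basic operations (reading from and writing to memory) working in an OpenCL kernel (Intel SDK). This is part of the code, with some unused parameters omitted: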
__kernel
void myfunc(__global char *buf_pw,
            __global char *buf_hash)
{
    int idx = get_global_id(0);
    int a = 1 + 1;
    char wololol[8] = "wololol";
    if (idx == 0)
    {
        buf_pw[0] = 'A';
        buf_pw[1] = 'e';
        buf_pw[2] = 'l';
        buf_pw[3] = 'l';
        buf_pw[4] = 'o';
        buf_pw[5] = 0;
    }
    if (idx == 0)
    {
        while (buf_pw[0] != 'A');
        printf("%c\n", buf_pw[0]);
        printf("%c\n", buf_pw[1]);
        printf("%c\n", buf_pw[2]);
        printf("%c\n", buf_pw[3]);
        printf("%c\n", buf_pw[4]);
        printf("%c\n", buf_pw[5]);
        printf("%s\n", buf_pw);
        printf("%s\n", wololol);
    }
    printf("Hello World\n");
}
Running the program several times produces different results. Most of the time, the output looks like this:
A
e
l
l
o
(null)
wololol
Hello World
Hello World
Hello World
Hello World
The other case is:
A
e
l
l
o
Aello
wololol
Hello World
Hello World
Hello World
Hello World
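I expect the second case to be the correct output, but it happens only rarely. What is causing the writes to and reads from buf_pw to misbehave?
I would be careful with the printf function, since it may not follow the normal logic of OpenCL. Here is what the specification says: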
When the event that is associated with a particular kernel invocation is completed, the output of all printf() calls executed by this kernel invocation is flushed to the implementation-defined output stream. Calling clFinish on a command queue flushes all pending output by printf in previously enqueued and completed commands to the implementation-defined output stream. In the case that printf is executed from multiple work-items concurrently, there is no guarantee of ordering with respect to written data. For example, it is valid for the output of a work-item with a global id (0,0,1) to appear intermixed with the output of a work-item with a global id (0,0,4) and so on.
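Your code appears to be valid (although the while loop in it is momentarily confusing! ;), and your expectation of the correct output is reasonable.
It looks like your OpenCL installation has a bug or issue. I have found the AMD GPU OpenCL drivers in particular to have problems with printf behaviour.
The printf in question should always print "Aello" and never "(null)", as you expected.
The problem may be caused by a race condition in the vendor's implementation of printf().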
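If the vendor's %s handling is the unreliable part, one possible workaround is to avoid passing a __global pointer to %s at all and print the characters one by one, which is the path that behaved consistently in both outputs above. As far as I understand the OpenCL C specification, its printf section also limits the s conversion specifier to string-literal arguments, so the buf_pw case may simply be outside what an implementation has to support. The following is only a sketch under those assumptions: copy_to_private and print_pw are hypothetical names, not part of the original kernel, and the #ifndef shim is just there so the same text can also be compiled as plain C for a quick host-side check.

```c
/* Builds as OpenCL C; the shim below lets it also build as plain C. */
#ifndef __OPENCL_VERSION__
#include <stdio.h>
#define __kernel
#define __global
/* Host-side stub (assumption): pretend there is a single work-item 0. */
static int get_global_id(int dim) { (void)dim; return 0; }
#endif

/* Hypothetical helper: copy a NUL-terminated string from global memory
   into a private buffer of capacity cap. Returns the number of
   characters copied, not counting the terminator. */
int copy_to_private(__global const char *src, char *dst, int cap)
{
    int n = 0;
    while (n < cap - 1 && src[n] != 0) {
        dst[n] = src[n];
        n++;
    }
    dst[n] = 0;
    return n;
}

/* Hypothetical kernel: work-item 0 writes the string and prints it
   character by character instead of using %s on a __global pointer. */
__kernel void print_pw(__global char *buf_pw)
{
    if (get_global_id(0) != 0)
        return;

    buf_pw[0] = 'A';
    buf_pw[1] = 'e';
    buf_pw[2] = 'l';
    buf_pw[3] = 'l';
    buf_pw[4] = 'o';
    buf_pw[5] = 0;

    char tmp[8];
    int n = copy_to_private(buf_pw, tmp, 8);
    for (int i = 0; i < n; i++)
        printf("%c", tmp[i]);
    printf("\n");
}
```

If this variant prints the string reliably while the %s version still shows "(null)" now and then, that would point at the driver's %s handling rather than at the memory writes themselves.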