ICC compiler - error: parallel loop condition does not test loop control variable
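I am trying to parallelize a "for loop" in my C/OpenMP code after an offload call to an Intel MIC (Xeon Phi) card. I am using "#pragma omp parallel for", and it compiles fine when I use an integer variable as the "loop control variable". Elsewhere in my code I use an element of a float array as the "loop control variable", and then I get the error "parallel loop condition does not test loop control variable".
Code without error: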
#define MAX_DIMENSIONS 10
unsigned long long i,y=0;
#pragma offload target(mic) in(i,y)
{
#pragma omp parallel for
for(i=0;i<10;i++)
/* code here */
}
Code with error:
#define MAX_DIMENSIONS 10
float x[MAX_DIMENSIONS];
unsigned long long i,y=0;
#pragma offload target(mic) in(x[MAX_DIMENSIONS],i,y)
{
#pragma omp parallel for
for(x[0]=0.000000; x[0]<10.000000; x[0]+=1.000000)
/* code here */
}
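Is there any way to keep the float array notation in the "for loop" and still parallelize it successfully with OpenMP?
OpenMP requires the loop variable to be of integer type: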
http://www.openmp.org/wp-content/uploads/openmp-4.5.pdf#page=68
The syntax of the loop construct is as follows:
#pragma omp for ...
for-loops
...
Specifically, all associated for-loops must have canonical loop form (see Section 2.6 on page 53).
The iteration count for each associated loop is computed before entry to the outermost loop. If execution of any associated loop changes any of the values used to compute any of the iteration counts, then the behavior is unspecified.
The integer type (or kind, for Fortran) used to compute the iteration count for the collapsed loop is implementation defined.
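You cannot use a variable of floating-point type in an OpenMP loop construct. Your first loop uses i, which has a correct integer type; your second loop uses a float variable, which is not an allowed type. Canonical loop form is defined in "2.6 Canonical Loop Form" - http://www.openmp.org/wp-content/uploads/openmp-4.5.pdf#page=62 as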
for (init-expr; test-expr; incr-expr) structured-block
...
var - One of the following:
* A variable of a signed or unsigned integer type.
* For C++, a variable of a random access iterator type.
* For C, a variable of a pointer type
incr-expr One of the following:
...
* var += incr
incr A loop invariant integer expression
So your second loop is not in canonical form and cannot be parallelized:
#pragma omp parallel for
for(x[0]=0.000000; x[0]<10.000000; x[0]+=1.000000)
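It is hard for the compiler to obtain the loop iteration count in advance when var and incr are floating-point values: some decimal constants cannot be represented exactly in floating-point format (for example, 0.2 is stored as 0x3FC999999999999A in IEEE 754 double precision, and 0.1 + 0.2 is 0.30000000000000004 in many languages; see https://0.30000000000000004.com/).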
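To make the iteration-count problem concrete, here is a minimal sketch in C that counts the iterations of a floating-point loop in two ways. The function names count_by_accumulation and count_by_quotient are hypothetical and chosen only for this illustration; the second one assumes b > a, step > 0, and a range that is meant to span a whole number of steps.

```c
/* Count iterations of `for (x = a; x < b; x += step)` by actually
   accumulating x, the way the floating-point loop would run. */
int count_by_accumulation(double a, double b, double step) {
    int count = 0;
    for (double x = a; x < b; x += step) {
        count++;
    }
    return count;
}

/* Count iterations from the quotient (b - a) / step, rounded to the
   nearest integer. Assumes b > a and step > 0 (illustration only). */
int count_by_quotient(double a, double b, double step) {
    return (int)((b - a) / step + 0.5);
}
```

For a = 0.0, b = 10.0, step = 1.0 both functions agree on 10. For a = 0.0, b = 1.0, step = 0.1 they can disagree on a typical IEEE 754 platform, because ten accumulated additions of 0.1 land slightly below 1.0 and the loop runs once more than the quotient suggests. An OpenMP runtime that must fix the iteration count before entering the loop has no single correct answer in such a case.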
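You can try an array of integers, longs, or long longs instead. Some compilers accept only a plain scalar as the loop control variable, so if x[0] is still rejected, switch to a scalar index: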
#define MAX_DIMENSIONS 10
long long x[MAX_DIMENSIONS];
unsigned int i,y=0;
#pragma offload target(mic) in(x[MAX_DIMENSIONS],i,y)
{
#pragma omp parallel for
for(x[0]=0; x[0] < 10; x[0] += 1)
/* code here */
}
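Alternatively, you can compute the correct iteration count for the floating-point range before the loop, and then use an integer iterator for var and incr in the parallel loop (make sure the rounding is done correctly).
You can also implement the work sharing manually, like this: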
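The integer-iterator approach described above could look roughly like the following sketch. The helper names trip_count and fill_coordinates, the out buffer, and the capacity parameter are assumptions made for this example and do not come from the question. The float array notation x[0] is kept inside the loop body, while the loop itself is driven by an int so that it is in canonical form. Without OpenMP enabled the pragma is ignored and the loop simply runs serially.

```c
#define MAX_DIMENSIONS 10

/* Hypothetical helper: number of iterations of
   `for (x = a; x < b; x += step)` if it were evaluated exactly. */
static int trip_count(float a, float b, float step) {
    if (step <= 0.0f || b <= a) {
        return 0;
    }
    int n = (int)((b - a) / step);     /* truncate toward zero */
    if (a + (float)n * step < b) {     /* one more step still fits below b */
        n++;
    }
    return n;
}

/* Writes the i-th coordinate a + i*step into out[i] and returns the
   number of coordinates written (at most `capacity`). */
int fill_coordinates(float a, float b, float step, float *out, int capacity) {
    int n = trip_count(a, b, step);
    if (n > capacity) {
        n = capacity;
    }
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        float x[MAX_DIMENSIONS] = {0}; /* declared in the body, so private to each iteration */
        x[0] = a + (float)i * step;    /* recompute instead of accumulating */
        out[i] = x[0];                 /* placeholder for the real loop body */
    }
    return n;
}
```

Computing x[0] from the integer index, rather than adding step repeatedly, also means every iteration gets the same value no matter which thread runs it.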
#pragma omp parallel
{
float a = 0., b = 10.;
float step = 1.;
int N = (b-a)/step;
int ithread = omp_get_thread_num();
int nthreads = omp_get_num_threads();
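float t0 = a + step*((ithread+0)*N/nthreads); /* chunk start: integer index scaled by step */
float t1 = a + step*((ithread+1)*N/nthreads); /* chunk end: start of the next thread's chunk */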
for(float x=t0; x<t1; x += step) {
//code
}
}
This would be equivalent to
#pragma omp parallel for
for(float x=a; x<b; x += step)
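if the omp for construct supported floating-point iterators. For other schedules, such as dynamic, you would have to implement them differently. Note that the parallel code and the sequential code may not give the same result, for example if (b-a)/step has a fractional part (but (10.-0)/1. = 10. is fine). To be safe, it is therefore best to change your code to use integer iterators.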
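As a sketch of that safer variant, the manual work sharing can be expressed with integer chunk bounds, so that the float value is derived from the index and never accumulated. The names chunk_bounds and scale_range and the out buffer are hypothetical. The `#ifdef _OPENMP` guard is an assumption added so the same file also builds and runs serially when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Half-open index range [lo, hi) owned by thread `tid` out of `nthreads`,
   mimicking a static schedule. Adjacent ranges share their end points, so
   together they cover [0, n) exactly once. */
void chunk_bounds(int n, int tid, int nthreads, int *lo, int *hi) {
    *lo = (int)((long long)tid * n / nthreads);
    *hi = (int)((long long)(tid + 1) * n / nthreads);
}

/* Writes a + i*step into out[i] for i in [0, n), one chunk per thread. */
void scale_range(float a, float step, int n, float *out) {
    #pragma omp parallel
    {
        int tid = 0, nthreads = 1;     /* serial fallback values */
#ifdef _OPENMP
        tid = omp_get_thread_num();
        nthreads = omp_get_num_threads();
#endif
        int lo, hi;
        chunk_bounds(n, tid, nthreads, &lo, &hi);
        for (int i = lo; i < hi; i++) {
            float x = a + (float)i * step;
            out[i] = x;                /* placeholder for the real loop body */
        }
    }
}
```

Because the chunk boundaries are integers, the parallel and sequential versions visit the same set of indices even when (b-a)/step is not a whole number. Only the iteration count n has to be decided once, before the parallel region.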