OpenACC: How to apply the reduction clause to a struct variable
I just need to apply the reduction(max: ) clause to Dts->t, but nothing seems to work. I tried reduction(max:Dts.t), reduction(max:Dts->t), reduction(max:Dts) and reduction(max:t).
#pragma acc parallel loop collapse(3) reduction(max:t) present(Dts)
for (k = KBEG; k <= KEND; k++){
for (j = JBEG; j <= JEND; j++){
for (i = IBEG; i <= IEND; i++){
Dts->t = MAX(Dts->t, C_dt[k][j][i]);
}}}
I get errors like these:
PGC-S-0035-Syntax error: Recovery attempted by replacing '.' by ',' (update_stage.c: 450)
PGC-S-0035-Syntax error: Recovery attempted by replacing identifier present by accparallel (update_stage.c: 450)
PGC-S-0040-Illegal use of symbol, invDt_hyp (update_stage.c: 450)
PGC-S-0036-Syntax error: Recovery attempted by inserting <nl> before keyword for (update_stage.c: 451)
PGC-S-0978-The clause parallel is deprecated; use clause gang instead (update_stage.c: 451)
PGC-S-0374-Clause gang(value) not allowed in #pragma acc parallel loop (update_stage.c: 451)
Dts is a variable of type Step.
typedef struct Step_{
double *cmax;
double t;
.
.
.
} Step;
The loop I am trying to accelerate is in a routine called from main. In main, Dts is defined, and then I wrote
#pragma acc enter data create(Dts)
#pragma acc enter data copyin(Dts.t[:1])
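According to the OpenACC standard, a reduction variable cannot be a member of a composite variable. The simplest way to work around this restriction is to reduce into a local scalar variable and then assign the result back to the struct member.
Something like: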
double tmax;
...
tmax = Dts->t;
#pragma acc parallel loop collapse(3) reduction(max:tmax) present(Dts)
for (k = KBEG; k <= KEND; k++){
for (j = JBEG; j <= JEND; j++){
for (i = IBEG; i <= IEND; i++){
tmax = MAX(tmax, C_dt[k][j][i]);
}}}
Dts->t = tmax;
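If you need the value of Dts->t on the device, either put it in an update device directive after the assignment, or place "tmax" in a data region and do the assignments inside serial regions.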
// best if you need the value of Dts->t on both the host and device
double tmax;
...
tmax = Dts->t;
#pragma acc parallel loop collapse(3) reduction(max:tmax) present(Dts)
for (k = KBEG; k <= KEND; k++){
for (j = JBEG; j <= JEND; j++){
for (i = IBEG; i <= IEND; i++){
tmax = MAX(tmax, C_dt[k][j][i]);
}}}
Dts->t = tmax;
#pragma acc update device(Dts->t)
or
// best if you only need the value of Dts->t on the device
double tmax;
...
#pragma acc data create(tmax)
{
#pragma acc serial present(Dts)
{
tmax = Dts->t;
}
#pragma acc parallel loop collapse(3) reduction(max:tmax) present(Dts)
for (k = KBEG; k <= KEND; k++){
for (j = JBEG; j <= JEND; j++){
for (i = IBEG; i <= IEND; i++){
tmax = MAX(tmax, C_dt[k][j][i]);
}}}
#pragma acc serial present(Dts)
{
Dts->t = tmax;
}
}
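For reference, here is a minimal, self-contained sketch of the first variant (reduce into a local scalar, then assign back to the struct member on the host). Several things in it are assumptions for illustration and are not from the original code: the function name update_max_dt, the fixed extents NK, NJ and NI, the zero-based loop bounds, and the copyin clause on C_dt (the original code expects its data to be present on the device already). The Step struct is cut down to the two members shown in the question.

```c
#include <assert.h>
#include <stddef.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical grid extents, chosen only for this sketch. */
enum { NK = 4, NJ = 5, NI = 6 };

/* Cut-down Step: only the two members shown in the question. */
typedef struct Step_ {
    double *cmax;
    double t;
} Step;

/* Raise Dts->t to the largest entry of C_dt if that entry exceeds it.
 * update_max_dt is a hypothetical name, not from the original code. */
void update_max_dt(Step *Dts, double C_dt[NK][NJ][NI])
{
    int i, j, k;
    double tmax;

    assert(Dts != NULL);

    /* Reduce into a plain local scalar, not into the struct member. */
    tmax = Dts->t;
    #pragma acc parallel loop collapse(3) reduction(max:tmax) copyin(C_dt[0:NK][0:NJ][0:NI])
    for (k = 0; k < NK; k++) {
        for (j = 0; j < NJ; j++) {
            for (i = 0; i < NI; i++) {
                tmax = MAX(tmax, C_dt[k][j][i]);
            }
        }
    }

    /* Host-side copy back; a device copy of Dts would still need an update. */
    Dts->t = tmax;
}
```

When the file is compiled without OpenACC enabled, the pragma is ignored and the loop nest runs serially with the same result, which makes it easy to check the logic on the host first.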