How to solve pgcc & OpenACC linker errors "__pgi_uacc_multicorestart" and "__pgi_uacc_multicoreend"

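I am trying to parallelize my C program with OpenACC 2.5 on Ubuntu 16.04 LTS. After a simple modification that adds only one line, I can compile all .c files to .o files. At the link step, the pgcc compiler reports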

undefined reference to `__pgi_uacc_multicorestart'

undefined reference to `__pgi_uacc_multicoreend'

A Google search turned up nothing related to these error messages. Please help me solve this problem.

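Below are the details of my system and program, along with the source code. I have tried to post the essential parts; if you need anything else, please let me know.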


OS and software:

LSB Version:    core-9.20160110ubuntu0.2-amd64:core-9.20160110ubuntu0.2-noarch:printing-9.20160110ubuntu0.2-amd64:printing-9.20160110ubuntu0.2-noarch:security-9.20160110ubuntu0.2-amd64:security-9.20160110ubuntu0.2-noarch
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:    16.04
Codename:   xenial

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.5' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5)

pgcc 17.10-0 64-bit target on x86-64 Linux -tp haswell 
PGI Compilers and Tools
Copyright (c) 2017, NVIDIA CORPORATION.  All rights reserved.

.bashrc:

#CUDA
export PATH=/usr/local/cuda/bin:$PATH;
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH;
#####
ulimit -s unlimited
#####
#Environment Modules
source /usr/share/modules/init/bash
module add /opt/pgi/modulefiles/pgi64/17.10
module add /opt/pgi/modulefiles/openmpi/2.1.2/2017
#####
#intel compiler
source /opt/intel/bin/compilervars.sh intel64
#intel vtune
source /opt/intel/vtune_amplifier/amplxe-vars.sh
#intel advisor
source /opt/intel/advisor/advixe-vars.sh
#intel inspector
source /opt/intel/inspector/inspxe-vars.sh
#intel mkl
source /opt/intel/mkl/bin/mklvars.sh intel64

Makefile:

CC = pgcc
CFLAGS_pgcc = -O0 -Minform=inform -Minfo -ta=multicore -g -pg -Mprof=time
CFLAGS = $(CFLAGS_$(CC)) -c
LFLAGS = $(LFLAGS_$(CC)) -L${MKLROOT}/lib/intel64 -lmkl_rt -lpthread -lm -ldl
IFLAGS = $(IFLAGS_$(CC)) -I${MKLROOT}/include

<content is partially neglected>

serial: $(C_OBJ)
        $(CC)  $(IFLAGS) $(CFLAGS) -c msg_ser.c
        $(CC)  $(IFLAGS) -o dplbe $(C_OBJ) msg_ser.o $(LFLAGS)

Error messages:

lbe.o: In function `equilibrium_distrib':
<content is partially neglected>lbe.c:548: undefined reference to `__pgi_uacc_multicorestart'
<content is partially neglected>lbe.c:583: undefined reference to `__pgi_uacc_multicoreend'
makefile:57: recipe for target 'serial' failed
make: *** [serial] Error 2

lbe.c, where I added only one line as a first step toward using OpenACC:

#include "header.h"
extern int    max_x, max_y, max_z;
extern int    num_x, x_min, x_max;
extern int    num_proc, n_proc;
extern double tau[2], tau_v[2];

<content is partially neglected>

void equilibrium_distrib(int xy, int z, double ***velcs_df, double dt,
     struct vector forceDen,  struct vector *correctedVel, double *f_eq)
{

<content is partially neglected>

#pragma acc kernels
  {
  for(int q=0; q < 19; q++)
  {
    double term1 = (c_x[q] * correctedVel->x + c_y[q] * correctedVel->y + 
                    c_z[q] * correctedVel->z)*3.;
    double term2 = 0.5*term1*term1;
    f_eq[q] = weight[q]*density*(1 + term1 + term2 - term3); 
  }
  }  
}

Messages from compiling lbe.c to lbe.o:

pgcc-Warning--Mprof=time is not supported

PGC-I-0222-Redundant definition for symbol __THROW (/usr/include/x86_64-linux-gnu/sys/cdefs.h: 74)
PGC-I-0222-Redundant definition for symbol __extension__ (/usr/include/x86_64-linux-gnu/sys/cdefs.h: 358)
lbe_zcol:

<content is partially neglected>

equilibrium_distrib:
    558, FMA (fused multiply-add) instruction(s) generated
    559, FMA (fused multiply-add) instruction(s) generated
    560, FMA (fused multiply-add) instruction(s) generated
    565, FMA (fused multiply-add) instruction(s) generated
    566, FMA (fused multiply-add) instruction(s) generated
    567, FMA (fused multiply-add) instruction(s) generated
    573, FMA (fused multiply-add) instruction(s) generated
    577, Loop is parallelizable
         Generating Multicore code
        577, #pragma acc loop gang
    580, FMA (fused multiply-add) instruction(s) generated

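You are most likely missing "-ta=multicore" on your link line. The compile step generated calls into the OpenACC multicore runtime (the two __pgi_uacc_multicore symbols), so the link step needs the same flag for the pgcc driver to add that runtime. Try adding the following to your makefile: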

LFLAGS_pgcc = -O0 -Minform=inform -Minfo -ta=multicore -g -pg 

Note that the "-Mprof" flag is no longer supported, so it should be removed.
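
As a sketch of that change applied to the makefile in the question: one way to keep the two steps in sync is to hold the OpenACC target flag in a single variable and reference it from both the compile and the link flags. ACCFLAGS is a hypothetical name introduced here for illustration; everything else comes from the question's makefile, with -Mprof=time dropped.

```makefile
# Sketch only: ACCFLAGS is a hypothetical helper variable, not part of the
# original makefile.
CC = pgcc

# Keep the OpenACC target flag in one place so that the compile and link
# steps cannot drift apart.
ACCFLAGS = -ta=multicore

# Per-compiler flags, selected below through the computed name $(..._$(CC)).
CFLAGS_pgcc = -O0 -Minform=inform -Minfo $(ACCFLAGS) -g -pg
LFLAGS_pgcc = $(ACCFLAGS) -g -pg

CFLAGS = $(CFLAGS_$(CC)) -c
LFLAGS = $(LFLAGS_$(CC)) -L${MKLROOT}/lib/intel64 -lmkl_rt -lpthread -lm -ldl
IFLAGS = $(IFLAGS_$(CC)) -I${MKLROOT}/include
```

With this in place, the existing link recipe that ends in $(LFLAGS) would pass -ta=multicore to pgcc, which should let the driver link the runtime that provides the two missing symbols.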

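Thank you very much for the helpful hint. I went back to check my makefile and found that it was actually rather messy, or at least outdated. The "Variables Used by Implicit Rules" in the old makefile were not set up quite correctly. That is why the pgcc compiler and linker did not work properly.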

Here is my newly written makefile, which is cleaner and tidier:

CC = pgcc
CFLAGS = -I${MKLROOT}/include -O0 -Minform=inform -Minfo -ta=multicore -g -pg
LDFLAGS = -L${MKLROOT}/lib/intel64
LDLIBS = -lmkl_rt -lpthread -lm -ldl

C_OBJ = main.o driver.o update.o lbe_update.o \
    bnodes.o bnodes_init.o bnodes_dp.o implicit_force.o lbe.o modes_write.o \
    lub.o velcs_update.o hs3d.o n_list.o objects_init.o objects_map.o \
    clusters.o cluster_force.o cluster_update.o cj_grad.o \
    global_sums.o utils.o output.o \
    init_sphere.o ran_num.o get_forces.o verlet_update.o aggregation.o \
    jacobi_eigenvalue.o


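# Neither target names a real file, so mark both as phony.
.PHONY: clean serial

clean:
	rm -f *.o dplbe

# Compile only: linker flags and libraries are not needed here.
%.o : %.c
	$(CC) $(CFLAGS) -c $< -o $@

# CFLAGS is passed at link time too, so -ta=multicore reaches the linker.
serial: $(C_OBJ)
	$(CC) $(CFLAGS) -c msg_ser.c
	$(CC) $(CFLAGS) -o dplbe $(C_OBJ) msg_ser.o $(LDFLAGS) $(LDLIBS)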
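
For comparison, a makefile can also rely on GNU make's built-in rules instead of a hand-written %.o pattern. The built-in compile rule runs $(CC) $(CPPFLAGS) $(CFLAGS) -c, while the built-in link rule uses $(LDFLAGS) and $(LDLIBS) and does not pass $(CFLAGS), so in that style -ta=multicore would have to appear in LDFLAGS as well. The following is a minimal sketch under that assumption; ACCFLAGS is a hypothetical variable name, and the object list is abbreviated to three files named in this thread.

```makefile
# Sketch only: ACCFLAGS is a hypothetical variable, and C_OBJ is abbreviated.
CC       = pgcc
ACCFLAGS = -ta=multicore

CPPFLAGS = -I${MKLROOT}/include
CFLAGS   = -O0 -Minform=inform -Minfo $(ACCFLAGS) -g -pg
# The accelerator flag is repeated here because CFLAGS is not used for linking.
LDFLAGS  = $(ACCFLAGS) -pg -L${MKLROOT}/lib/intel64
LDLIBS   = -lmkl_rt -lpthread -lm -ldl

# Abbreviated: the real list would contain every object from C_OBJ above.
C_OBJ = main.o lbe.o msg_ser.o

# The objects are built by make's built-in %.o: %.c rule; only the link step
# is spelled out.
dplbe: $(C_OBJ)
	$(CC) $(LDFLAGS) $^ $(LDLIBS) -o $@
```

Keeping include paths in CPPFLAGS, compiler options in CFLAGS, and library paths and libraries in LDFLAGS and LDLIBS matches the roles that make's built-in rules assume for those variables.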