Problems installing OpenMPI locally for use with CUDA

Short version

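I have CUDA code that I need to run locally, so I am trying to install OpenMPI following the OpenMPI directions. When I try to make my code, I receive very long error output, similar to what is described by the OpenMPI documentation. I tried reinstalling OpenMPI with the fix the documentation suggests, but now I get these errors during installation: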

Making all in tools/ompi_info
make[2]: Entering directory '/home/hatfull/Downloads/openmpi-2.1.1/ompi/tools/ompi_info'
  CC       ompi_info.o
  CC       param.o
  CCLD     ompi_info
ld: warning: libimf.so, needed by ../../../ompi/.libs/libmpi.so, not found (try using -rpath or -rpath-link)
ld: warning: libsvml.so, needed by ../../../ompi/.libs/libmpi.so, not found (try using -rpath or -rpath-link)
ld: warning: libirng.so, needed by ../../../ompi/.libs/libmpi.so, not found (try using -rpath or -rpath-link)
ld: warning: libintlc.so.5, needed by ../../../ompi/.libs/libmpi.so, not found (try using -rpath or -rpath-link)
ld: .libs/ompi_info: hidden symbol `__intel_cpu_features_init_x' in /opt/intel/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin/libirc.a(cpu_feature_disp.o) is referenced by DSO
ld: final link failed: Bad value
Makefile:1785: recipe for target 'ompi_info' failed
make[2]: *** [ompi_info] Error 1
make[2]: Leaving directory '/home/hatfull/Downloads/openmpi-2.1.1/ompi/tools/ompi_info'
Makefile:3353: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/hatfull/Downloads/openmpi-2.1.1/ompi'
Makefile:1806: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1

What am I doing wrong? Please help! S.O.S.!

Long version

I have CUDA code that I need to run locally, so I am trying to install OpenMPI following the OpenMPI directions. I stored the tarball as ~/Downloads/openmpi-2.1.1.tar.gz. Hence, I ran

$gunzip -c openmpi-2.1.1.tar.gz | tar xf -
$cd openmpi-2.1.1
$./configure --prefix=/opt/openmpi-2.1.1 &> configure_log1
$sudo make all install &> install_log_take1

successfully. Note that I changed --prefix=/usr/local from the directions to --prefix=/opt/openmpi-2.1.1. I have included configure_log1 and install_log_take1.

When I try to make my code using my makefile, makefile.ulfhednar,

$make -f makefile.ulfhednar clean
$make -f makefile.ulfhednar &> make_log1

I receive very long error output in make_log1 that looks similar to what is described by the OpenMPI documentation. It says the solution is to install OpenMPI with the configure options "./configure CC=icc CXX=icpc F77=ifort FC=ifort ...", so I reinstalled using the commands,

$cd ~/Downloads/openmpi-2.1.1
$sudo make uninstall
$sudo rm -r /opt/openmpi-2.1.1
$cd ..
$sudo rm -r openmpi-2.1.1

$gunzip -c openmpi-2.1.1.tar.gz | tar xf -
$cd openmpi-2.1.1

$which icc
/opt/intel/compilers_and_libraries_2017.4.196/linux/bin/intel64/icc
$which icpc
/opt/intel/compilers_and_libraries_2017.4.196/linux/bin/intel64/icpc
$which ifort
/opt/intel/compilers_and_libraries_2017.4.196/linux/bin/intel64/ifort

$./configure --prefix=/opt/openmpi-2.1.1 CC=/opt/intel/compilers_and_libraries_2017.4.196/linux/bin/intel64/icc CXX=/opt/intel/compilers_and_libraries_2017.4.196/linux/bin/intel64/icpc F77=/opt/intel/compilers_and_libraries_2017.4.196/linux/bin/intel64/ifort FC=/opt/intel/compilers_and_libraries_2017.4.196/linux/bin/intel64/ifort &> configure_log2
$sudo make all install &> install_log_take2

Here are configure_log2 and install_log_take2. Of note in install_log_take2 are the following lines:

Making all in tools/ompi_info
make[2]: Entering directory '/home/hatfull/Downloads/openmpi-2.1.1/ompi/tools/ompi_info'
  CC       ompi_info.o
  CC       param.o
  CCLD     ompi_info
ld: warning: libimf.so, needed by ../../../ompi/.libs/libmpi.so, not found (try using -rpath or -rpath-link)
ld: warning: libsvml.so, needed by ../../../ompi/.libs/libmpi.so, not found (try using -rpath or -rpath-link)
ld: warning: libirng.so, needed by ../../../ompi/.libs/libmpi.so, not found (try using -rpath or -rpath-link)
ld: warning: libintlc.so.5, needed by ../../../ompi/.libs/libmpi.so, not found (try using -rpath or -rpath-link)
ld: .libs/ompi_info: hidden symbol `__intel_cpu_features_init_x' in /opt/intel/compilers_and_libraries_2017.4.196/linux/compiler/lib/intel64_lin/libirc.a(cpu_feature_disp.o) is referenced by DSO
ld: final link failed: Bad value
Makefile:1785: recipe for target 'ompi_info' failed
make[2]: *** [ompi_info] Error 1
make[2]: Leaving directory '/home/hatfull/Downloads/openmpi-2.1.1/ompi/tools/ompi_info'
Makefile:3353: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/hatfull/Downloads/openmpi-2.1.1/ompi'
Makefile:1806: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1

Sorry, I had to upload all of the log files to mediafire; pastebin would not accept them because they are too large.

What am I doing wrong? Please help! S.O.S.!

I found the solution!!!

I logged in as root and ran through the installation steps without using sudo.

#gunzip -c openmpi-2.1.1.tar.gz | tar xf -
#cd openmpi-2.1.1
#./configure --prefix=/opt/openmpi-2.1.1 CC=icc CXX=icpc FC=ifort
#make all install

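I was trying to use OpenMPI with the Intel Composer compilers, and that combination is what led to the permissions issue. I followed the installation instructions here, but ran into the same problem as before when trying to install. The problem was that the linker ld "could not find" the correct libraries, because when sudo is invoked those libraries are no longer available through the $LD_LIBRARY_PATH variable. The only way I found to avoid this was to log in as the root user and set root's $LD_LIBRARY_PATH variable to match that of my normal user.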
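To confirm this diagnosis on another machine, a small check like the one below may help. It is only a sketch, not part of the OpenMPI or Intel tooling: the function names find_lib and check_intel_libs are made up for illustration, and the library list is copied from the ld warnings in the log above. It only tests whether each library file exists in one of the directories of a colon-separated path such as $LD_LIBRARY_PATH.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory in the colon-separated
# list $2 that contains a file named $1; return 1 if none does.
find_lib() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Hypothetical helper: report, for each library named in the ld warnings,
# whether it is reachable through the path list given as $1.
# Returns 0 only if all four libraries are found.
check_intel_libs() {
    rc=0
    for lib in libimf.so libsvml.so libirng.so libintlc.so.5; do
        if found=$(find_lib "$lib" "$1"); then
            echo "$lib: found in $found"
        else
            echo "$lib: NOT found"
            rc=1
        fi
    done
    return $rc
}
```

Calling check_intel_libs "$LD_LIBRARY_PATH" once in the normal user shell and once in a root shell should show the difference: if the root shell reports "NOT found" for these libraries, the link step run from that shell would be expected to fail in the way shown in the log.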

I had been stuck on this problem, and now that it is solved I am jumping for joy! I hope this helps others in the future.