MPI segfault in MPI_Comm_rank() using brew, git, or built-in MPI

I can't get MPI to work on my MacBook Pro. Specifically, the program segfaults when I call MPI_Comm_rank(). Here is a sample program:

#include "mpi.h"
#include <iostream>

int main(int argc, char *argv[]) {
    std::cout << "Entered main\n";

    // Initialize parallel
    int rank, numProcess;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    std::cout << "Init\n";
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    std::cout << "Rank\n";
    MPI_Comm_size(MPI_COMM_WORLD, &numProcess);

    std::cout << "Initialized MPI; rank " << rank << "\n";

    MPI_Finalize();
    return 0;
}

It compiles fine with mpic++ mpi_test.cpp -o mpi_test, but when I run it with mpirun -np 2 ./mpi_test I get the following error:

Entered main
Entered main
Init
Init
[Bens-MacBook-Pro:22004] *** Process received signal ***
[Bens-MacBook-Pro:22004] Signal: Segmentation fault: 11 (11)
[Bens-MacBook-Pro:22004] Signal code: Address not mapped (1)
[Bens-MacBook-Pro:22004] Failing at address: 0x10000004c
[Bens-MacBook-Pro:22004] [ 0] 0   libsystem_platform.dylib            0x00007fffc0118b3a _sigtramp + 26
[Bens-MacBook-Pro:22004] [ 1] 0   ???                                 0x0000000113f92978 0x0 + 4630063480
[Bens-MacBook-Pro:22004] [ 2] 0   mpi_test                            0x00000001078e40d1 main + 81
[Bens-MacBook-Pro:22004] [ 3] 0   libdyld.dylib                       0x00007fffbff09235 start + 1
[Bens-MacBook-Pro:22004] *** End of error message ***
[Bens-MacBook-Pro:22005] *** Process received signal ***
[Bens-MacBook-Pro:22005] Signal: Segmentation fault: 11 (11)
[Bens-MacBook-Pro:22005] Signal code: Address not mapped (1)
[Bens-MacBook-Pro:22005] Failing at address: 0x10000004c
[Bens-MacBook-Pro:22005] [ 0] 0   libsystem_platform.dylib            0x00007fffc0118b3a _sigtramp + 26
[Bens-MacBook-Pro:22005] [ 1] 0   ???                                 0x000000004fc26c50 0x0 + 1338141776
[Bens-MacBook-Pro:22005] [ 2] 0   mpi_test                            0x000000010ffd90d1 main + 81
[Bens-MacBook-Pro:22005] [ 3] 0   libdyld.dylib                       0x00007fffbff09235 start + 1
[Bens-MacBook-Pro:22005] [ 4] 0   ???                                 0x0000000000000001 0x0 + 1
[Bens-MacBook-Pro:22005] *** End of error message ***
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node Bens-MacBook-Pro exited on signal 11 (Segmentation fault: 11).
--------------------------------------------------------------------------

Note that it gets past MPI_Init() but fails in MPI_Comm_rank().

I get the same error both when I (i) download the latest Open MPI source from git and install it directly, and (ii) install Open MPI through brew.

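The problem seems to have been caused by conflicting Open MPI versions on $PATH. When I reset $PATH and used only the brew install to compile and run, everything worked. I then uninstalled and removed every MPI library from my machine, reinstalled the brew version, and linked it. Everything works now. Posting in case anyone else runs into this.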
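
One quick way to check for this kind of conflict is to list every copy of the MPI tools on $PATH, in the order the shell searches them. The following is a minimal sketch, not something from the original setup: the helper name list_on_path is hypothetical, and it assumes only a POSIX-style shell with tr available.

```shell
# Hypothetical helper: print every executable called "$1" found on $PATH,
# in search order. The first line printed is the copy the shell would run.
list_on_path() {
    name=$1
    printf '%s\n' "$PATH" | tr ':' '\n' | while IFS= read -r dir; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
}

# Check the compiler wrapper and the launcher used in the question.
for tool in mpic++ mpirun; do
    echo "== $tool =="
    list_on_path "$tool"
done
```

If either tool prints more than one path, the compiler wrapper and the launcher may come from different installs. With Open MPI, mpic++ --showme prints the underlying compiler command, including its include and library directories, and on macOS otool -L ./mpi_test lists the dynamic libraries the binary links against. Both should point to the same install as the mpirun that appears first on $PATH.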