MPI Send is giving segmentation fault

I am trying to run a genetic algorithm with MPI (Boost), where I have to send a serialized object from rank 0 to all the other ranks. But when I try to send the data I get a segmentation fault.

Here are the code, the error, and the output I get.

Code: the problem is right at world.send(0, 0, newP);

int main (int argc, char** argv) 
{
    Population *pop = NULL;
    RuckSack r(true);
    int size, rank;
    Ga ga;
    namespace mpi = boost::mpi;
    mpi::environment env;
    mpi::communicator world;

    int countGeneration = 0;

    /* code */

    if (world.rank() == 0)
    {

        if (pop == NULL)
        {

            pop = new Population(60,true);
        }

    }

    for (int m = 0; m < 20; m++)
    {
        /* code */

        for (int i = 0; i< world.size(); i++)
        {
            world.send(i,0,pop);
        }


        world.recv(0, 0, pop);
        Population newP = *pop;


        newP = ga.evolvePopulation(newP, world.size());




        world.send(0, 0, newP);
    }

    MPI_Finalize();

    return (EXIT_SUCCESS);
}

Error:

mpirun noticed that process rank 0 with PID 10336 on node user exited on signal 11 (Segmentation fault).

Output:

[user:10336] *** Process received signal ***
[user:10336] Signal: Segmentation fault (11)
[user:10336] Signal code: Address not mapped (1)
[user:10336] Failing at address: 0x31
[user:10336] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x35860)[0x7f1e93064860]
[user:10336] [ 1] /usr/lib/x86_64-linux-gnu/libboost_serialization.so.1.61.0(+0x14a24)[0x7f1e9409da24]
[user:10336] [ 2] /usr/lib/x86_64-linux-gnu/libboost_serialization.so.1.61.0(+0x15d11)[0x7f1e9409ed11]
[user:10336] [ 3] ./teste(+0x1de7c)[0x55ab4c07ae7c]
[user:10336] [ 4] ./teste(+0x1dd2c)[0x55ab4c07ad2c]
[user:10336] [ 5] ./teste(+0x1db3a)[0x55ab4c07ab3a]
[user:10336] [ 6] ./teste(+0x1d8eb)[0x55ab4c07a8eb]
[user:10336] [ 7] ./teste(+0x1d2da)[0x55ab4c07a2da]
[user:10336] [ 8] ./teste(+0x1cb20)[0x55ab4c079b20]
[user:10336] [ 9] ./teste(+0x1bed0)[0x55ab4c078ed0]
[user:10336] [10] ./teste(+0x1b47c)[0x55ab4c07847c]
[user:10336] [11] ./teste(+0x19741)[0x55ab4c076741]
[user:10336] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f1e9304f3f1]
[user:10336] [13] ./teste(+0x112aa)[0x55ab4c06e2aa]
[user:10336] *** End of error message ***

Here are a few bold guesses:

  1. You should execute the initial send only on the rank 0 process. Right now you execute it on every process, which makes no sense (and is probably the cause of the problem).
  2. You should not send to "self". In the first iteration of your loop, rank 0 sends to itself, which, afaik, will block the process while it waits for the recv. But since rank 0 is blocked, it never reaches the 'recv' line and will stay locked forever. Besides that, it likewise makes no sense for a process to send data to itself.

These are only loose suggestions, since my experience with MPI is limited. I hope they help!