An unsafe MPI non-blocking communication example?

Question

I am implementing MPI non-blocking communication in my program. The MPI_Isend man page says:

A nonblocking send call indicates that the system may start copying data out of the send buffer. The sender should not modify any part of the send buffer after a nonblocking send operation is called, until the send completes.

My code works like this:

// send messages
if(s > 0){

    MPI_Request s_requests[s];
    MPI_Status  s_status[s];

    for(int i = 0; i < s; ++i){

        // some code to form the message to send
        // (send_info goes out of scope at the end of each iteration)
        std::vector<double> send_info;

        // non-blocking send
        MPI_Isend(&send_info[0], ..., &s_requests[i]);
    }

    MPI_Waitall(s, s_requests, s_status);
}

// recv info
if(n > 0){    // s and n will match

    for(int i = 0; i < n; ++i){

        MPI_Status status;

        // allocate the space to recv info
        std::vector<double> recv_info;

        MPI_Recv(&recv_info[0], ..., &status);
    }

}

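My question is: am I modifying the send buffers, since they live inside the inner curly braces (the send_info vector is destroyed at the end of each loop iteration)? Does that make this an unsafe way to communicate? My program runs fine for now, but I still have doubts. Thanks for your reply.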

Answer 1

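There are two points I want to emphasize in this example.

The first is the problem I asked about: the send buffer is modified before MPI_Waitall. The reason is what Gilles said. Each send_info vector is destroyed at the end of its loop iteration, so its memory can be freed while the matching MPI_Isend is still pending. The fix is either to allocate one large buffer before the for loop and call MPI_Waitall after the loop finishes, or to put MPI_Wait inside the loop. The latter, however, is equivalent to using MPI_Send in terms of performance.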
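The first option can be sketched without any MPI calls, since the point is only where the data lives. The following is a minimal sketch, assuming each message is a list of doubles; `PackedMessages` and `pack_messages` are hypothetical names introduced here for illustration and are not part of MPI or of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: all outgoing payloads stored back to back.
struct PackedMessages {
    std::vector<double> data;          // every message, concatenated
    std::vector<std::size_t> offsets;  // message i occupies [offsets[i], offsets[i + 1])
};

// Hypothetical helper: copy the per-destination messages into one buffer
// that is sized once and not touched again afterwards.
PackedMessages pack_messages(const std::vector<std::vector<double>>& messages) {
    PackedMessages packed;
    packed.offsets.reserve(messages.size() + 1);

    std::size_t total = 0;
    for (const auto& m : messages) {
        packed.offsets.push_back(total);
        total += m.size();
    }
    packed.offsets.push_back(total);  // end marker

    packed.data.reserve(total);
    for (const auto& m : messages) {
        packed.data.insert(packed.data.end(), m.begin(), m.end());
    }
    assert(packed.data.size() == total);
    return packed;
}
```

In the send loop, message i would then start at `packed.data.data() + packed.offsets[i]` and hold `packed.offsets[i + 1] - packed.offsets[i]` doubles. The `packed` object has to be declared before the loop and left untouched until MPI_Waitall returns.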

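However, I found that simply switching to blocking sends and receives makes this communication scheme deadlock. It resembles the classic deadlock: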

if (rank == 0) {
    MPI_Send(..., 1, tag, MPI_COMM_WORLD);
    MPI_Recv(..., 1, tag, MPI_COMM_WORLD, &status);
} else if (rank == 1) {
    MPI_Send(..., 0, tag, MPI_COMM_WORLD);
    MPI_Recv(..., 0, tag, MPI_COMM_WORLD, &status);
}

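An explanation can be found here.

My program could end up in a similar situation: every process calls MPI_Send first, and since MPI_Send is allowed to block until the matching receive has been posted, no process ever reaches its MPI_Recv and the program deadlocks.

So my solution is to use a large buffer and stick with the non-blocking communication scheme.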

#include <mpi.h>
#include <vector>
#include <unordered_map>

// send messages
if(s > 0){

    MPI_Request s_requests[s];
    MPI_Status  s_status[s];

    // declared outside the loop, so every buffer stays alive until MPI_Waitall returns
    std::unordered_map<int, std::vector<double>> send_info;

    for(int i = 0; i < s; ++i){

        // some code to form the message to send
        send_info[i] = std::vector<double> ();

        // non-blocking send
        MPI_Isend(&send_info[i][0], ..., &s_requests[i]);
    }

    MPI_Waitall(s, s_requests, s_status);
}

// recv info
if(n > 0){    // s and n will match

    for(int i = 0; i < n; ++i){

        MPI_Status status;

        // allocate the space to recv info
        std::vector<double> recv_info;

        MPI_Recv(&recv_info[0], ..., &status);
    }

}
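One reason the version above is safe is that a std::unordered_map never moves its elements when it grows, and a std::vector keeps its storage as long as it is not modified. The address handed to MPI_Isend for message i therefore stays valid while later messages are added. The sketch below checks that property without MPI. The function name `buffers_stay_put` and the payload size of 4 are made up for illustration.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical check: record the address of each buffer at the moment it
// would be passed to MPI_Isend, then verify that no address or payload
// changed after all s buffers were created.
bool buffers_stay_put(int s) {
    std::unordered_map<int, std::vector<double>> send_info;
    std::vector<const double*> posted;  // addresses a send would have captured

    for (int i = 0; i < s; ++i) {
        send_info[i] = std::vector<double>(4, static_cast<double>(i));
        posted.push_back(send_info[i].data());
    }

    for (int i = 0; i < s; ++i) {
        const std::vector<double>& buf = send_info.at(i);
        if (buf.data() != posted[i]) return false;         // buffer moved
        if (buf[0] != static_cast<double>(i)) return false; // payload changed
    }
    return true;
}
```

The guarantee only holds while each stored vector is left alone: resizing or reassigning `send_info[i]` before MPI_Waitall returns would invalidate the pointer again.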
