Order of summation in MPI-reduce operations


Consider the MPI function MPI_Reduce called with the MPI_SUM operation.

#include <mpi.h>
int MPI_Reduce(const void *sendbuf, void *recvbuf, int count,
               MPI_Datatype datatype, MPI_Op op, int root,
               MPI_Comm comm)


This is what I found in the documentation:


The "canonical" evaluation order of a reduction is determined by the ranks of the processes in the group. However, the implementation can take advantage of associativity, or associativity and commutativity, in order to change the order of evaluation.


The actual standard provides some further insight:

Advice to implementors. It is strongly recommended that MPI_REDUCE be implemented so that the same result be obtained whenever the function is applied on the same arguments, appearing in the same order. Note that this may prevent optimizations that take advantage of the physical location of ranks. (End of advice to implementors.)


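So if you have the same number of ranks with the same physical placement across nodes and cores on every run, you might expect the same result each time (although, as the quotes above show, the standard does not guarantee this).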

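In practice, on shared-use HPC systems you usually do not get exactly the same placement from run to run, so the reduction order often differs. Because floating-point addition is not associative, a different order of the reduction operations shows up as small differences in the result.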
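
To make the effect concrete without needing an MPI installation, here is a minimal sketch in plain C. It does not call MPI at all: the two functions are hypothetical helpers (names are my own) that add the same values in two different orders, one mimicking the canonical rank order and one mimicking a binary-tree reduction. Whether a given MPI library actually uses either order is an assumption; the point is only that the order matters.

```c
#include <stddef.h>

/* Canonical rank order: ((x[0] + x[1]) + x[2]) + ...
 * x[i] stands in for the value contributed by rank i (illustrative only). */
double sum_in_rank_order(const double *x, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += x[i];
    return total;
}

/* Binary-tree order: (sum of first half) + (sum of second half), recursively.
 * This is one possible reordering; real implementations may choose others. */
double sum_as_tree(const double *x, size_t n)
{
    if (n == 0)
        return 0.0;
    if (n == 1)
        return x[0];
    size_t half = n / 2;
    return sum_as_tree(x, half) + sum_as_tree(x + half, n - half);
}
```

For the four values 1e16, 1.0, -1e16, 1.0 (assuming IEEE 754 double arithmetic with round-to-nearest and no extended intermediate precision), sum_in_rank_order returns 1.0 and sum_as_tree returns 0.0, while the exact mathematical sum is 2.0. Real data rarely differs this dramatically, but the same mechanism produces the small run-to-run differences described above.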
