如何使用 mpi4py 创建用于在节点之间传递的结构

Question

我正在使用 mpi4py 来并行化我的代码。我想在节点之间传递两条数据，一个整数和一个实数。我还想使用更快的数组和大写 Send 和 Recv 函数。阅读一些教程，似乎应该可以做到，但我找不到任何示例。这是无效的简单版本：

import numpy
from mpi4py import MPI
comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

dt = numpy.dtype('int,float')
if rank == 0:
    recvBuffr = numpy.zeros(1,dt)
    comm.Recv(recvBuffr, source = MPI.ANY_SOURCE)
    print recvBuffr

else:
    result = rank*1.5
    sendBuffr = numpy.zeros(1,dt)
    sendBuffr[0][0] = rank
    sendBuffr[0][1] = result
    comm.Send(sendBuffr, dest=0)

错误：

Traceback (most recent call last):
  File "mpitest.py", line 10, in <module>
Traceback (most recent call last):
  File "mpitest.py", line 18, in <module>
    comm.Send(sendBuffr, dest=0)
    comm.Recv(recvBuffr, source = MPI.ANY_SOURCE)
  File "MPI/Comm.pyx", line 248, in mpi4py.MPI.Comm.Recv (src/mpi4py.MPI.c:78963)
  File "MPI/Comm.pyx", line 237, in mpi4py.MPI.Comm.Send (src/mpi4py.MPI.c:78765)
  File "MPI/msgbuffer.pxi", line 380, in mpi4py.MPI.message_p2p_recv (src/mpi4py.MPI.c:26730)
  File "MPI/msgbuffer.pxi", line 366, in mpi4py.MPI._p_msg_p2p.for_recv (src/mpi4py.MPI.c:26575)
  File "MPI/msgbuffer.pxi", line 375, in mpi4py.MPI.message_p2p_send (src/mpi4py.MPI.c:26653)
  File "MPI/msgbuffer.pxi", line 358, in mpi4py.MPI._p_msg_p2p.for_send (src/mpi4py.MPI.c:26515)
  File "MPI/msgbuffer.pxi", line 114, in mpi4py.MPI.message_simple (src/mpi4py.MPI.c:23528)
  File "MPI/msgbuffer.pxi", line 114, in mpi4py.MPI.message_simple (src/mpi4py.MPI.c:23528)
  File "MPI/msgbuffer.pxi", line 59, in mpi4py.MPI.message_basic (src/mpi4py.MPI.c:22718)
KeyError: 'T{l:f0:d:f1:}'
  File "MPI/msgbuffer.pxi", line 59, in mpi4py.MPI.message_basic (src/mpi4py.MPI.c:22718)
KeyError: 'T{l:f0:d:f1:}'

我认为这意味着使用 numpy 结构化数组是不够的，我需要使用 MPI 数据类型。我在文档（https://mpi4py.scipy.org/docs/apiref/mpi4py.MPI.Datatype-class.html）上发现有一个函数mpi4py.MPI.Datatype.Create_struct，看起来可能是我想要的，但我不明白如何使用它。文档字符串说：

Create_struct(...)
    Datatype.Create_struct(type cls, blocklengths, displacements, datatypes)

    Create an datatype from a general set of
    block sizes, displacements and datatypes

感谢您的帮助！

Answer 1

如果发送的数据之一是整数，则可以将其作为标签发送。（但是，由于此解决方案仅限于整数，我仍然对我的问题的替代答案非常感兴趣。）

import numpy
from mpi4py import MPI
comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

if rank == 0:
    result = numpy.zeros(1,float)
    status=MPI.Status()
    comm.Recv(result, source = MPI.ANY_SOURCE, status = status, tag = MPI.ANY_TAG)
    print status.Get_tag(), result

else:
    result = numpy.array([rank*1.5,])
    i = 5
    comm.Send(result, dest=0, tag=i)

Answer 2

所以从头开始：

只需使用 python 的元组就可以开始工作，而 MPI4PY 非常方便的 pickling 运算符只需发送一个元组即可完成此操作：

from __future__ import print_function
from  mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

assert size > 1

if rank == 0:
    result = comm.recv(source = MPI.ANY_SOURCE, tag = MPI.ANY_TAG)
    print(result)
elif rank == 1:
    comm.send((1, 3.14), dest = 0)

运行给出

$ mpirun -np 2 python send_tuple.py
(1, 3.14)

但是消息两端的这个 pickling/unpickling 确实需要一些时间，所以一旦一切正常，通过定义结构类型在本机 MPI 中执行此操作肯定是一个可能的优化目标。

为此，您必须知道结构的内存布局，这通常不适用于（比如）元组； MPI4PY 中的大写消息运算符依赖于 numpy，它保证了内存布局。

对于结构数组之类的东西，你可以使用 numpy structured arrays:

>>> a = numpy.zeros(2, dtype=([('int',numpy.int32),('dbl',numpy.float64)]))
>>> a
array([(0, 0.0), (0, 0.0)],
      dtype=[('int', '<i4'), ('dbl', '<f8')])

所以现在我们有一个结构数组，第一个字段被命名为 'int' 并且具有 4 字节整数类型，第二个被命名为 'dbl' 并且具有 8 字节浮点型。

有了这个之后，您就可以开始查询数据布局 - 查找单个结构的大小：

>>> print(a.nbytes/2)
12
>>> print(a.dtype.fields)
mappingproxy({'dbl': (dtype('float64'), 4), 'int': (dtype('int32'), 0)})

首先告诉您类型的范围 - 第一个元素的开始和第二个元素的开始之间的字节数 - 第二个为您提供每个元素的偏移量（以字节为单位）。您需要的结构：

>>> displacements = [a.dtype.fields[field][1] for field in ['int','dbl']]
>>> print(displacements)
[0, 4]

现在您可以开始为该结构创建 MPI 数据类型并像使用 MPI.INT 等一样使用它。剩下的唯一技巧是，在调用 Create_struct 时，您必须将 numpy dtypes 转换为 MPI 数据类型，但这相当简单。以下代码为您提供了一个开始：

#!/usr/bin/env python
from __future__ import print_function
from  mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

assert size > 1

def definetype(field_names, field_dtypes):
    num = 2
    dtypes = list(zip(field_names, field_dtypes))
    a = np.zeros(num, dtype=dtypes)

    struct_size = a.nbytes // num
    offsets = [ a.dtype.fields[field][1] for field in field_names ]

    mpitype_dict = {np.int32:MPI.INT, np.float64:MPI.DOUBLE}  #etc
    field_mpitypes = [mpitype_dict[dtype] for dtype in field_dtypes]

    structtype = MPI.Datatype.Create_struct([1]*len(field_names), offsets, field_mpitypes)
    structtype = structtype.Create_resized(0, struct_size)
    structtype.Commit()
    return structtype


if __name__ == "__main__":
    struct_field_names = ['int', 'dbl']
    struct_field_types = [np.int32, np.float64]
    mytype = definetype(struct_field_names, struct_field_types)
    data = np.zeros(1, dtype=(list(zip(struct_field_names, struct_field_types))))

    if rank == 0:
        comm.Recv([data, mytype], source=1, tag=0)
        print(data)
    elif rank == 1:
        data[0]['int'] = 2
        data[0]['dbl'] = 3.14
        comm.Send([data, mytype], dest=0, tag=0)

运行给出

$ mpirun -np 2 python send_struct.py
[(2, 3.14)]

如何使用 mpi4py 创建用于在节点之间传递的结构

How to create struct for passing between nodes using mpi4py

python

parallel-processing

numpy

mpi

mpi4py