Why do I have to call MPI.Finalize() inside the destructor?
I am currently trying to understand mpi4py. I set mpi4py.rc.initialize = False and mpi4py.rc.finalize = False because I can't see why we would want automatic initialization and finalization. The default behavior is that MPI.Init() is called when MPI is imported. I assume the reason is that one instance of the Python interpreter is running for each rank, and each of these instances runs the whole script, but that is just a guess. In the end, I like to keep things explicit.
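To illustrate what I mean by the default behavior, here is a minimal sketch (MPI.Is_initialized() is part of the mpi4py API):

from mpi4py import MPI        # with the default rc settings, this already calls MPI_Init
print(MPI.Is_initialized())   # prints True: initialization happened at import time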
Now this introduces some problems. I have this code:
import numpy as np
import mpi4py
mpi4py.rc.initialize = False  # do not initialize MPI automatically
mpi4py.rc.finalize = False    # do not finalize MPI automatically
from mpi4py import MPI        # import the 'MPI' module
import h5py


class DataGenerator:
    def __init__(self, filename, N, comm):
        self.comm = comm
        self.file = h5py.File(filename, 'w', driver="mpio", comm=comm)

        # Create datasets
        self.data_ds = self.file.create_dataset("indices", (N, 1), dtype='i')

    def __del__(self):
        self.file.close()


if __name__ == '__main__':
    MPI.Init()

    world = MPI.COMM_WORLD
    world_rank = MPI.COMM_WORLD.rank

    filename = "test.hdf5"
    N = 10
    data_gen = DataGenerator(filename, N, comm=world)

    MPI.Finalize()
This results in:
$ mpiexec -n 4 python test.py
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01559] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01560] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01557] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was:

  Process name: [[15050,1],3]
  Exit code:    1
--------------------------------------------------------------------------
I am a bit confused about what is going on here. If I move MPI.Finalize() to the end of the destructor, it works fine.
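For reference, this is the variant I mean, with the finalization moved into the destructor (a sketch of the working version):

class DataGenerator:
    # ... __init__ as above ...
    def __del__(self):
        self.file.close()  # the internal barrier still works: MPI is not finalized yet
        MPI.Finalize()     # finalize only after the file has been closed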
Note that I am also using h5py, which uses MPI for its parallelization, so I have parallel file I/O here. Note that h5py needs to be compiled with MPI support. You can easily do that by setting up a virtual environment and running pip install --no-binary=h5py h5py.
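A quick way to verify that such an h5py build actually has MPI support is h5py's own config flag:

import h5py
print(h5py.get_config().mpi)  # True only if h5py was built against parallel HDF5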
The way you wrote it, data_gen lives until the main function returns. But you call MPI.Finalize within that function, so the destructor runs after finalization. The h5py.File.close method seems to call MPI.Comm.Barrier internally, and calling that after finalization is forbidden.
If you want explicit control, make sure all objects are destroyed before you call MPI.Finalize. Of course, even that may not be enough if some objects are destroyed only by the garbage collector rather than by the reference counter.
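For instance, a minimal sketch of that idea (under CPython, dropping the last reference triggers the destructor immediately via reference counting):

data_gen = DataGenerator(filename, N, comm=world)
# ... work with data_gen ...
del data_gen    # last reference dropped: __del__ runs here under CPython
MPI.Finalize()  # safe now, the file was closed before finalization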
To avoid this, use a context manager instead of a destructor:
class DataGenerator:
    def __init__(self, filename, N, comm):
        self.comm = comm
        self.file = h5py.File(filename, 'w', driver="mpio", comm=comm)

        # Create datasets
        self.data_ds = self.file.create_dataset("indices", (N, 1), dtype='i')

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.file.close()


if __name__ == '__main__':
    MPI.Init()

    world = MPI.COMM_WORLD
    world_rank = MPI.COMM_WORLD.rank

    filename = "test.hdf5"
    N = 10
    with DataGenerator(filename, N, comm=world) as data_gen:
        pass

    MPI.Finalize()
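As a side note (my addition, beyond what the question asked): h5py.File itself supports the context-manager protocol, so for a case this small you could even skip the wrapper class entirely:

with h5py.File(filename, 'w', driver="mpio", comm=world) as f:
    f.create_dataset("indices", (N, 1), dtype='i')
# the file is closed here, while MPI is still initialized
MPI.Finalize()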