Switching to MPI_LONG_LONG_INT crashes?
I have the following code, in which the values from all processes are gathered into nbodiesPerProc:
int nBodies = 10;
std::vector<int> nbodiesPerProc(m_processes);
int err = MPI_Allgather(&nBodies,1,MPI_INT,&nbodiesPerProc[0], 1, MPI_INT, m_comm);
ASSERTMPIERROR(err, "gather");
As soon as I change the code to use MPI_LONG_LONG_INT, it starts crashing:
std::size_t nBodies = 10;
static_assert( sizeof(std::size_t) == 8, "We send an 64bit integer");
std::vector<std::size_t> nbodiesPerProc(m_processes);
int err = MPI_Allgather(&nBodies,1,MPI_LONG_LONG_INT,&nbodiesPerProc[0]
,1, MPI_LONG_LONG_INT, m_comm);
ASSERTMPIERROR(err, "gather");
Does anyone have an idea? Do I need to register MPI_LONG_LONG_INT?
The crash:
[zfmgpu:17069] Signal: Segmentation fault (11)
[zfmgpu:17069] Signal code: Address not mapped (1)
[zfmgpu:17069] Failing at address: 0x10
[zfmgpu:17070] *** Process received signal ***
[zfmgpu:17070] Signal: Segmentation fault (11)
[zfmgpu:17070] Signal code: Address not mapped (1)
[zfmgpu:17070] Failing at address: 0x18
[zfmgpu:17067] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7fe4626a6340]
[zfmgpu:17067] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x981c0) [0x7fe4609641c0]
[zfmgpu:17067] [ 2] /usr/lib/libmpi.so.1(+0x10362d) [0x7fe46162d62d]
[zfmgpu:17067] [ 3] /usr/lib/libmpi.so.1(ompi_datatype_sndrcv+0x502) [0x7fe46158e392]
[zfmgpu:17067] [ 4] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_allgather_intra_recursivedoubling+0x91) [0x7fe459aa0081]
[zfmgpu:17067] [ 5] /usr/lib/libmpi.so.1(PMPI_Allgather+0x179) [0x7fe46158f0c9]
Update: MPI_UNSIGNED_LONG_LONG, which would be the correct 64-bit type, does not help either.
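As a side note on picking the datatype: MPI's predefined datatypes correspond to C types, so one way to avoid guessing is to derive the datatype from the C++ element type. The sketch below is only an illustration. `mpiTypeNameFor` is a hypothetical helper, and it returns the datatype name as a string so that it compiles without an MPI installation; real code would return the matching MPI_Datatype handle instead.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical helper (not part of MPI): names the predefined MPI datatype
// that corresponds to the C++ integer type T. Only a few types are covered.
template <class T>
std::string mpiTypeNameFor() {
    static_assert(std::is_integral<T>::value, "integer types only");
    if (std::is_same<T, int>::value)                return "MPI_INT";
    if (std::is_same<T, unsigned>::value)           return "MPI_UNSIGNED";
    if (std::is_same<T, unsigned long>::value)      return "MPI_UNSIGNED_LONG";
    if (std::is_same<T, long long>::value)          return "MPI_LONG_LONG_INT";
    if (std::is_same<T, unsigned long long>::value) return "MPI_UNSIGNED_LONG_LONG";
    return "unsupported";  // assumption: callers treat this as an error
}
```

Since std::size_t is always unsigned, `mpiTypeNameFor<std::size_t>()` never yields MPI_LONG_LONG_INT. Which unsigned type it maps to depends on the platform.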
Found the error: m_processes was 0, so the receive vector was empty (stupid).
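With an empty vector, `&nbodiesPerProc[0]` is undefined behaviour and does not point at usable storage, which fits the low failing addresses in the trace. A small guard in front of the call would turn this into a readable error. The following is a minimal sketch, not the original code: `checkedRecvBuffer` is a hypothetical helper, and `commSize` is assumed to be the communicator size.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns a pointer usable as the receive buffer of an
// allgather with countPerRank elements per rank, or throws if the vector is
// too small to hold commSize * countPerRank elements.
template <class T>
T* checkedRecvBuffer(std::vector<T>& buf, int commSize, int countPerRank = 1) {
    if (commSize <= 0 || countPerRank <= 0) {
        throw std::invalid_argument("communicator size and count must be positive");
    }
    const std::size_t needed =
        static_cast<std::size_t>(commSize) * static_cast<std::size_t>(countPerRank);
    if (buf.size() < needed) {
        throw std::length_error("receive buffer is smaller than the communicator requires");
    }
    assert(!buf.empty());  // follows from needed >= 1
    return buf.data();     // well-defined, unlike &buf[0] on an empty vector
}
```

In the code above, the size would come from MPI_Comm_size(m_comm, &commSize), and the result of `checkedRecvBuffer(nbodiesPerProc, commSize)` would replace `&nbodiesPerProc[0]` in the MPI_Allgather call. With m_processes == 0 the helper throws instead of letting MPI write through an invalid pointer.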