MPI_Gather gives seg fault in the most basic code
I am working on a larger program in which I am having trouble with MPI_Gather.
I wrote a minimal example, shown below.
program test
use MPI
integer :: ierr, rank, size
double precision, allocatable, dimension(:) :: send, recv
call MPI_Init(ierr)
call MPI_Comm_size(MPI_COMM_WORLD, size, ierr)
if (ierr /= 0) print *, 'Error in MPI_Comm_size'
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
if (ierr /= 0) print *, 'Error in MPI_Comm_size'
allocate(send(1), recv(size))
send(1) = rank
call MPI_Gather(send, 1, MPI_DOUBLE_PRECISION, &
recv, 1, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD)
print *, recv
call MPI_Finalize(ierr)
end program test
When I run it (on 2 nodes) I get the following error output.
[jorvik:13887] *** Process received signal ***
[jorvik:13887] Signal: Segmentation fault (11)
[jorvik:13887] Signal code: Address not mapped (1)
[jorvik:13887] Failing at address: (nil)
[jorvik:13888] *** Process received signal ***
[jorvik:13888] Signal: Segmentation fault (11)
[jorvik:13888] Signal code: Address not mapped (1)
[jorvik:13888] Failing at address: (nil)
[jorvik:13887] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36150) [0x7f6ab77f8150]
[jorvik:13887] [ 1] /usr/lib/libmpi_f77.so.0(PMPI_GATHER+0x12d) [0x7f6ab7ebca9d]
[jorvik:13887] [ 2] ./test() [0x4011a3]
[jorvik:13887] [ 3] ./test(main+0x34) [0x401283]
[jorvik:13887] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f6ab77e376d]
[jorvik:13887] [ 5] ./test() [0x400d59]
[jorvik:13887] *** End of error message ***
[jorvik:13888] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36150) [0x7f0ca067d150]
[jorvik:13888] [ 1] /usr/lib/libmpi_f77.so.0(PMPI_GATHER+0x12d) [0x7f0ca0d41a9d]
[jorvik:13888] [ 2] ./test() [0x4011a3]
[jorvik:13888] [ 3] ./test(main+0x34) [0x401283]
[jorvik:13888] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f0ca066876d]
[jorvik:13888] [ 5] ./test() [0x400d59]
[jorvik:13888] *** End of error message ***
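What am I doing wrong? MPI is definitely installed and working on the machine I am using.
The biggest problem is that you did not include the final argument ierr in the call to MPI_Gather. The documentation says: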
All MPI routines in Fortran (except for MPI_WTIME and MPI_WTICK) have an additional argument ierr at the end of the argument list.
Beyond that, my advice is to always stick to good practice: do not use the names of intrinsic functions, such as size, for your variables.
program test
use MPI
integer :: ierr, rank, nProc
double precision, allocatable, dimension(:) :: send, recv
call MPI_Init(ierr)
if (ierr /= 0) print *, 'Error in MPI_Init'
call MPI_Comm_size(MPI_COMM_WORLD, nProc, ierr)
if (ierr /= 0) print *, 'Error in MPI_Comm_size'
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
if (ierr /= 0) print *, 'Error in MPI_Comm_rank'
allocate(send(1), recv(nProc))
send(1) = rank
call MPI_Gather(send, 1, MPI_DOUBLE_PRECISION, &
recv, 1, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)
if (ierr /= 0) print *, 'Error in MPI_Gather'
print *, recv
call MPI_Finalize(ierr)
end program test
You forgot to add the return error code as the last argument in the call to MPI_Gather. The value of the return code is being written to an unmapped address.
It should be
call MPI_Gather(send, 1, MPI_DOUBLE_PRECISION, &
recv, 1, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)
ifort catches this at compile time. It looks like your compiler (gfortran?) does not.
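If you would rather have this kind of mistake caught at compile time regardless of the compiler, one option is the Fortran 2008 bindings. The following is only a sketch, and it assumes your MPI library implements MPI-3 and ships the mpi_f08 module (older installations may not). The program name gather_f08 is made up for illustration. In mpi_f08 every routine has an explicit interface, so a wrong argument count or type is rejected by the compiler, and ierr becomes an optional argument, so leaving it out is legal rather than a crash.

```fortran
program gather_f08
  ! Assumption: the MPI library provides the MPI-3 mpi_f08 module.
  use mpi_f08
  implicit none
  integer :: ierr, rank, nProc
  double precision, allocatable, dimension(:) :: send, recv

  call MPI_Init(ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nProc, ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  allocate(send(1), recv(nProc))
  send(1) = rank

  ! Explicit interfaces let the compiler check this call.
  ! ierr is optional in mpi_f08, but it is passed here so it can be tested.
  call MPI_Gather(send, 1, MPI_DOUBLE_PRECISION, &
                  recv, 1, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)
  if (ierr /= MPI_SUCCESS) print *, 'Error in MPI_Gather'

  ! The receive buffer is only filled on the root rank.
  if (rank == 0) print *, recv

  call MPI_Finalize(ierr)
end program gather_f08
```

Build and run it with your MPI wrappers, for example mpifort and mpirun -np 2. Only rank 0 prints here, because the contents of recv are not defined on the other ranks after the gather.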