Subdividing a ScaLAPACK grid
I am trying to compute the eigenvalue spectra of many large matrices with ScaLAPACK. Rather than distributing each matrix across all 32 processes, I would prefer to distribute each matrix across 4 processes and compute 8 matrices in parallel. I know how to subdivide an MPI grid with MPI_Comm_split, but it seems that ScaLAPACK does not take custom communicators. Instead it appears to use a BLACS grid, which is rooted in PVM.
How can I implement this kind of subdivision in ScaLAPACK?
This is done via BLACS and the grid setup.
The reference functions are
- BLACS_GRIDINIT( ICONTXT, ORDER, NPROW, NPCOL )
- BLACS_GRIDMAP( ICONTXT, USERMAP, LDUMAP, NPROW, NPCOL )
The documentation for these routines states:
These routines take the available processes, and assign, or map, them into a BLACS process grid.
Each BLACS grid is contained in a context (its own message passing universe), so that it does not interfere with distributed operations which occur within other grids/contexts.
These grid creation routines may be called repeatedly in order to define additional contexts/grids.
This means that you can create 8 different grids and pass each ICONTXT to the ScaLAPACK routines for its matrix.
Both of them take an IN/OUT argument
ICONTXT
(input/output) INTEGER
On input, an integer handle indicating the system context to be used in creating the BLACS context. The user may obtain a default system context via a call to BLACS_GET. On output, the integer handle to the created BLACS context.
You can use these contexts recursively in the same way.
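A minimal sketch of this pattern (my own illustration, not taken from the documentation quoted above), assuming the program is started on exactly four MPI ranks so that all of them belong to the single 2 x 2 grid being created:

    program subgrid_sketch
        use mpi
        implicit none
        integer :: ierr, sys_ctxt, sub_ctxt, usermap(2,2)

        call MPI_Init(ierr)

        ! default system context from which all grids are derived
        call BLACS_GET(0, 0, sys_ctxt)

        ! map MPI ranks 0..3 into one 2 x 2 process grid (column-major fill)
        usermap = reshape([0, 1, 2, 3], [2, 2])
        sub_ctxt = sys_ctxt                   ! on input: the system context
        call BLACS_GRIDMAP(sub_ctxt, usermap, 2, 2, 2)
        ! on output sub_ctxt is the grid handle these four processes pass to
        ! ScaLAPACK routines; repeating BLACS_GRIDMAP with other rank maps
        ! creates further independent grids/contexts

        call BLACS_GRIDEXIT(sub_ctxt)         ! release the grid when done
        call MPI_Finalize(ierr)
    end program subgrid_sketch

With 32 ranks you would repeat the BLACS_GRIDMAP call eight times, once per rank map, which is exactly what the program in the answer below does.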
I implemented @ztik's suggestion, and this is what I came up with. It seems to work:
program main
    use mpi
    implicit none
    integer :: ierr, me, nProcs, color, i, j, k, my_comm, dims(2), global_contxt
    integer :: cnt, n_colors, map(2,2)
    integer, allocatable :: contxts(:)
    integer, parameter :: group_size = 4

    call MPI_Init(ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nProcs, ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, me, ierr)

    color = get_color(group_size)
    n_colors = nProcs / group_size
    allocate(contxts(n_colors))
    dims = calc_2d_dim(group_size)

    ! default system context, shared by all sub-grids
    call BLACS_GET(0, 0, global_contxt)
    if(me == 0) write (*,*) global_contxt
    contxts = global_contxt

    do k = 1, n_colors
        ! first global rank belonging to this context
        cnt = group_size * (k-1)
        if(me == 0) write (*,*) "##############", cnt

        ! build the 2x2 rank map for this sub-grid
        do i = 1, 2
            do j = 1, 2
                map(i,j) = cnt
                cnt = cnt + 1
            enddo
        enddo

        ! every process calls BLACS_GRIDMAP for every sub-grid;
        ! on output contxts(k) is the grid handle for the mapped ranks
        call BLACS_GRIDMAP(contxts(k), map, 2, 2, 2)

        ! ordered printing of the context handles
        do i = 0, nProcs-1
            if(i == me) then
                write (*,*) me, contxts(k)
            endif
            call MPI_Barrier(MPI_COMM_WORLD, ierr)
        enddo
    enddo

    call MPI_Finalize(ierr)

contains

    ! group index of the calling rank: ranks 0..group_size-1 get color 0, etc.
    function get_color(group_size) result(color)
        implicit none
        integer, intent(in) :: group_size
        integer :: me, nProcs, color, ierr, i

        call MPI_Comm_size(MPI_COMM_WORLD, nProcs, ierr)
        call MPI_Comm_rank(MPI_COMM_WORLD, me, ierr)

        if(mod(nProcs, group_size) /= 0) then
            write (*,*) "nProcs not divisible by group_size", mod(nProcs, group_size)
            call MPI_Abort(MPI_COMM_WORLD, 0, ierr)
        endif

        color = 0
        do i = 1, me
            if(mod(i, group_size) == 0) then
                color = color + 1
            endif
        enddo
    end function get_color

    ! closest-to-square 2D factorisation of sz, e.g. 4 -> (2,2), 8 -> (4,2)
    function calc_2d_dim(sz) result(dim)
        implicit none
        integer, intent(in) :: sz
        integer :: dim(2), cand

        cand = nint(sqrt(real(sz)))
        do while(mod(sz, cand) /= 0)
            cand = cand - 1
        enddo
        dim(1) = sz/cand
        dim(2) = cand
    end function calc_2d_dim

end program main
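What the program above does not show is the step that follows: each rank then uses only the context of its own color group, contxts(color+1), when calling ScaLAPACK. A hedged sketch of that step for one real symmetric matrix, using the standard ScaLAPACK routines NUMROC, DESCINIT, and PDSYEV; the routine name diag_on_subgrid, the block size nb, and the matrix fill are placeholders of my own, not part of the program above:

    subroutine diag_on_subgrid(ictxt, n, nb)
        implicit none
        integer, intent(in) :: ictxt    ! sub-grid context, e.g. contxts(color+1)
        integer, intent(in) :: n, nb    ! global matrix order and block size
        integer :: nprow, npcol, myrow, mycol, np_loc, nq_loc, lwork, info
        integer :: desc_a(9), desc_z(9)
        integer, external :: NUMROC
        double precision, allocatable :: a(:,:), z(:,:), w(:), work(:)

        ! ranks outside this grid see myrow = -1 (should not happen when
        ! each rank passes the context of its own color group)
        call BLACS_GRIDINFO(ictxt, nprow, npcol, myrow, mycol)
        if(myrow < 0) return

        ! local dimensions of the block-cyclically distributed matrix
        np_loc = NUMROC(n, nb, myrow, 0, nprow)
        nq_loc = NUMROC(n, nb, mycol, 0, npcol)
        allocate(a(max(1,np_loc), max(1,nq_loc)))
        allocate(z(max(1,np_loc), max(1,nq_loc)), w(n))

        call DESCINIT(desc_a, n, n, nb, nb, 0, 0, ictxt, max(1,np_loc), info)
        call DESCINIT(desc_z, n, n, nb, nb, 0, 0, ictxt, max(1,np_loc), info)

        a = 0.0d0    ! placeholder: fill the local blocks of the matrix here

        ! workspace query followed by the actual eigensolve
        allocate(work(1))
        call PDSYEV('V', 'U', n, a, 1, 1, desc_a, w, z, 1, 1, desc_z, work, -1, info)
        lwork = int(work(1))
        deallocate(work)
        allocate(work(lwork))
        call PDSYEV('V', 'U', n, a, 1, 1, desc_a, w, z, 1, 1, desc_z, work, lwork, info)
        ! w(1:n) now holds the eigenvalues, z the distributed eigenvectors
    end subroutine diag_on_subgrid

Because the contexts are independent message-passing universes, the eight groups can run their eigensolves at the same time without interfering with each other.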