分配矩阵时出现 Fortran Seg Fault

Fortran Seg Fault when assigning Matrices

[更新] 更改了代码和几句话以反映我在第二条评论中解释的实现。代码应该用下面的行编译,但是,我有一个旧的 gfortran,可能没有看到你可能会看到的一些错误。

gfortran BLU_implementation_copy.f90 -o BLU_implementation_copy.x

我在 运行 使用 Fortran 90 时遇到难以置信的不一致段错误。

我的代码的总体 objective 是将一个随机生成的、复杂的、对称的、对角线占优的矩阵分解成可以轻松并行化的块。精简版(相关功能除外)见文末

我 运行在函数 mat2tiles 中将原始矩阵分解为块(块)时遇到错误。具体来说,这行代码一直失败:

o(i,j)=cell(x(a:min(i*Nblock,M), b:min(j*Nblock, N)))

这是在索引 i,j 处将输入矩阵 X 的一个块分配给矩阵 o。它实现了这一点,因为 o 的每个索引都是通过派生类型的矩阵 NblockxNblock:

type cell complex*16 :: block(Nblock,Nblock) end type cell

type(cell) :: o(M/Nblock+rem,N/Nblock+rem)

对于某些矩阵大小和图块大小,一切都完美无缺。对于较大的尺寸,错误开始出现在接近矩阵尺寸的图块尺寸中,并最终达到矩阵尺寸一半的尺寸。例如,我发现发生这种情况的最小实例是一个 20x20 矩阵,其图块大小为 18。但这不会一直发生。如果我尝试通常 运行 较小的矩阵并建立到那个大小,它 运行s。如果我做一个更大的尺寸来分割故障,然后 运行 20x20 和 18,它会分割故障。我发现的最小一致参数是 25x25,图块大小为 23。即使是这个大小,如果我在 o(i,j) 分配行之后打印出它失败的块:

print*, o(2,1)%block

它突然 运行 毫无问题地完成了整个过程。取出打印语句使其再次出现段错误。 最后,最烦人的一个[已修复 - 参考我的第二条评论],如果我做一个 2000x2000 的矩阵,瓦片大小为 1000 或更大,我在进入函数后立即遇到段错误(这意味着它发生在变量分配)。这使我相信问题可能源于矩阵的分配方式,尤其是因为我使用的是派生类型。但是当我尝试诊断矩阵的大小和内容时,一切似乎都很正常。

program BLU_implementation
    implicit none
    integer :: rem
    integer, parameter :: M=20, N=20, Nblock=19
    real*8 :: start, finish
    complex*16 :: A(M,N)

    type cell
    complex*16 :: block(Nblock,Nblock)
    end type cell

!determines if cell matrix doesn't need to be bigger
    rem=1
    if (modulo(M,Nblock)==0) then
        rem=0
    endif

    call cpu_time(start)
    call functionCalling(A, M, N, Nblock, rem)
    call cpu_time(finish)
    print*, 'overall time:', finish-start, 'seconds'

contains
!==================================================================================================================
subroutine functionCalling(A, M, N, Nblock, rem)
    implicit none
    integer :: IPIV, INFO, M, N, Nblock, rem
    real*8 :: start, finish
    complex*16 :: A(M,N)

    type(cell) :: C(M/Nblock+rem,N/Nblock+rem)

    call cpu_time(start)

    A= CSPDmatrixFill(A,M,N)

    call cpu_time(finish)
    print*, 'matrix fill time:', finish-start, 'seconds'


    call cpu_time(start)

    C= mat2tiles(A,M,N,Nblock,rem) 

    call cpu_time(finish)
    print*, 'tiling time:', finish-start, 'seconds'

end subroutine

!===================================================================================================================
! generates a complex, symmetric, positive-definite matrix (based off of another's code)

function CSPDmatrixFill(A, M, N) result (Matrix)

! initialization  
      implicit none
      integer :: i, j
      integer :: M, N
      real*8 :: x, xi
      complex*16 :: A(M, N), Matrix(M, N), EYE(M, N), MT(N, M)

      EYE=0
      forall(j=1:M) EYE(j,j)=1

! execution

      call random_seed

      do i=1, M
        do j=1, N
          call random_number (x)
            call random_number(xi)
            Matrix(i,j) = cmplx(x,xi)
       end do
      end do  

! construct a symmetric matrix ( O(n^2) )
      call Mtranspose(Matrix, M, N)
      Matrix = Matrix+MT  

! make positive definite (diagonally dominant)
        Matrix = Matrix + N*EYE

end function CSPDmatrixFill

!======================================================================================================
subroutine Mtranspose(A, i, j)
! takes a matrix and the two parameters used to make the matrix: A(i,j)
! returns a matrix with switched indices: A(j,i)
      implicit none

      integer :: i, j
      complex*16 :: A(i,j), MT(j,i)

      MT=A(j,i)
      return
end subroutine Mtranspose

!=======================================================================================================
!MAT2TILES - breaks up an array into a cell array of adjacent sub-arrays of equal sizes
!
! O=mat2tiles(X,M,N,Nblock)
!
!will produce a cell array o containing adjacent chunks of the array X(M,N)
!with each chunk of dimensions NblockxNblock. If Nblock does
!not divide evenly into size(X,i), then the chunks at the upper boundary of
!X along dimension i will have bogus values that do not affect the factorization
!in the places where the matrix doesn't occupy. (according to older versions. Might have changed with some edits)
!
function mat2tiles(X,M,N,Nblock,rem) result(o)

! initialization  
      implicit none
      integer :: a, b, i, j, M, N, Nblock, rem
      complex*16 :: X(M,N)

      type(cell) :: o(M/Nblock+rem,N/Nblock+rem)

! diagnostic print statements
        print*,size(o(1,1)%block), size(o(1,2)%block), size(o(2,1)%block), size(o(2,2)%block)
        print*, 'got to start'

! turn matrix x into cell matrix o

      do j=1, N/Nblock+rem
        if (j==1) then
          b=j
        else 
          b=b+Nblock
        endif       
        do i=1, M/Nblock+rem
          if (i==1) then
            a=i
          else 
            a=a+Nblock
          endif
! diagnostic print statement
          print*, 'writing to o: i:', i, 'j:', j, 'i*Nblock:', i*Nblock, 'j*Nblock:', j*Nblock, 'min of i:', min(i*Nblock, M), &
                    'min of j:', min(j*Nblock, N), 'a:', a, 'b:', b

          o(i,j)=cell(x(a:min(i*Nblock,M), b:min(j*Nblock, N))) 
        enddo
      enddo
! diagnostic print statement
      print*, 'got to end'
      return 
end function mat2tiles

!==================================================================================================
end program

在将问题缩小到使用数字与相同数字的变量之后,发现 Fort运行 不喜欢将矩阵分配给不同维度的矩阵,即使它适合内部。这很奇怪,因为 MNNblock 的较小值完美地解决了这个问题。

解决方案是简单地为 i=1j=1 定义 o(i,j)%block(1:Nblock,1:Nblock)=x(dim1:dim2,etc) 而不是 o(i,j)=cell(x(dim1:dim2,etc) 并且对 [=18= 的所有情况下的每个实例稍作修改] 和 j.

正确的代码(对于矩阵大小和分块大小的所有情况)如下所示:

if (i==M/Nblock+rem .AND. j==N/Nblock+rem .AND. rem==1) then 
    o(i,j)%block(1:M-(i-1)*Nblock,1:N-(j-1)*Nblock)=x(a:min(i*Nblock,M), b:min(j*Nblock, N))
else if (i==M/Nblock+rem .AND. j/=N/Nblock+rem .AND. rem==1) then
    o(i,j)%block(1:M-(i-1)*Nblock,1:Nblock)=x(a:min(i*Nblock,M), b:min(j*Nblock, N))
else if (i/=M/Nblock+rem .AND. j==N/Nblock+rem .AND. rem==1) then
    o(i,j)%block(1:Nblock,1:N-(j-1)*Nblock)=x(a:min(i*Nblock,M), b:min(j*Nblock, N))
else
    o(i,j)%block(1:Nblock,1:Nblock)=x(a:min(i*Nblock,M), b:min(j*Nblock, N))
end if

通过此更正,我的代码可以在新旧版本的 gfort运行 上运行。然而,有趣的是,Mac OS X 版本的 gfort运行 从未 运行 进入此段错误,只有 Linux 版本。