身份 Fortran DGEMM 上的 Lapack 矩阵乘法
Lapack Matrix multiplication on identity Fortran DGEMM
我遇到了一个问题,通过 Lapack 在单位矩阵上乘法非空矩阵给我一个空矩阵。所有矩阵都带有正元素
Dimensions of the matrices:
W = M1*N1
D = N1*N1
M = M1*M1
D M -单位矩阵
我尝试做的是获得 D*W'*M 的乘法,其中 W' 是 W 的转置。这是使用 DGEMM 操作的 Fortran 代码
PROGRAM SIRT
DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:) :: W, D, M, C, MAIN
DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:) :: b, x
DOUBLE PRECISION :: pho
INTEGER, PARAMETER :: M1 = 27000
INTEGER, PARAMETER :: N1 = 1000
INTEGER, PARAMETER :: num_iterations = 200
INTEGER :: i, j, k
allocate(W(1:M1,1:N1))
allocate(D(1:N1,1:N1))
allocate(M(1:M1,1:M1))
allocate(C(1:N1,1:M1))
allocate(MAIN(1:N1,1:M1))
allocate(b(1:M1))
allocate(x(1:N1))
D = 0
M = 0
DO i=1,N1
D(i,i) = 1
END DO
DO i=1,M1
M(i,i) = 1
END DO
OPEN(UNIT=11, FILE="Wmpi.txt")
DO i = 1,M1
READ(11,*) (W(i,j),j=1,N1)
END DO
print *,ANY(W>0)
CLOSE (11, STATUS='KEEP')
OPEN(UNIT=11, FILE="bmpi.txt")
DO i = 1,M1
READ(11,*) b(i)
END DO
CLOSE (11, STATUS='KEEP')
CALL DGEMM('N', 'T', N1, N1, N1, 1.0, D, N1, W, N1, 0.0, C, N1)
print *,ANY(C>0)
CALL DGEMM('N', 'N', N1, M1, M1, 1.0, C, N1, M, M1, 0.0, MAIN, N1)
print *,ANY(MAIN>0)
pho = DLANGE('F', N1, M1, C, N1, x)
END PROGRAM SIRT
顺序打印的答案是True,True,False。所以第一个乘法有效,我得到的不是空矩阵,但在第二个中所有元素都是 0.
我知道我不需要对单位矩阵进行矩阵乘法,但我想弄清楚如果我这样做会出现什么问题。
另一个问题,我可以在没有临时矩阵 Main 和 C 的情况下提高内存效率吗?
编辑
想出在所有元素都为空的第一次乘法后下载结果矩阵。无法理解为什么 ANY(C>0) 在第二阶段为真。
首先请注意 DGEMM 是 BLAS 库的一部分,而不是 LAPACK - 后者是更高级别的库。
您在调用 DGEMM 时遇到了一些问题
- 真正的常量是有种类的,你必须提供正确的种类。在
特别是 DGEMM 期望在老式 Fortran 中被称为
双精度。您提供的常量是默认类型的实数。
这会导致错误,我强烈建议提供
如果您的程序需要精度,则对所有实常数都友好
不是默认精度
- 在前三个整数中,前两个总是形状
结果矩阵。 C 是 n1xm1,因此更改为第一次调用
DGEMM
- 矩阵的前导维度在 99.999% 的情况下是第一个
尺寸为 allocated/declared - 它与
完全没有数学,这纯粹是为了让 DGEMM 可以计算出矩阵是怎样的
布置在记忆中。在第一个 DGEMM
中 W 是错误的
我还建议,由于使用 MATUL 内在函数可以很容易地正确检查结果,而不是像原来那样半心半意地检查结果,并且避免使用单位矩阵进行测试,除非你真的需要单位矩阵,因为高对称性和大量零点很容易掩盖错误。
将所有这些放在一起,删除我们不相关的部分并将您的程序修改为稍微更现代的形式我得到
Program sirt
Integer, Parameter :: wp = Kind( 1.0d0 )
Real( wp ),Allocatable,Dimension(:,:) :: w, d, m, c, main, main_compare
Real( wp ),Allocatable,Dimension(:) :: b, x
! Change problem size to something manageable on my laptop
!!$ Integer, Parameter :: m1 = 27000
!!$ Integer, Parameter :: n1 = 1000
Integer, Parameter :: m1 = 2700
Integer, Parameter :: n1 = 100
!!$ Integer :: i
Allocate(w(1:m1,1:n1))
Allocate(d(1:n1,1:n1))
Allocate(m(1:m1,1:m1))
Allocate(c(1:n1,1:m1))
Allocate(main(1:n1,1:m1))
Allocate(main_compare(1:n1,1:m1))
Allocate(b(1:m1))
Allocate(x(1:n1))
d = 0.0_wp
m = 0.0_wp
! Don't trust unit matrices for tests, too much symmetry, too many zeros - use Random numbers
!!$ Do i=1,n1
!!$ d(i,i) = 1.0_wp
!!$ End Do
Call Random_number( d )
!!$
!!$ Do i=1,m1
!!$ m(i,i) = 1.0_wp
!!$ End Do
Call Random_number( m )
! Don't have the file - use random numbers
!!$ open(unit=11, file="wmpi.txt")
!!$ do i = 1,m1
!!$ read(11,*) (w(i,j),j=1,n1)
!!$ end do
!!$ print *,any(w>0)
!!$ close (11, status='keep')
Call random_Number( w )
! 1) Real constant have kind, and you must provide the correct kind
! 2) Of the first three constant the first two are always the shape
! of the result matrix. C is n1xm1, hece the change
! 3) The leading dimension of a matrix is 99.999% of the time
! the first dimension as allocated/declared -it has nothing
! to do with the maths at all. It was wrong for W in the
! first matmul
! 4) DGEMM is part of the BLAS library - not LAPACK
!!$ CALL DGEMM('N', 'T', N1, N1, N1, 1.0, D, N1, W, N1, 0.0, C, N1)
!!$ CALL DGEMM('N', 'N', N1, M1, M1, 1.0, C, N1, M, M1, 0.0, MAIN, N1)
Call dgemm('n', 't', n1, m1, n1, 1.0_wp, d, n1, w, m1, 0.0_wp, c, n1)
Call dgemm('n', 'n', n1, m1, m1, 1.0_wp, c, n1, m, m1, 0.0_wp, main, n1)
! 5) Don't do half hearted checks on the results when proper checks are easy
main_compare = Matmul( Matmul( d, Transpose( w ) ), m )
Write( *, * ) 'Max error ', Maxval( Abs( main - main_compare ) )
! Check we haven't somehow managed all zeros in both matrices ...
Write( *, * ) main( 1:3, 1 )
Write( *, * ) main_compare( 1:3, 1 )
End Program sirt
ian@eris:~/work/stack$ gfortran-8 -std=f2008 -fcheck=all -Wall -Wextra -pedantic -O matmul.f90 -lblas
/usr/bin/ld: warning: libgfortran.so.4, needed by //usr/lib/x86_64-linux-gnu/libopenblas.so.0, may conflict with libgfortran.so.5
ian@eris:~/work/stack$ ./a.out
Max error 3.6379788070917130E-011
38576.055405987529 33186.640731082334 33818.909332709263
38576.055405987536 33186.640731082334 33818.909332709263
ian@eris:~/work/stack$ ./a.out
Max error 2.9103830456733704E-011
34303.739077708480 34227.623080598998 34987.143088270866
34303.739077708473 34227.623080598998 34987.143088270859
ian@eris:~/work/stack$ ./a.out
Max error 3.2741809263825417E-011
35968.603030053979 34778.110740682620 32732.657800858586
35968.603030053971 34778.110740682612 32732.657800858586
ian@eris:~/work/stack$ ./a.out
Max error 2.9103830456733704E-011
31575.076511213174 35879.913361891951 35278.030249048912
31575.076511213178 35879.913361891951 35278.030249048912
我遇到了一个问题,通过 Lapack 在单位矩阵上乘法非空矩阵给我一个空矩阵。所有矩阵都带有正元素
Dimensions of the matrices:
W = M1*N1
D = N1*N1
M = M1*M1
D M -单位矩阵
我尝试做的是获得 D*W'*M 的乘法,其中 W' 是 W 的转置。这是使用 DGEMM 操作的 Fortran 代码
PROGRAM SIRT
DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:,:) :: W, D, M, C, MAIN
DOUBLE PRECISION,ALLOCATABLE,DIMENSION(:) :: b, x
DOUBLE PRECISION :: pho
INTEGER, PARAMETER :: M1 = 27000
INTEGER, PARAMETER :: N1 = 1000
INTEGER, PARAMETER :: num_iterations = 200
INTEGER :: i, j, k
allocate(W(1:M1,1:N1))
allocate(D(1:N1,1:N1))
allocate(M(1:M1,1:M1))
allocate(C(1:N1,1:M1))
allocate(MAIN(1:N1,1:M1))
allocate(b(1:M1))
allocate(x(1:N1))
D = 0
M = 0
DO i=1,N1
D(i,i) = 1
END DO
DO i=1,M1
M(i,i) = 1
END DO
OPEN(UNIT=11, FILE="Wmpi.txt")
DO i = 1,M1
READ(11,*) (W(i,j),j=1,N1)
END DO
print *,ANY(W>0)
CLOSE (11, STATUS='KEEP')
OPEN(UNIT=11, FILE="bmpi.txt")
DO i = 1,M1
READ(11,*) b(i)
END DO
CLOSE (11, STATUS='KEEP')
CALL DGEMM('N', 'T', N1, N1, N1, 1.0, D, N1, W, N1, 0.0, C, N1)
print *,ANY(C>0)
CALL DGEMM('N', 'N', N1, M1, M1, 1.0, C, N1, M, M1, 0.0, MAIN, N1)
print *,ANY(MAIN>0)
pho = DLANGE('F', N1, M1, C, N1, x)
END PROGRAM SIRT
顺序打印的答案是True,True,False。所以第一个乘法有效,我得到的不是空矩阵,但在第二个中所有元素都是 0.
我知道我不需要对单位矩阵进行矩阵乘法,但我想弄清楚如果我这样做会出现什么问题。
另一个问题,我可以在没有临时矩阵 Main 和 C 的情况下提高内存效率吗?
编辑
想出在所有元素都为空的第一次乘法后下载结果矩阵。无法理解为什么 ANY(C>0) 在第二阶段为真。
首先请注意 DGEMM 是 BLAS 库的一部分,而不是 LAPACK - 后者是更高级别的库。
您在调用 DGEMM 时遇到了一些问题
- 真正的常量是有种类的,你必须提供正确的种类。在 特别是 DGEMM 期望在老式 Fortran 中被称为 双精度。您提供的常量是默认类型的实数。 这会导致错误,我强烈建议提供 如果您的程序需要精度,则对所有实常数都友好 不是默认精度
- 在前三个整数中,前两个总是形状 结果矩阵。 C 是 n1xm1,因此更改为第一次调用 DGEMM
- 矩阵的前导维度在 99.999% 的情况下是第一个 尺寸为 allocated/declared - 它与 完全没有数学,这纯粹是为了让 DGEMM 可以计算出矩阵是怎样的 布置在记忆中。在第一个 DGEMM 中 W 是错误的
我还建议,由于使用 MATUL 内在函数可以很容易地正确检查结果,而不是像原来那样半心半意地检查结果,并且避免使用单位矩阵进行测试,除非你真的需要单位矩阵,因为高对称性和大量零点很容易掩盖错误。
将所有这些放在一起,删除我们不相关的部分并将您的程序修改为稍微更现代的形式我得到
Program sirt
Integer, Parameter :: wp = Kind( 1.0d0 )
Real( wp ),Allocatable,Dimension(:,:) :: w, d, m, c, main, main_compare
Real( wp ),Allocatable,Dimension(:) :: b, x
! Change problem size to something manageable on my laptop
!!$ Integer, Parameter :: m1 = 27000
!!$ Integer, Parameter :: n1 = 1000
Integer, Parameter :: m1 = 2700
Integer, Parameter :: n1 = 100
!!$ Integer :: i
Allocate(w(1:m1,1:n1))
Allocate(d(1:n1,1:n1))
Allocate(m(1:m1,1:m1))
Allocate(c(1:n1,1:m1))
Allocate(main(1:n1,1:m1))
Allocate(main_compare(1:n1,1:m1))
Allocate(b(1:m1))
Allocate(x(1:n1))
d = 0.0_wp
m = 0.0_wp
! Don't trust unit matrices for tests, too much symmetry, too many zeros - use Random numbers
!!$ Do i=1,n1
!!$ d(i,i) = 1.0_wp
!!$ End Do
Call Random_number( d )
!!$
!!$ Do i=1,m1
!!$ m(i,i) = 1.0_wp
!!$ End Do
Call Random_number( m )
! Don't have the file - use random numbers
!!$ open(unit=11, file="wmpi.txt")
!!$ do i = 1,m1
!!$ read(11,*) (w(i,j),j=1,n1)
!!$ end do
!!$ print *,any(w>0)
!!$ close (11, status='keep')
Call random_Number( w )
! 1) Real constant have kind, and you must provide the correct kind
! 2) Of the first three constant the first two are always the shape
! of the result matrix. C is n1xm1, hece the change
! 3) The leading dimension of a matrix is 99.999% of the time
! the first dimension as allocated/declared -it has nothing
! to do with the maths at all. It was wrong for W in the
! first matmul
! 4) DGEMM is part of the BLAS library - not LAPACK
!!$ CALL DGEMM('N', 'T', N1, N1, N1, 1.0, D, N1, W, N1, 0.0, C, N1)
!!$ CALL DGEMM('N', 'N', N1, M1, M1, 1.0, C, N1, M, M1, 0.0, MAIN, N1)
Call dgemm('n', 't', n1, m1, n1, 1.0_wp, d, n1, w, m1, 0.0_wp, c, n1)
Call dgemm('n', 'n', n1, m1, m1, 1.0_wp, c, n1, m, m1, 0.0_wp, main, n1)
! 5) Don't do half hearted checks on the results when proper checks are easy
main_compare = Matmul( Matmul( d, Transpose( w ) ), m )
Write( *, * ) 'Max error ', Maxval( Abs( main - main_compare ) )
! Check we haven't somehow managed all zeros in both matrices ...
Write( *, * ) main( 1:3, 1 )
Write( *, * ) main_compare( 1:3, 1 )
End Program sirt
ian@eris:~/work/stack$ gfortran-8 -std=f2008 -fcheck=all -Wall -Wextra -pedantic -O matmul.f90 -lblas
/usr/bin/ld: warning: libgfortran.so.4, needed by //usr/lib/x86_64-linux-gnu/libopenblas.so.0, may conflict with libgfortran.so.5
ian@eris:~/work/stack$ ./a.out
Max error 3.6379788070917130E-011
38576.055405987529 33186.640731082334 33818.909332709263
38576.055405987536 33186.640731082334 33818.909332709263
ian@eris:~/work/stack$ ./a.out
Max error 2.9103830456733704E-011
34303.739077708480 34227.623080598998 34987.143088270866
34303.739077708473 34227.623080598998 34987.143088270859
ian@eris:~/work/stack$ ./a.out
Max error 3.2741809263825417E-011
35968.603030053979 34778.110740682620 32732.657800858586
35968.603030053971 34778.110740682612 32732.657800858586
ian@eris:~/work/stack$ ./a.out
Max error 2.9103830456733704E-011
31575.076511213174 35879.913361891951 35278.030249048912
31575.076511213178 35879.913361891951 35278.030249048912