Subroutine inside OpenMP PARALLEL DO - Program Crash
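I have read both "Calling an internal subroutine inside OpenMP region" and "Global Variables in Fortran OpenMP". My understanding (from here) is:
- Variables in the argument list inherit their data-scope attribute from the calling routine.
- COMMON blocks and module variables in Fortran are shared unless declared THREADPRIVATE.
- SAVE variables in Fortran are shared.
- All other local variables are private.
Here is a simplified version of my code: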
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(j,dummy1,dummy2,dummy3,dummy4)
DO j=1,ntotal
dummy1 = 0.0d0
dummy2 = foo(j)
CALL kernel(dummy1,dummy1,dummy2,dummy3,dummy4)
Variable(j) = dummy3 + dummy4
END DO
!$OMP END PARALLEL DO
The subroutine kernel then takes dummy1 and dummy2 as input and returns dummy3 and dummy4. I compile with:
-fopenmp -fno-automatic -fcheck=all
and I get:
Fortran runtime error: Recursive call to nonrecursive procedure 'kernel'
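which, as I understand it (from here), is expected. When I compile without -fcheck, the code sometimes gets through the subroutine call without trouble, but most of the time it crashes without an error message. I assume this is because my subroutine is not thread safe. All the arguments passed to the subroutine should be private and independent for each thread. The trimmed subroutine is as follows: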
SUBROUTINE kernel(r,dx,hsml,w,dwdx)
USE Initial_Parameters
IMPLICIT NONE
! DATA DICTIONARY: DECLARE CALLING PARAMETER TYPES AND DEFINITIONS
REAL(KIND=dp), INTENT(IN) :: r
REAL(KIND=dp), DIMENSION(dim), INTENT(IN) :: dx
REAL(KIND=dp), INTENT(IN) :: hsml
REAL(KIND=dp), INTENT(OUT) :: w
REAL(KIND=dp), DIMENSION(dim), INTENT(OUT):: dwdx
! DATA DICTIONARY: DECLARE LOCAL VARIABLE TYPES AND DEFINITIONS
INTEGER :: i, d
REAL(KIND=dp) :: q, dw
REAL(KIND=dp) :: factor
! Kernel functions are functions of q, the distance between particles
! divided by the smoothing length
q = r/hsml
! Preset the kernel to zero
w = 0.e0
! Preset the derivative of the kernel to zero
DO d=1,dim
dwdx(d) = 0.e0
END DO
IF (skf == 1) THEN
! If the problem is one dimensional then,
IF (dim == 1) THEN
! The coefficient, alpha = factor is given by:
factor = 1.e0/hsml
! If the problem is two dimensional then,
ELSE IF (dim == 2) THEN
! The coefficient, alpha = factor is given by:
factor = 15.e0/(7.e0*pi*hsml*hsml)
! If the problem is three dimensional then,
ELSE IF (dim == 3) THEN
! The coefficient, alpha = factor is given by:
factor = 3.e0/(2.e0*pi*hsml*hsml*hsml)
! If the dimension value is not 1, 2 or 3 then there is a problem.
ELSE
WRITE(*,*)' >>> Error <<< : Wrong dimension: Dim =',dim
STOP
END IF
! Smoothing function for 1st range of q.
IF (q >= 0 .AND. q <= 1.e0) THEN
! The smoothing function is given by:
w = factor * (2./3. - q*q + q*q*q / 2.)
! For each dimension work out the gradient of the smoothing function
DO d = 1, dim
dwdx(d) = factor * (-2.+3./2.*q)/hsml**2 * dx(d)
END DO
! Smoothing function for 2nd range of q.
ELSE IF (q > 1.e0 .AND. q <= 2) THEN
! Smoothing function is equal to:
w = factor * 1.e0/6.e0 * (2.-q)**3
! Gradient of the smoothing function in each dimension.
DO d = 1, dim
dwdx(d) =-factor * 1.e0/6.e0 * 3.*(2.-q)**2/hsml * (dx(d)/r)
END DO
! Smoothing function and gradient for all other values of q is zero.
ELSE
! Smoothing function is equal to:
w=0.
! Gradient of the smoothing function in each dimension.
DO d= 1, dim
dwdx(d) = 0.
END DO
END IF
! Close the IF (skf == 1) block opened above
END IF
END SUBROUTINE kernel
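The local variables should be private, and all the passed arguments are private. The module parameters are shared, but that should be fine. Can you explain why this crashes?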
With -fno-automatic, the local variables in kernel are implicitly SAVEd. As noted here:
Local variables with the SAVE attribute declared in procedures called from a parallel region are implicitly SHARED.
So kernel is indeed not thread safe (as far as I can tell).
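To illustrate the point about -fno-automatic, here is a minimal sketch of a routine that stays thread safe. The names demo_kernel, scaled_square and demo are hypothetical and are not from the question. It relies on the behaviour documented for gfortran that -fno-automatic does not apply to program units marked RECURSIVE, so the local q is kept per call rather than in shared static storage. Dropping -fno-automatic from the compile line should have the same effect without the RECURSIVE keyword.

```fortran
! Sketch only: module, procedure, and variable names are hypothetical.
MODULE demo_kernel
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0d0)
CONTAINS
  ! RECURSIVE exempts this procedure from gfortran's -fno-automatic,
  ! so q is not implicitly SAVEd and each thread gets its own copy.
  RECURSIVE SUBROUTINE scaled_square(r, hsml, w)
    REAL(KIND=dp), INTENT(IN)  :: r, hsml
    REAL(KIND=dp), INTENT(OUT) :: w
    REAL(KIND=dp) :: q   ! plain local: no SAVE, no initializer
    q = r / hsml
    w = q * q
  END SUBROUTINE scaled_square
END MODULE demo_kernel

PROGRAM demo
  USE demo_kernel
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000
  INTEGER :: j
  LOGICAL :: all_match
  REAL(KIND=dp) :: w, res(n)

  !$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(j, w)
  DO j = 1, n
    CALL scaled_square(REAL(j, dp), 2.0_dp, w)
    res(j) = w
  END DO
  !$OMP END PARALLEL DO

  ! Serial check of the parallel results against the same formula.
  all_match = .TRUE.
  DO j = 1, n
    IF (ABS(res(j) - (REAL(j, dp) / 2.0_dp)**2) > 1.0e-12_dp) all_match = .FALSE.
  END DO
  IF (all_match) THEN
    PRINT *, 'results match'
  ELSE
    PRINT *, 'results differ'
  END IF
END PROGRAM demo
```

Built with -fopenmp, the serial check at the end should report matching results on every run, because no state is shared between calls.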
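Also note that in your example you pass dummy1 as both the first and second argument to kernel, but your definition of that routine declares the first argument (r) as a scalar and the second (dx) as an array of length dim. I'm not sure whether this is just an artifact of your minimal example or is also in your real code, but it could cause problems. Do you declare kernel inside a module and then use that module? Doing so generates an explicit interface, which should help catch this sort of thing.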
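As a rough sketch of the module suggestion, the following places a routine in a module so that callers get an explicit interface. The names kernel_mod, grad_demo, ndim and interface_demo are hypothetical stand-ins, and the routine body is a placeholder rather than the real kernel.

```fortran
! Sketch only: names and the routine body are hypothetical placeholders.
MODULE kernel_mod
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0d0)
  INTEGER, PARAMETER :: ndim = 2
CONTAINS
  SUBROUTINE grad_demo(r, dx, dwdx)
    REAL(KIND=dp), INTENT(IN)                   :: r     ! scalar
    REAL(KIND=dp), DIMENSION(ndim), INTENT(IN)  :: dx    ! rank-1 array
    REAL(KIND=dp), DIMENSION(ndim), INTENT(OUT) :: dwdx  ! rank-1 array
    dwdx = r * dx
  END SUBROUTINE grad_demo
END MODULE kernel_mod

PROGRAM interface_demo
  USE kernel_mod   ! makes the interface of grad_demo explicit here
  IMPLICIT NONE
  REAL(KIND=dp) :: r, dx(ndim), dwdx(ndim)

  r  = 2.0_dp
  dx = [1.0_dp, 3.0_dp]

  ! Correct call: scalar, array, array.
  CALL grad_demo(r, dx, dwdx)
  PRINT *, dwdx

  ! Mismatched call, analogous to passing dummy1 twice in the question.
  ! With the explicit interface the compiler should reject it:
  ! CALL grad_demo(r, r, dwdx)
END PROGRAM interface_demo
```

Uncommenting the last call should produce a compile-time rank mismatch error instead of a silent runtime problem.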