Recommended practices for placing array allocation in Fortran

Question

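What are the recommended or best practices for where arrays should be allocated?

For example, in a program like the one shown below (a simplified version of mine), I allocate the output variable (the variable of interest) in the main program. The main program calls subroutine foo, which in turn calls subroutine foo2, which does the actual calculations. My question is what the best/recommended practice is for where the allocation should be done.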

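If foo2 does the actual calculation, should it also allocate the arrays?
If foo calls foo2, should foo allocate the arrays while foo2 does only the calculations?
Should I write a new function/subroutine to allocate the arrays?
Or is it best to allocate in the main program and pass the arrays as assumed-shape?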

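If it matters, I have a module called global that contains the derived types used in the main program, as well as the main parameters of the code, such as the size of each array (Ni, Nj), tolerances, etc.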

program main
    use global
    implicit none

    type(myVar_) :: ans

    Ni = 10
    Nj = 20

    if (allocated(ans%P)) deallocate(ans%P)
    allocate(ans%P(1:Ni, 1:Nj))

    call foo(ans)

    print *, ans%P
end program main

module global
    integer, parameter :: dp=kind(0.d0)

    integer :: Ni, Nj

    type myVar_
        real(dp), allocatable :: P(:,:)
    end type myVar_

end module global

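subroutine foo(myVar)
    use global
    implicit none

    type(myVar_), intent(inout) :: myVar

    ! foo2 takes an assumed-shape array, so its caller needs an explicit interface
    interface
        subroutine foo2(P)
            use global, only: dp
            real(dp), intent(inout) :: P(:,:)
        end subroutine foo2
    end interface

    call foo2(myVar%P)

end subroutine foo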

subroutine foo2(P)
    use global
    implicit none

    real(dp), intent(inout) :: P(:,:)

    ! do calculations for P
end subroutine foo2


Answer 1

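For performance reasons, it is indeed good practice to avoid allocation in low-level subroutines and functions. As can be seen from [1], a simple addition takes roughly 1-3 CPU cycles, while an allocation/deallocation pair (for "small" arrays) can take 200-500 CPU cycles.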
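To get a feel for this cost on your own machine, a rough microbenchmark along the following lines can help. It is only a sketch: the array size `n`, the repetition count `n_iter`, and the trivial loop body are assumptions chosen for illustration, and the measured times depend heavily on the compiler, optimization flags, and runtime library.

```fortran
program alloc_cost_sketch
    use, intrinsic :: iso_fortran_env, only: dp => real64, int64
    implicit none

    integer, parameter :: n = 16            ! "small" array size (assumption)
    integer, parameter :: n_iter = 1000000  ! number of repetitions (assumption)
    real(dp), allocatable :: work(:)
    real(dp) :: total
    integer(int64) :: t0, t1, rate
    integer :: i

    call system_clock(count_rate=rate)

    ! Variant 1: allocate and deallocate the scratch array in every iteration.
    total = 0.0_dp
    call system_clock(t0)
    do i = 1, n_iter
        allocate(work(n))
        work = real(i, dp)
        total = total + work(n)
        deallocate(work)
    end do
    call system_clock(t1)
    print *, 'allocate inside loop, seconds: ', real(t1 - t0, dp) / real(rate, dp)
    print *, 'checksum: ', total

    ! Variant 2: allocate once outside the loop and reuse the array.
    total = 0.0_dp
    allocate(work(n))
    call system_clock(t0)
    do i = 1, n_iter
        work = real(i, dp)
        total = total + work(n)
    end do
    call system_clock(t1)
    deallocate(work)
    print *, 'allocate once, seconds:        ', real(t1 - t0, dp) / real(rate, dp)
    print *, 'checksum: ', total
end program alloc_cost_sketch
```

The checksums are printed only so that the compiler cannot discard the loops entirely; an aggressive optimizer may still transform either variant, so treat the numbers as indicative at best.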

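I suggest writing a subroutine that takes "work" variables as arguments and possibly operates in place (i.e. overwrites the input with the result), e.g.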

subroutine do_computation(input,output,work1,work2)
   work1 = ...
   work2 = ...
   output = ...
end subroutine

For convenience, you can then write a wrapper subroutine that does the allocation:

subroutine convenient_subroutine(input,output)
   allocate(work1(...), work2(...))
   call do_computation(input,output,work1,work2)
   deallocate(work1,work2)
end subroutine

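When performance is not critical, you can call convenient_subroutine; otherwise you can call do_computation directly and try to share the work arrays between loop iterations and between other subroutines.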
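A minimal, self-contained sketch of this pattern might look as follows. The module name `work_array_sketch`, the array sizes, the loop in the driver program, and the arithmetic inside `do_computation` are placeholders invented for illustration; only the split between an allocation-free kernel and an allocating wrapper is the point.

```fortran
module work_array_sketch
    use, intrinsic :: iso_fortran_env, only: dp => real64
    implicit none
    private
    public :: dp, do_computation, convenient_subroutine

contains

    ! Low-level kernel: performs no allocation, the caller supplies the scratch space.
    subroutine do_computation(input, output, work1, work2)
        real(dp), intent(in)    :: input(:)
        real(dp), intent(out)   :: output(:)
        real(dp), intent(inout) :: work1(:), work2(:)

        ! Placeholder arithmetic standing in for the real calculation.
        work1 = 2.0_dp * input
        work2 = input + 1.0_dp
        output = work1 + work2
    end subroutine do_computation

    ! Convenience wrapper: allocates the scratch arrays for a single call.
    subroutine convenient_subroutine(input, output)
        real(dp), intent(in)  :: input(:)
        real(dp), intent(out) :: output(:)
        real(dp), allocatable :: work1(:), work2(:)

        allocate(work1(size(input)), work2(size(input)))
        call do_computation(input, output, work1, work2)
        deallocate(work1, work2)
    end subroutine convenient_subroutine

end module work_array_sketch

program demo
    use work_array_sketch
    implicit none

    integer, parameter :: n = 5, n_steps = 3   ! sizes chosen for illustration
    real(dp) :: x(n), y(n)
    real(dp), allocatable :: w1(:), w2(:)
    integer :: step

    x = 1.0_dp

    ! One-off call where performance does not matter.
    call convenient_subroutine(x, y)

    ! Hot loop: allocate the work arrays once and reuse them in every iteration.
    allocate(w1(n), w2(n))
    do step = 1, n_steps
        call do_computation(x, y, w1, w2)
        x = y
    end do
    deallocate(w1, w2)

    print *, y
end program demo
```

Placing both routines in a module also gives every caller an explicit interface, which assumed-shape dummy arguments such as `input(:)` require.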

[1] http://ithare.com/infographics-operation-costs-in-cpu-clock-cycles/
