Fortran 广度优先搜索运行慢

Question

背景： 我编写了一个 Fortran 子例程，它根据未加权的有向网络的邻接矩阵计算最短路径长度矩阵。该子例程使用广度优先搜索算法。最短路径长度矩阵的对角元素为最短环路长度。

我也在MATLAB中实现了这个算法。我正在迁移到 Fortran，希望减少运行次（请注意，我是 Fortran 的新手）。然而，这个子例程运行在 MATLAB 中比在 Fortran 实现中稍微快一点（对于相同的 ~1000 节点网络，在 MATLAB 中大约 44 秒，在 Fortran 中大约 46 秒）。这对我来说没有意义，因为我的理解是 Fortran 对于基于循环的操作应该明显更快。我的一些其他子例程在 Fortran 中快 1-2 个数量级。

我在 OSX 10.10.2 上编译，安装了最新的 gfortran 二进制文件，没有优化标志。（打开优化标志实际上似乎进一步降低了代码速度。）

问题： 谁能看到我的 fortan 代码中可能导致运行效率低下的缺陷？（对于 Fortran 新手的任何其他一般提示也将不胜感激）。或者，这个任务有更快的算法吗？

代码：

subroutine spl(a,splmat)
implicit none

! Input: Adjacency matrix a
logical, intent(in) :: a(:,:)

! Output: Shortest path length matrix
integer, dimension (:,:), allocatable :: splmat

integer, dimension (:), allocatable :: stk

! Variables: nnodes (size of network); s (source node); r (read ptr);
! w (write pointer) ; d (distance from s to node n), n (the node
! pointed to by the value of the stack at r); j (loop variable);
integer :: nnodes,s,r,w,d,n,j

nnodes = size(a,1)

allocate(splmat(nnodes,nnodes))
splmat = 0

allocate (stk(nnodes))

! Outer loop over each node
do s = 1,nnodes
    stk = 0
    stk(1) = s
    r = 1
    w = 2

    ! Inner loop
    do while (r/=w)
        n = stk(r)
        r = r+1

        d = splmat(s,n)

        do j=1,nnodes
            if (a(n,j).and.(splmat(s,j)==0)) then
                splmat(s,j) = d+1
                stk(w) = j
                w = w+1
            end if
        end do
    end do
end do
end subroutine spl

Answer 1

我不知道您使用的是什么算法，但我立即注意到一个非常简单的问题，即您的索引顺序错误，无法在 Fortran 中获得最佳性能。

我的意思是这通常不好：

do i = 1, m
    do j = 1, n
        ! Do something with a(i,j)
    end do
end do

而这要好得多：

do j = 1, n
    do i = 1, m
        ! Do something with a(i,j)
    end do
end do

对于小问题通常没有太大区别（除非在某些情况下您严重依赖 SIMD 向量化来提高性能）。然而，对于更大的问题，缓存使用效率可能会有很大差异。

Fortran 广度优先搜索运行慢

Fortran breadth first search running slowly

performance

fortran

breadth-first-search

shortest-path

Fortran 广度优先搜索 运行 慢

Fortran breadth first search running slowly

performance

fortran

breadth-first-search

shortest-path

Fortran 广度优先搜索运行慢