执行 Dijkstra 时是否可以确定跳数？

Question

感谢@trincot 的我可以修改 Dijkstra 以获得给定源节点和目标节点之间的最短路径。此外，我在执行Dijkstra寻找最短路径时尝试计算跳数，当跳数超过预先定义的Max_hop时，Dijkstra将终止，但我失败了。

跳数定义为 (N - 1)，其中 N 是最短路径中包含的顶点数。

当然，找到最短路径后，我们可以很容易地计算出跳数。但是，在Dijkstra的路径搜索过程中，我们可以计算给定源和之间的跳数吗？

from heapq import heappop, heappush
def dijkstra(adjList, source, sink):
    n = len(adjList)   
    parent = [None]*n  
    heap = [(0,source,0)]
    explored_node=[]
    hop_count = 0
    Max_hop = 8    
    while heap:
        distance, current, came_from = heappop(heap)
        if parent[current] is not None:  # skip if already visited
            continue
        parent[current] = came_from  # this also marks the node as visited
        if sink and current == sink:  # only correct place to have terminating condition
            # build path
            path = [current]

            while current != source:
                current = parent[current]
                path.append(current)
            path.reverse()
            hop_count -=1
            print("Hop count is ",hop_count)
            
            return 1, distance, path
        for (neighbor, cost) in adjList[current]:
            if parent[neighbor] is None:  # not yet visited
                heappush(heap, (distance + cost, neighbor,  current))
                hop_count = hop_count + 1
                if hop_count > Max_hop:
                    print("Terminate")
adjList =[

[],

[[2,3],[4,11],[5,5]],
[[1,3],[3,5],[5,11],[6,7]],
[[2,5],[6,3]],
[[1,11],[5,15],[7,9]],
[[1,5],[2,11],[6,3],[7,6],[8,3],[9,9]],
[[2,7],[3,3],[5,3],[9,10]],
[[4,9],[5,6],[8,1],[10,11],[11,8]],
[[5,3],[7,1],[9,9],[11,11]],
[[5,9],[6,10],[8,9],[11,3],[12,8]],
[[7,11],[13,7],[14,3]],
[[7,8],[8,11],[9,3],[12,8],[14,6]],
[[9,8],[11,8],[15,11]],
[[10,7],[15,3]],
[[10,3],[11,6],[15,9]],
[[12,11],[13,3],[14,9]],
]

flag, dist, path = dijkstra(adjList,1,15)

print("found shortest path {}, which has a distance of {}".format(path, dist))

adjList的图形如图所示：（红线为1到15的最短路径）

我知道这是不正确的，因为当 Dijkstra 迭代邻居时，我使 hop_cout + 1 代表探索节点的数量而不是 hop_count。

在我看来，有两个重要问题需要解决。

当aparent_node和aneighbor_node之间的最短距离确定后，hop_count可以加1。但是，Dijkstra通过迭代邻居节点找到最短路径，存储最短距离的数组在路径搜索过程中逐渐更新。 How to determine Dijkstra has already found the shortest distance between a parent_node and a neighbor_node?
只有条件1是不够的，即使我们可以知道Dijkstra何时找到了两个节点之间的最短距离，但是我们如何知道neighbor_node是否会包含在给定源之间的最短路径中和目的地？

综上所述，如果我们想知道Dijkstra期间的当前跳数是运行，我们需要设置hop_count +1，当最短路径从parent_node neighbor_node已经确定，neighbor_node将被包含到从源节点到目的节点的最短路径。

为了更好的定义问题，如图所示，红线是node 1和node 15之间的最短路径，最短路径是1 ->5 ->8 ->7 ->10 ->13 ->15.

探索node 2时和node 1之间的最短距离 node 2确定为3，hop_count不能加1，因为 node 2不包含在1到15之间的最短路径中
探索node 5时和node 1之间的最短距离 node 5确定为5，hop_count应该加上1，因为 node 5包含在1到15之间的最短路径中。

我的理解对吗？我可以听听你的想法“执行 Dijkstra 时是否可以确定跳数？”

Answer 1

此代码使用 dijkstra 算法的优先级队列。

#include <iostream>
#include <algorithm>
#include <queue>
#include <cstring>
#include <cstdio>
#include <vector>
#define limit 15

using namespace std;

int cost[20001];
vector<int> plist[20001];

const int MaxVal = -1;

vector< vector< pair<int, int> > > arr;

struct node {
    pair<int, int> info;
    vector<int> path;
};

bool operator < (node a, node b) {
    return a.info.first > b.info.first;
}

int main() {
    int i, j, k;
    int n, m;
    int s;
    int a, b, c;
    cin >> n >> m;
    cin >> s;
    //arr.reserve(n + 1);
    arr.resize(n + 1);

    for (i = 1; i <= m; i++) {
        cin >> a >> b >> c;
        arr[a].push_back({ b, c });
    }

    for (i = 1; i <= n; i++) {
        cost[i] = MaxVal;
    }

    priority_queue<node, vector<node>> mh;
    mh.push(node{ { 0, s }, { } });
    while (mh.size() > 0) {
        int current = mh.top().info.second;
        int val = mh.top().info.first;
        auto path = mh.top().path;
        mh.pop();
        if (cost[current] != MaxVal) continue;//All path would be sorted in prioirty queue. And the path that got out late can't be the shorter path.
        cost[current] = val;
        path.push_back(current);
            if(path.size() > limit) {
                //limit exceeded!! 
                cout << "limitation exceeded!!";
                break;
            }
        plist[current] = path;
        for (auto it : arr[current]) {
            if (cost[it.first] != MaxVal) continue;
            mh.push({ { it.second + val, it.first }, path });
        }
    }

    for (i = 1; i <= n; i++) {
        cout << "path to " << i << " costs ";
        if (cost[i] == MaxVal) {
            cout << "INF\n";
        }
        else {
            cout << cost[i] << "\n";
        }
        for (auto p : plist[i]) {
            cout << p << " ";
        }
        cout << endl << endl;
    }   

    return 0;
}

//test case
15 55
1 //Starting Node Number
1 2 3
1 4 11
1 5 5
2 1 3
2 3 5
2 5 11
2 6 7
3 2 5
3 6 3
4 1 11
4 5 15
4 7 9
5 1 5
5 2 11
5 6 3
5 7 6
5 8 3
5 9 9
6 2 7
6 3 3
6 5 3
6 9 10
7 4 9
7 5 6
7 8 1
7 10 11
7 11 8
8 5 3
8 7 1
8 9 9
8 11 11
9 5 9
9 6 10
9 8 9
9 11 3
9 12 8
10 7 11
10 13 7
10 14 3
11 7 8
11 8 11
11 9 3
11 12 8
11 14 6
12 9 8
12 11 8
12 15 11
13 10 7
13 15 3
14 10 3
14 11 6
14 15 9
15 12 11
15 13 3
15 14 9

path to 1 costs 0
1

path to 2 costs 3
1 2

path to 3 costs 8
1 2 3

path to 4 costs 11
1 4

path to 5 costs 5
1 5

path to 6 costs 8
1 5 6

path to 7 costs 9
1 5 8 7

path to 8 costs 8
1 5 8

path to 9 costs 14
1 5 9

path to 10 costs 20
1 5 8 7 10

path to 11 costs 17
1 5 8 7 11

path to 12 costs 22
1 5 9 12

path to 13 costs 27
1 5 8 7 10 13

path to 14 costs 23
1 5 8 7 11 14

path to 15 costs 30
1 5 8 7 10 13 15

Answer 2

由于堆将具有代表不同长度路径的节点，因此您不能希望使用一个变量来计算跳数。您需要将跳数作为附加信息添加到您放在堆上的元组中，因为它特定于每个单独的路径。

注意：我也会将 max_hop 作为函数的参数：

from heapq import heappop, heappush

def dijkstra(adjList, source, sink, max_hop=8):  # make max_hop a parameter
    n = len(adjList)   
    parent = [None]*n  
    heap = [(0, source, 0, 0)]  # added hop_count as 4th value
    hop_count = 0
    while heap:
        distance, current, came_from, hop_count = heappop(heap)  # get hop_count also
        if parent[current] is not None:
            continue
        parent[current] = came_from
        if sink and current == sink:
            path = [current]
            while current != source:
                current = parent[current]
                path.append(current)
            path.reverse()
            print("Hop count is ", hop_count)
            return 1, distance, path
        
        if hop_count >= max_hop:  # no recursion beyond max_hop
            print("Terminate")
            continue
        for (neighbor, cost) in adjList[current]:
            if parent[neighbor] is None:
                # increase hop_count on this particular path
                heappush(heap, (distance + cost, neighbor,  current, hop_count + 1))

关于你的另一个问题：

How to determine Dijkstra has already found the shortest distance between a parent_node and a neighbor_node?

这是 for 循环中的 if 检测到的内容：如果该节点已被访问，这意味着它已在堆上获得优先级并且已在较早的迭代中从中拉出主 while 循环，因此我们已经有了到该节点的最短路径。这 if 阻止我们在堆上推送无用的“替代”路径。

Answer 3

这里有两个问题，一个是如何跟踪路径的长度，另一个是一旦超过最大路径长度就终止程序。两者的答案截然不同。

一方面，您可以通过在算法完成后获取路径的长度来计算最短路径的跳数（尽管这似乎不是您想要的）。其次，您还可以跟踪在任意迭代中从源到任何给定节点 X 需要多少跳，只需跟踪从 s 到顶点 X 的当前路径的长度并更新 path-length 在放松步骤的邻居。 @trincot answer 也提供了代码，这在很大程度上涵盖了这一点。

现在，在进入程序终止部分之前，让我陈述三个通过 Dijkstra 算法不变的有用引理。

Lemma 1: For every marked vertex, the distance from source to that vertex is a shortest path.
Lemma 2: For every unmarked vertex, the current recorded distance is a shortest path considering only the already visited vertices.
Lemma 3: If the shortest is s -> ... -> u -> v then, when u is visited and it's neighbor's distance updated the distance d(s, v) will remain invariant.

这些引理告诉我们的是：

当节点 X 被标记为已访问时：d(s, x) 最小并且路径 s->x 的长度将保持不变（来自引理 1）
直到节点 X 被标记为已访问 d(s, x) 是一个估计值，路径 s->x 的长度是当前路径长度的任何值。两个值都可能改变。（来自引理 2）
您不能保证长度为 N 的路径是最短路径，也不能保证最短路径的长度 <= N（来自引理 3，稍作修改）

因此，如果您决定在从源到接收器的 path-length 大于最大跳数时终止程序，则无法保证获得的信息是最佳的。特别是，任何这些都可能在程序终止时发生：

路径长度为N，但还有另一条长度为N且距离较短的路径。
路径长度为N，还有一条更短距离更短的路径。

如果你想获得从源到汇的最短路径，同时限制路径长度，你应该使用 Bellman-Ford 算法，它保证在每次迭代时 i 所有路径最多有 i 边的长度，并且这条路径在该约束下是最短的。

执行 Dijkstra 时是否可以确定跳数？

Is it possible to determine the hop-count when performing Dijkstra?

algorithm

graph

dijkstra

shortest-path