在图上进行深度优先 run 时访问节点两次

Visiting node twice while doing a depth-first run on a graph

出于某种原因,running 这个简短的脚本访问了节点 B 两次(实际节点被访问了两次)。一件非常奇怪的事情是,如果您查看两个 id(tovisit) 调用,它们在函数 (!) 的最后 run 中会有所不同。谁能发现问题?

mygraph = {
    'A': {'C', 'B'},
    'C': {'A', 'F'},
    'B': {'D', 'E'},
    'D': {'B'},
    'E': {'B', 'F'},
    'F': {'C', 'E'},
}

visited = set()
def dfs(graph, start):
    print("visiting %s" % start)
    visited.add(start)
    connections = graph[start]
    tovisit = (connections - visited)
    print(id(tovisit))
    for each in tovisit:
        print(id(tovisit))
        dfs(graph, each)

dfs(mygraph, 'A')

~$ python dfs.py
visiting A
visiting C
visiting F
visiting E
visiting B
visiting D
visiting B

问题是您正在缓存要访问的集合中未访问的邻居。但是其中有些可能会被深度优先搜索访问到,所以可能会被访问​​两次。

尝试添加额外检查:

for each in tovisit:
        if each in visited:  # Add this line
            continue         # Add this line
        print(id(tovisit))
        dfs(graph, each)

或者,在识别所有邻居后立即将其标记为已访问:

 tovisit = (connections - visited)
 visited.update(tovisit) # Add this line

在当前代码中发生的是:

Visits A, notes C and B as unvisited neighbours
  Visits C, notes F as unvisited
    Visits F, notes E as unvisited
      Visits E notes B as unvisited
        Visits B, notes D as unvisited
          Visits D
Visits B again!

另一种方法是将整个逻辑围绕访问检查进行包装。

def dfs(graph, start):
    if start not in visited: #CHECK ADDED BEFORE LOGIC
        print("visiting %s" % start)
        visited.add(start)
        connections = graph[start]
        tovisit = (connections - visited)
        print(id(tovisit))
        for each in tovisit:
            print(id(tovisit))
            dfs(graph, each)

另外:python 中的设置不遵循您添加节点的顺序。如果你想要访问节点的确切顺序,你可以使用一个列表,但是访问的成员资格检查从 O(1) 到 O(N)。最好的选择是使用有序字典访问。