Python 深度优先搜索(包括循环)

Depth First Search in Python (including cycles)

所以我在 Whosebug 中看到以下 post 关于 Python 中的 DFS 算法(非常有帮助):

Does this python code employs Depth First Search (DFS) for finding all paths?

我还有一个需要分析的图表(以找到两个节点之间的每条可能路径),但我还需要在其中包括循环。例如,如果我有这样的图表:

graph = {'Start': ['1'],
             '1': ['2'],
             '2': ['3','End'],
             '3': ['2','End']}

我想要以下输出:

Start, 1, 2, 3, End
Start, 1, 2, End
Start, 1, 2, 3, 2, End
Start, 1, 2, 3, 2, 3, End

是否有任何可能的方法来更改以下代码以执行此操作?

def find_all_paths(graph, start, end, path=[]):
        path = path + [start]
        if start == end:
            return [path]
        if not graph.has_key(start):
            return []
        paths = []
        for node in graph[start]:
            if node not in path:
                paths += find_all_paths(graph, node, end, path)
        return paths

print find_all_paths(graph, 'Start', 'End')

如评论中所述,如果您允许任意循环,您的算法没有理由终止。您可以做的是允许最大路径长度并忽略所有太长的路径:

def find_all_paths(graph, start, end, path=[], max_length=10):
    if len(path) >= max_length:
        return []
    path = path + [start]
    if start == end:
        return [path]
    if start not in graph:
        return []
    paths = []
    for node in graph[start]:
        paths += find_all_paths(graph, node, end, path, max_length=max_length)
    return paths

graph = {'Start': ['1'],
         '1': ['2'],
         '2': ['3','End'],
         '3': ['2','End']}

print find_all_paths(graph, 'Start', 'End', max_length=10)

你的算法输出:

[['Start', '1', '2', '3', '2', '3', '2', '3', '2', 'End'], ['Start', '1', '2', '3', '2', '3', '2', '3', 'End'], ['Start', '1', '2', '3', '2', '3', '2', 'End'], ['Start', '1', '2', '3', '2', '3', 'End'], ['Start', '1', '2', '3', '2', 'End'], ['Start', '1', '2', '3', 'End'], ['Start', '1', '2', 'End']]

编辑:

用更 pythonic start not in graph

替换了 not graph.has_key

这不是你想用简单的 Depth-First Search (DFS) 做的事情。

DFS,深度优先,遇到循环就会阻塞。 循环无限深

如果你想在包含循环时输出(可能使用生成器)到每个节点的无限路径,你应该使用 Breadth-First Search (BFS)。广度优先,意味着循环不会阻止它到达其他路径。

这里的取舍是广度优先搜索会消耗更多内存,从而在 运行 期间保持更多列表处于活动状态。如果这是一个问题,你应该使用 DFS's iterative deepening solution.

简单的 BFS 解决方案:

使用Queue(BFS 很容易使用队列实现,DF 使用堆栈很容易实现,如您所见here):

#!/usr/bin/env python

import Queue

graph = {'Start': ['1'],
             '1': ['2'],
             '2': ['3','End'],
             '3': ['2','End']}

expand_queue = Queue.Queue()

def BFS_generator(graph, start, end, path):
    # initialize generator
    expand_queue.put((graph, start, end, path))

    while not expand_queue.empty():
        graph, current, end, path = expand_queue.get()

        if current == end:
            # route done - yield result
            yield path + [current]

        if current in graph:
            # skip neighbor-less nodes
            for neighbor in graph[current]:
                # put all neighbors last in line to expand
                expand_queue.put((graph, neighbor, end, path + [current]))

gen = BFS_generator(graph, "Start", "End", [])

# get only 10 first paths
for _ in xrange(10):
    print next(gen)

输出:

['Start', '1', '2', 'End']
['Start', '1', '2', '3', 'End']
['Start', '1', '2', '3', '2', 'End']
['Start', '1', '2', '3', '2', '3', 'End']
['Start', '1', '2', '3', '2', '3', '2', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', '2', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', '2', '3', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', '2', '3', '2', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', '2', '3', '2', '3', 'End']

将解决方案推广到 DFS 和 BFS,迭代加深:

您还可以使用更通用的代码,使用 QueueStack,然后使用 BFS(队列)或 DFS(堆栈)轻松获得迭代解决方案。看你需要什么。

所以首先创建一个 Stack class(为了接口目的,list 有我们需要的一切):

class Stack():

    def __init__(self):
        self.stack = []

    def get(self):
        return self.stack.pop(0)

    def put(self, item):
        self.stack.insert(0, item)

    def empty(self):
        return len(self.stack) == 0

既然您同时拥有 Queue 和堆栈,那么如果我们更改所使用的数据结构,算法的简单概括,使其具有迭代性和数据结构不可知性,将为我们提供两种解决方案:

def iterative_search(data_structure, graph, start, end, limit=None):
    # initialize generator
    data_structure.put((graph, start, end,[]))

    while not data_structure.empty():
        graph, current, end, path = data_structure.get()

        if limit and len(path) > limit:
            continue

        if current == end:
            # route done - yield result
            yield tuple(path + [current])

        if current in graph:
            # skip neighbor-less nodes
            for neighbor in graph[current]:
                # store all neighbors according to data structure
                data_structure.put(
                    (graph, neighbor, end, path + [current])
                )

综合来看:

我们可以看到他们 select 不同的路线(我稍微改变了 graph 以使其更有趣):

import Queue

class Stack():

    def __init__(self):
        self.stack = []

    def get(self):
        return self.stack.pop(0)

    def put(self, item):
        self.stack.insert(0, item)

    def empty(self):
        return len(self.stack) == 0

graph = {'Start': ['1'],
             '1': ['2'],
             '2': ['3','End'],
             '3': ['2', '4','End'],
             '4': ['3']}


def iterative_search(data_structure, graph, start, end, limit=None):
    # initialize generator
    data_structure.put((graph, start, end,[]))

    while not data_structure.empty():
        graph, current, end, path = data_structure.get()

        # make solution depth limited
        # makes it iterative - for DFS to use all paths
        if limit and len(path) > limit:
            continue

        if current == end:
            # route done - yield result
            yield tuple(path + [current])

        if current in graph:
            # skip neighbor-less nodes
            for neighbor in graph[current]:
                # store all neighbors according to data structure
                data_structure.put(
                    (graph, neighbor, end, path + [current])
                )

现在我们有了迭代 DFS 和 BFS 的泛化函数,我们可以比较它们提供的解决方案:

import os

# bfs - using queue
gen = iterative_search(Queue.Queue(), graph, "Start", "End")
print "BFS"
# get only 10 first paths
bfs_path_set = set()
while len(bfs_path_set) < 10:
    bfs_path_set.add(next(gen))

print os.linesep.join(map(str, bfs_path_set))

print "Iterative DFS"
# dfs  - using stack
gen = iterative_search(Stack(), graph, "Start", "End", limit=5)

# get only 10 first paths
dfs_path_set = set()
limit = 1
while len(dfs_path_set) < 10:
    try:
        dfs_path_set.add(next(gen))
    except StopIteration:
        limit += 1
        print "depth limit reached, increasing depth limit to %d" % limit
        gen = iterative_search(
            Stack(), graph, "Start", "End", limit=limit
        )

print os.linesep.join(map(str, dfs_path_set))

print "difference BFS - DFS: %s" % str(bfs_path_set - dfs_path_set)
print "difference DFS - BFS: %s" % str(dfs_path_set - bfs_path_set)

输出:

BFS
('Start', '1', '2', '3', '2', '3', '4', '3', 'End')
('Start', '1', '2', '3', '2', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', '3', '2', 'End')
('Start', '1', '2', '3', '4', '3', '2', 'End')
('Start', '1', '2', '3', '4', '3', 'End')
('Start', '1', '2', '3', 'End')
('Start', '1', '2', '3', '4', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', 'End')
('Start', '1', '2', 'End')
Iterative DFS
limit reached, increasing limit to 2
limit reached, increasing limit to 3
limit reached, increasing limit to 4
limit reached, increasing limit to 5
limit reached, increasing limit to 6
limit reached, increasing limit to 7
limit reached, increasing limit to 8
('Start', '1', '2', '3', '2', '3', '4', '3', 'End')
('Start', '1', '2', '3', '4', '3', '4', '3', 'End')
('Start', '1', '2', '3', '2', '3', '2', 'End')
('Start', '1', '2', '3', '4', '3', '2', 'End')
('Start', '1', '2', '3', '4', '3', 'End')
('Start', '1', '2', '3', 'End')
('Start', '1', '2', '3', '4', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', 'End')
('Start', '1', '2', 'End')
difference BFS - DFS: set([('Start', '1', '2', '3', '2', '3', '2', '3', 'End')])
difference DFS - BFS: set([('Start', '1', '2', '3', '4', '3', '4', '3', 'End')])

备注:

解决方案的差异: 您可以看到解决方案的路径有所不同 selection。您可以通过将解决方案的长度设置为 11 而不是 10 来验证它们是否最终获得了所有路径(这将使两个解决方案集相同 - 因为它们将消耗所有 9 长度的解决方案)。

内存消耗: 重要的是要注意,这种 DFS 实现并不是最佳的,因为它确实存储了节点的所有邻居。它通常应该比 BFS 更好,后者在消耗它们之前存储更多的邻居,但是应该存在使用 backtracking 和递归的更优化的解决方案。