如何干净利落地避免递归函数中的循环(广度优先遍历)
How to cleanly avoid loops in recursive function (breadth-first traversal)
我正在编写网络的递归广度优先遍历。我 运行 遇到的问题是网络通常是这样的:
1
/ \
2 3
\ /
4
|
5
所以我的遍历从1开始,然后遍历到2,然后是3。下一站是进行到4,所以2遍历到4。这之后,3遍历到4,突然我在重复工作因为两条线都试图遍历到 5.
我找到的解决方案是创建一个名为 self.already_traversed
的列表,每次遍历一个节点时,我都会将其附加到列表中。然后,当我从节点 4 遍历时,我检查以确保它尚未被遍历。
这里的问题是我为此使用了一个实例变量,所以我需要一种在第一次递归之前设置列表并在之后清理它的方法。我目前这样做的方式是:
self.already_traversed = []
self._traverse_all_nodes(start_id)
self.already_traversed = []
当然,在使用它们的函数之外使用两个变量很糟糕。有没有更好的方法可以将其放入我的遍历函数中?
这是实际的代码,虽然我知道它有点密集:
def _traverse_all_nodes(self, root_authority, max_depth=6):
"""Recursively build a networkx graph
Process is:
- Work backwards through the authorities for self.cluster_end and all
of its children.
- For each authority, add it to a networkx graph, if:
- it happened after self.cluster_start
- it's in the Supreme Court
- we haven't exceeded a max_depth of six cases.
- we haven't already followed this path
"""
g = networkx.Graph()
if hasattr(self, 'already_traversed'):
is_already_traversed = (root_authority.pk in self.visited_nodes)
else:
# First run. Create an empty list.
self.already_traversed = []
is_already_traversed = False
is_past_max_depth = (max_depth <= 0)
is_cluster_start_obj = (root_authority == self.cluster_start)
blocking_conditions = [
is_past_max_depth,
is_cluster_start_obj,
is_already_traversed,
]
if not any(blocking_conditions):
print " No blocking conditions. Pressing on."
self.visited_nodes.append(root_authority.pk)
for authority in root_authority.authorities.filter(
docket__court='scotus',
date_filed__gte=self.cluster_start.date_filed):
g.add_edge(root_authority.pk, authority.pk)
# Combine our present graph with the result of the next
# recursion
g = networkx.compose(g, self._build_graph(
authority,
max_depth - 1,
))
return g
def add_clusters(self):
"""Do the network analysis to add clusters to the model.
Process is to:
- Build a networkx graph
- For all nodes in the graph, add them to self.clusters
"""
self.already_traversed = []
g = self._traverse_all_nodes(
self.cluster_end,
max_depth=6,
)
self.already_traversed = []
查看:
How do I pass a variable by reference?
其中包含有关如何通过引用传递列表的示例。如果您通过引用传递列表,则每次调用您的函数都将引用同一个列表。
我正在编写网络的递归广度优先遍历。我 运行 遇到的问题是网络通常是这样的:
1
/ \
2 3
\ /
4
|
5
所以我的遍历从1开始,然后遍历到2,然后是3。下一站是进行到4,所以2遍历到4。这之后,3遍历到4,突然我在重复工作因为两条线都试图遍历到 5.
我找到的解决方案是创建一个名为 self.already_traversed
的列表,每次遍历一个节点时,我都会将其附加到列表中。然后,当我从节点 4 遍历时,我检查以确保它尚未被遍历。
这里的问题是我为此使用了一个实例变量,所以我需要一种在第一次递归之前设置列表并在之后清理它的方法。我目前这样做的方式是:
self.already_traversed = []
self._traverse_all_nodes(start_id)
self.already_traversed = []
当然,在使用它们的函数之外使用两个变量很糟糕。有没有更好的方法可以将其放入我的遍历函数中?
这是实际的代码,虽然我知道它有点密集:
def _traverse_all_nodes(self, root_authority, max_depth=6):
"""Recursively build a networkx graph
Process is:
- Work backwards through the authorities for self.cluster_end and all
of its children.
- For each authority, add it to a networkx graph, if:
- it happened after self.cluster_start
- it's in the Supreme Court
- we haven't exceeded a max_depth of six cases.
- we haven't already followed this path
"""
g = networkx.Graph()
if hasattr(self, 'already_traversed'):
is_already_traversed = (root_authority.pk in self.visited_nodes)
else:
# First run. Create an empty list.
self.already_traversed = []
is_already_traversed = False
is_past_max_depth = (max_depth <= 0)
is_cluster_start_obj = (root_authority == self.cluster_start)
blocking_conditions = [
is_past_max_depth,
is_cluster_start_obj,
is_already_traversed,
]
if not any(blocking_conditions):
print " No blocking conditions. Pressing on."
self.visited_nodes.append(root_authority.pk)
for authority in root_authority.authorities.filter(
docket__court='scotus',
date_filed__gte=self.cluster_start.date_filed):
g.add_edge(root_authority.pk, authority.pk)
# Combine our present graph with the result of the next
# recursion
g = networkx.compose(g, self._build_graph(
authority,
max_depth - 1,
))
return g
def add_clusters(self):
"""Do the network analysis to add clusters to the model.
Process is to:
- Build a networkx graph
- For all nodes in the graph, add them to self.clusters
"""
self.already_traversed = []
g = self._traverse_all_nodes(
self.cluster_end,
max_depth=6,
)
self.already_traversed = []
查看:
How do I pass a variable by reference?
其中包含有关如何通过引用传递列表的示例。如果您通过引用传递列表,则每次调用您的函数都将引用同一个列表。