给定 n 个表示对的元组,return 一个包含连接元组的列表
Given n tuples representing pairs, return a list with connected tuples
给定 n 个元组,编写一个函数,该函数将 return 一个包含连接值的列表。
示例:
pairs = [(1,62),
(1,192),
(1,168),
(64,449),
(263,449),
(192,289),
(128,263),
(128,345),
(3,10),
(10,11)
]
结果:
[[1,62,192,168,289],
[64,449,263,128,345,449],
[3,10,11]]
我相信它可以使用图或树作为数据结构来解决,为每个值创建节点,并为每对创建边,每对树或图表示连接的值,但我还没有找到解决方案。
在 python 中产生一个结果的最佳方法是什么,该结果会产生这些对的连接值列表?
您可以通过 Disjoint Set (Union-Find) 实施来解决它。
用所有数字初始化结构djs
。然后对于每个元组(x,y)
,调用djs.merge(x,y)
。现在,对于每个数字 x
,为其创建一个新集合当且仅当 djs.sameSet(x,)==false
来自每个现有集合的任意 y
。
也许that可以帮到你。
你似乎有一个图(以边列表的形式)可能不是一个整体("connected")你想把它分成几个部分("components").
一旦我们用图形语言考虑它,我们就大功告成了。我们可以保留到目前为止我们找到的所有组件的列表(这些将是节点集),如果它的伙伴已经存在,则将一个节点添加到该集合中。否则,为这对制作一个新组件。
def graph_components(edges):
"""
Given a graph as a list of edges, divide the nodes into components.
Takes a list of pairs of nodes, where the nodes are integers.
Returns a list of sets of nodes (the components).
"""
# A list of sets.
components = []
for v1, v2 in edges:
# See if either end of the edge has been seen yet.
for component in components:
if v1 in component or v2 in component:
# Add both vertices -- duplicates will vanish.
component.add(v1)
component.add(v2)
break
else:
# If neither vertex is already in a component.
components.append({v1, v2})
return components
我使用了奇怪的 for ... else
构造来实现这个功能——如果 break
语句在 [=] 期间未达到,则执行 else
15=]。内部循环也可以是一个返回 True
或 False
.
的函数
编辑:正如 Francis Colas 指出的那样,这种方法太贪心了。这是一个完全不同的方法(非常感谢 Edward Mann this 漂亮的 DFS 实现)。
这种方法基于构建一个图,然后对其进行遍历,直到我们 运行 离开未访问的节点。它应该 运行 在线性时间内(O(n) 构建图形,O(n) 进行所有遍历,我相信 O(n) 只是做集合差异)。
from collections import defaultdict
def dfs(start, graph):
"""
Does depth-first search, returning a set of all nodes seen.
Takes: a graph in node --> [neighbors] form.
"""
visited, worklist = set(), [start]
while worklist:
node = worklist.pop()
if node not in visited:
visited.add(node)
# Add all the neighbors to the worklist.
worklist.extend(graph[node])
return visited
def graph_components(edges):
"""
Given a graph as a list of edges, divide the nodes into components.
Takes a list of pairs of nodes, where the nodes are integers.
"""
# Construct a graph (mapping node --> [neighbors]) from the edges.
graph = defaultdict(list)
nodes = set()
for v1, v2 in edges:
nodes.add(v1)
nodes.add(v2)
graph[v1].append(v2)
graph[v2].append(v1)
# Traverse the graph to find the components.
components = []
# We don't care what order we see the nodes in.
while nodes:
component = dfs(nodes.pop(), graph)
components.append(component)
# Remove this component from the nodes under consideration.
nodes -= component
return components
我不知道这个问题已经有了名字(感谢 avim!),所以我继续天真地解决了它。
此解决方案与 Eli Rose 的解决方案有些相似。不过,我决定 post 它,因为它对于大型对列表更有效,因为 lists_by_element
字典跟踪元素所在的列表,允许我们避免每次我们需要添加新项目时,遍历所有列表及其项目。
代码如下:
def connected_tuples(pairs):
# for every element, we keep a reference to the list it belongs to
lists_by_element = {}
def make_new_list_for(x, y):
lists_by_element[x] = lists_by_element[y] = [x, y]
def add_element_to_list(lst, el):
lst.append(el)
lists_by_element[el] = lst
def merge_lists(lst1, lst2):
merged_list = lst1 + lst2
for el in merged_list:
lists_by_element[el] = merged_list
for x, y in pairs:
xList = lists_by_element.get(x)
yList = lists_by_element.get(y)
if not xList and not yList:
make_new_list_for(x, y)
if xList and not yList:
add_element_to_list(xList, y)
if yList and not xList:
add_element_to_list(yList, x)
if xList and yList and xList != yList:
merge_lists(xList, yList)
# return the unique lists present in the dictionary
return set(tuple(l) for l in lists_by_element.values())
这是它的工作原理:http://ideone.com/tz9t7m
另一个比 wOlf 更紧凑但处理合并的解决方案与 Eli 的相反:
def connected_components(pairs):
components = []
for a, b in pairs:
for component in components:
if a in component:
for i, other_component in enumerate(components):
if b in other_component and other_component != component: # a, and b are already in different components: merge
component.extend(other_component)
components[i:i+1] = []
break # we don't have to look for other components for b
else: # b wasn't found in any other component
if b not in component:
component.append(b)
break # we don't have to look for other components for a
if b in component: # a wasn't in in the component
component.append(a)
break # we don't have to look further
else: # neither a nor b were found
components.append([a, b])
return components
请注意,当我在组件中找到元素时,我依赖于跳出循环,以便我可以使用循环的 else
子句来处理元素尚未在任何组件中的情况(如果循环结束时没有 break
,则执行 else
)。
我提出了 2 种不同的解决方案:
我更喜欢的第一个是将每条记录与父项链接起来。然后当然在层次结构中进一步导航,直到元素映射到自身。
所以代码是:
def build_mapping(input_pairs):
mapping = {}
for pair in input_pairs:
left = pair[0]
right = pair[1]
parent_left = None if left not in mapping else mapping[left]
parent_right = None if right not in mapping else mapping[right]
if parent_left is None and parent_right is None:
mapping[left] = left
mapping[right] = left
continue
if parent_left is not None and parent_right is not None:
if parent_left == parent_right:
continue
top_left_parent = mapping[parent_left]
top_right_parent = mapping[parent_right]
while top_left_parent != mapping[top_left_parent]:
mapping[left] = top_left_parent
top_left_parent = mapping[top_left_parent]
mapping[top_left_parent] = top_right_parent
mapping[left] = top_right_parent
continue
if parent_left is None:
mapping[left] = parent_right
else:
mapping[right] = parent_left
return mapping
def get_groups(input_pairs):
mapping = build_mapping(input_pairs)
groups = {}
for elt, parent in mapping.items():
if parent not in groups:
groups[parent] = set()
groups[parent].add(elt)
return list(groups.values())
因此,使用以下输入:
groups = get_groups([('A', 'B'), ('A', 'C'), ('D', 'A'), ('E', 'F'),
('F', 'C'), ('G', 'H'), ('I', 'J'), ('K', 'L'),
('L', 'M'), ('M', 'N')])
我们得到:
[{'A', 'B', 'C', 'D', 'E', 'F'}, {'G', 'H'}, {'I', 'J'}, {'K', 'L', 'M', 'N'}]
第二种可能效率较低的解决方案是:
def get_groups_second_method(input_pairs):
groups = []
for pair in input_pairs:
left = pair[0]
right = pair[1]
left_group = None
right_group = None
for i in range(0, len(groups)):
group = groups[i]
if left in group:
left_group = (group, i)
if right in group:
right_group = (group, i)
if left_group is not None and right_group is not None:
merged = right_group[0].union(left_group[0])
groups[right_group[1]] = merged
groups.pop(left_group[1])
continue
if left_group is None and right_group is None:
new_group = {left, right}
groups.append(new_group)
continue
if left_group is None:
right_group[0].add(left)
else:
left_group[0].add(right)
return groups
您也可以使用 networkx 作为依赖项。
import networkx as nx
pairs = [(1,62),
(1,192),
(1,168),
(64,449),
(263,449),
(192,289),
(128,263),
(128,345),
(3,10),
(10,11)]
G = nx.Graph()
G.add_edges_from(pairs)
list(nx.connected_components(G))
给定 n 个元组,编写一个函数,该函数将 return 一个包含连接值的列表。
示例:
pairs = [(1,62),
(1,192),
(1,168),
(64,449),
(263,449),
(192,289),
(128,263),
(128,345),
(3,10),
(10,11)
]
结果:
[[1,62,192,168,289],
[64,449,263,128,345,449],
[3,10,11]]
我相信它可以使用图或树作为数据结构来解决,为每个值创建节点,并为每对创建边,每对树或图表示连接的值,但我还没有找到解决方案。
在 python 中产生一个结果的最佳方法是什么,该结果会产生这些对的连接值列表?
您可以通过 Disjoint Set (Union-Find) 实施来解决它。
用所有数字初始化结构djs
。然后对于每个元组(x,y)
,调用djs.merge(x,y)
。现在,对于每个数字 x
,为其创建一个新集合当且仅当 djs.sameSet(x,)==false
来自每个现有集合的任意 y
。
也许that可以帮到你。
你似乎有一个图(以边列表的形式)可能不是一个整体("connected")你想把它分成几个部分("components").
一旦我们用图形语言考虑它,我们就大功告成了。我们可以保留到目前为止我们找到的所有组件的列表(这些将是节点集),如果它的伙伴已经存在,则将一个节点添加到该集合中。否则,为这对制作一个新组件。
def graph_components(edges):
"""
Given a graph as a list of edges, divide the nodes into components.
Takes a list of pairs of nodes, where the nodes are integers.
Returns a list of sets of nodes (the components).
"""
# A list of sets.
components = []
for v1, v2 in edges:
# See if either end of the edge has been seen yet.
for component in components:
if v1 in component or v2 in component:
# Add both vertices -- duplicates will vanish.
component.add(v1)
component.add(v2)
break
else:
# If neither vertex is already in a component.
components.append({v1, v2})
return components
我使用了奇怪的 for ... else
构造来实现这个功能——如果 break
语句在 [=] 期间未达到,则执行 else
15=]。内部循环也可以是一个返回 True
或 False
.
编辑:正如 Francis Colas 指出的那样,这种方法太贪心了。这是一个完全不同的方法(非常感谢 Edward Mann this 漂亮的 DFS 实现)。
这种方法基于构建一个图,然后对其进行遍历,直到我们 运行 离开未访问的节点。它应该 运行 在线性时间内(O(n) 构建图形,O(n) 进行所有遍历,我相信 O(n) 只是做集合差异)。
from collections import defaultdict
def dfs(start, graph):
"""
Does depth-first search, returning a set of all nodes seen.
Takes: a graph in node --> [neighbors] form.
"""
visited, worklist = set(), [start]
while worklist:
node = worklist.pop()
if node not in visited:
visited.add(node)
# Add all the neighbors to the worklist.
worklist.extend(graph[node])
return visited
def graph_components(edges):
"""
Given a graph as a list of edges, divide the nodes into components.
Takes a list of pairs of nodes, where the nodes are integers.
"""
# Construct a graph (mapping node --> [neighbors]) from the edges.
graph = defaultdict(list)
nodes = set()
for v1, v2 in edges:
nodes.add(v1)
nodes.add(v2)
graph[v1].append(v2)
graph[v2].append(v1)
# Traverse the graph to find the components.
components = []
# We don't care what order we see the nodes in.
while nodes:
component = dfs(nodes.pop(), graph)
components.append(component)
# Remove this component from the nodes under consideration.
nodes -= component
return components
我不知道这个问题已经有了名字(感谢 avim!),所以我继续天真地解决了它。
此解决方案与 Eli Rose 的解决方案有些相似。不过,我决定 post 它,因为它对于大型对列表更有效,因为 lists_by_element
字典跟踪元素所在的列表,允许我们避免每次我们需要添加新项目时,遍历所有列表及其项目。
代码如下:
def connected_tuples(pairs):
# for every element, we keep a reference to the list it belongs to
lists_by_element = {}
def make_new_list_for(x, y):
lists_by_element[x] = lists_by_element[y] = [x, y]
def add_element_to_list(lst, el):
lst.append(el)
lists_by_element[el] = lst
def merge_lists(lst1, lst2):
merged_list = lst1 + lst2
for el in merged_list:
lists_by_element[el] = merged_list
for x, y in pairs:
xList = lists_by_element.get(x)
yList = lists_by_element.get(y)
if not xList and not yList:
make_new_list_for(x, y)
if xList and not yList:
add_element_to_list(xList, y)
if yList and not xList:
add_element_to_list(yList, x)
if xList and yList and xList != yList:
merge_lists(xList, yList)
# return the unique lists present in the dictionary
return set(tuple(l) for l in lists_by_element.values())
这是它的工作原理:http://ideone.com/tz9t7m
另一个比 wOlf 更紧凑但处理合并的解决方案与 Eli 的相反:
def connected_components(pairs):
components = []
for a, b in pairs:
for component in components:
if a in component:
for i, other_component in enumerate(components):
if b in other_component and other_component != component: # a, and b are already in different components: merge
component.extend(other_component)
components[i:i+1] = []
break # we don't have to look for other components for b
else: # b wasn't found in any other component
if b not in component:
component.append(b)
break # we don't have to look for other components for a
if b in component: # a wasn't in in the component
component.append(a)
break # we don't have to look further
else: # neither a nor b were found
components.append([a, b])
return components
请注意,当我在组件中找到元素时,我依赖于跳出循环,以便我可以使用循环的 else
子句来处理元素尚未在任何组件中的情况(如果循环结束时没有 break
,则执行 else
)。
我提出了 2 种不同的解决方案:
我更喜欢的第一个是将每条记录与父项链接起来。然后当然在层次结构中进一步导航,直到元素映射到自身。
所以代码是:
def build_mapping(input_pairs):
mapping = {}
for pair in input_pairs:
left = pair[0]
right = pair[1]
parent_left = None if left not in mapping else mapping[left]
parent_right = None if right not in mapping else mapping[right]
if parent_left is None and parent_right is None:
mapping[left] = left
mapping[right] = left
continue
if parent_left is not None and parent_right is not None:
if parent_left == parent_right:
continue
top_left_parent = mapping[parent_left]
top_right_parent = mapping[parent_right]
while top_left_parent != mapping[top_left_parent]:
mapping[left] = top_left_parent
top_left_parent = mapping[top_left_parent]
mapping[top_left_parent] = top_right_parent
mapping[left] = top_right_parent
continue
if parent_left is None:
mapping[left] = parent_right
else:
mapping[right] = parent_left
return mapping
def get_groups(input_pairs):
mapping = build_mapping(input_pairs)
groups = {}
for elt, parent in mapping.items():
if parent not in groups:
groups[parent] = set()
groups[parent].add(elt)
return list(groups.values())
因此,使用以下输入:
groups = get_groups([('A', 'B'), ('A', 'C'), ('D', 'A'), ('E', 'F'),
('F', 'C'), ('G', 'H'), ('I', 'J'), ('K', 'L'),
('L', 'M'), ('M', 'N')])
我们得到:
[{'A', 'B', 'C', 'D', 'E', 'F'}, {'G', 'H'}, {'I', 'J'}, {'K', 'L', 'M', 'N'}]
第二种可能效率较低的解决方案是:
def get_groups_second_method(input_pairs):
groups = []
for pair in input_pairs:
left = pair[0]
right = pair[1]
left_group = None
right_group = None
for i in range(0, len(groups)):
group = groups[i]
if left in group:
left_group = (group, i)
if right in group:
right_group = (group, i)
if left_group is not None and right_group is not None:
merged = right_group[0].union(left_group[0])
groups[right_group[1]] = merged
groups.pop(left_group[1])
continue
if left_group is None and right_group is None:
new_group = {left, right}
groups.append(new_group)
continue
if left_group is None:
right_group[0].add(left)
else:
left_group[0].add(right)
return groups
您也可以使用 networkx 作为依赖项。
import networkx as nx
pairs = [(1,62),
(1,192),
(1,168),
(64,449),
(263,449),
(192,289),
(128,263),
(128,345),
(3,10),
(10,11)]
G = nx.Graph()
G.add_edges_from(pairs)
list(nx.connected_components(G))