洗牌后的不同结果

Question

我想把一个不完整的图分成独立的、不相连的物体。图的边在列表 edges 中。

代码在打乱边的顺序后给出了不同的结果。这是为什么？

from random import shuffle

edges = [('7', '9'), ('2', '8'), ('4', '10'), ('5', '9'), ('1', '2'), ('1', '6'), ('6', '10')]
bodylist = []
shuffle(edges)

for edge in edges:
    #If at least one node of the edge is anywhere in bodylist, append the new nodes to that list.
    try:
        index = [i for i, body in enumerate(bodylist) if edge[0] in body or edge[1] in body][0]
        bodylist[index].append(edge[0])
        bodylist[index].append(edge[1])
    #If not, make a new list containing the new nodes.
    except:
        bodylist.append([edge[0], edge[1]])

print([set(x) for x in bodylist])

预期输出：[{'1', '2', '8', '4', '6', '10'}, {'9', '5', '7'}]

一些实际输出：[{'9', '5', '7'}, {'1', '2', '8'}, {'10', '4', '6', '1'}]

[{'9', '7', '5'}, {'6', '2', '1', '8'}, {'6', '10', '4'}]

请注意，预期的输出也会不时出现。（应该一直如此）

我也会欣赏不同的方法，因为这个可能不是最好的方法。

Answer 1

这是因为你的算法有误。您的算法的问题在于它取决于我们开始创建主体列表时使用的边。为了解释这一点，让我们举一个简单的例子，一个有 4 条边的图 - edges = [('1','2'),('2','3'),('3','4'),('1','4')] .

第一个案例 -

>>> edges = [('1','2'),('2','3'),('3','4'),('1','4')]
>>> bodylist = []
>>> for edge in edges:
...     #If at least one node of the edge is anywhere in bodylist, append the new nodes to that list.
...     try:
...         index = [i for i, body in enumerate(bodylist) if edge[0] in body or edge[1] in body][0]
...         bodylist[index].append(edge[0])
...         bodylist[index].append(edge[1])
...     #If not, make a new list containing the new nodes.
...     except:
...         bodylist.append([edge[0], edge[1]])
...
>>> print([set(x) for x in bodylist])
[{'4', '1', '3', '2'}]

你得到一个带有顶点的 body - 1, 2, 3, 4 。为什么？

因为您从 (1,2) 开始将其添加到 body 列表中。然后你拿了 (2,3) ，你看到 2 已经存在于 body 列表中添加的项目中，你再次将它添加到同一个列表中，然后继续并且所有都添加到相同的 body.

现在，让我们以不同的顺序获取相同的边 - edges = [('1','2'),('3','4'),('2','3'),('1','4')]。结果是 -

>>> edges = [('1','2'),('3','4'),('2','3'),('1','4')]
>>> bodylist = []
>>> .... #same logic
>>> print([set(x) for x in bodylist])
[{'4', '1', '3', '2'}, {'4', '3'}]

正如你这次看到的，你有两个不同的身体（显然它们是错误的）为什么？

你再次从 (1,2) 开始，将其作为 body 添加到 body 列表中，然后你使用 (3,4) ，检查这个，你会看到 none 的顶点已经存在于任何 body 中，因此您将其添加到单独的 body 中。以第三个元素 (2,3) 为例，您会看到它在第一个和第二个 body 中都存在，但您的逻辑是只采用第一个 body 并将元素添加到其中。（你知道你错在哪里了吗？）

这就是为什么您在打乱列表时得到不同结果的原因，因为顺序对您的逻辑很重要（这是错误的）。

你需要做的是，如果你在多个物体中找到一条边的顶点，你需要将两个物体合并成一个body。

另一个建议，我们不需要将列表添加到 bodylist 中，而是我们可以为每个 body 使用 sets。

示例解决方案如下所示 -

from random import shuffle

edges = [('7', '9'), ('2', '8'), ('4', '10'), ('5', '9'), ('1', '2'), ('1', '6'), ('6', '10')]
bodylist = []
shuffle(edges)

for edge in edges:
    bodies = [body for i, body in enumerate(bodylist) if edge[0] in body or edge[1] in body]
    if len(bodies) > 0:
        tempset = {edge[0],edge[1]}
        for x in bodies:
            tempset.update(x)
            print('tempset :',tempset)
            bodylist.remove(x)
        bodylist.append(tempset)
    else:
        bodylist.append({edge[0],edge[1]})

print([set(x) for x in bodylist])

Answer 2

假设您有三个边 [(1, 2), (3, 4), (2, 3)]。这描述了一个连通图。

但是，您的代码将首先检查 (1, 2)，未发现任何内容，因此将其添加到 bodylist。

然后，它将查找 (3, 4)，找不到 3 或 4，因此将其添加为新列表。

最后它会将(2, 3)添加到第一个列表但是它不会回来修正它的错误，它不会意识到(3, 4)属于同一个body。

为了解决这个问题，您可以在每次向主体添加新边时完全遍历剩余的边，以检查是否存在连接：

while edges:
    current_edge = edges.pop(0)
    body = {current_edge[0], current_edge[1]}
    i = 0
    while i < len(edges):
        if edges[i][0] in body or edges[i][1] in body:
            body.add(edges[i][0])
            body.add(edges[i][1])
            edges.pop(i) # Edge added, no need to check it again
            i = 0 # Restart the loop
        else:
            i += 1
    bodylist.append(body)

你要找的是一个叫做connected component的图。

如果您正在寻找高效的算法，您应该看看 this answer。

洗牌后的不同结果

Different result upon shuffling a list

python

algorithm

shuffle

graph

python-3.x