在 Kruskal 算法中使用 union-find 是否真的会影响最坏情况下的运行时间？

Does using union-find in Kruskal's algorithm actually affect worst-case runtime?

所以我正在自学一些图形算法，现在在 Kruskal 上，并且了解到建议使用 union-find 因此检查添加边是否创建循环只需要 O(Log V) 时间。出于实际目的，我明白你为什么想要这样做，但严格看大 O 符号，这样做真的会影响最坏情况的复杂性吗？

我的推理：如果我们使用 DFS 而不是 union find 来检查循环，那么运行时间将是 O(E+V)，并且您必须执行 V 次，运行时间为 O( V^2 + V）。它比 union find 复杂度更高，复杂度为 O(V * LogV)，但 Kruskal 的复杂度主要来自删除优先级队列 E 次的最小元素，即 O(E * logE)，Big O回答。我也没有真正看到 space 的优势，因为 union-find 需要 O(V) space 并且您需要维护以使用 DFS 查找循环的数据结构也是如此。

所以对一个简单问题的解释可能过于冗长：在 Kruskal 算法中使用 union-find 实际上会影响最坏情况下的运行时间吗？

and understand that it's recommended to use union-find so checking whether adding an edge creates a cycle only takes O(Log V) time

这是不对的。使用 union find 是 O(alpha(n) * m)，其中 alpha(n) 是 Ackermann 函数的反函数，并且出于所有意图和目的，可以被视为常数。比对数快得多：

Since alpha(n) is the inverse of this function, alpha(n) is less than 5 for all remotely practical values of n. Thus, the amortized running time per operation is effectively a small constant.

but the bulk of the complexity of Kruskal's comes from deleting the minimum element of the priority queue E times

这也是错误的。 Kruskal's algorithm 不涉及使用任何优先级队列。它涉及在开始时按成本对边缘进行排序。尽管复杂性仍然是您在这一步中提到的那个。但是，排序在实践中可能比优先级队列更快（使用优先级队列最多相当于堆排序，这不是最快的排序算法）。

底线，如果m是边数，n是节点数。:

边排序：O(m log m).
对于每条边，调用union-find：O(m * alpha(n))，或者基本上只是O(m).
总复杂度：O(m log m + m * alpha(n)).
如果您不使用联合查找，如果我们使用您的 O(n + m) 循环查找算法，则总复杂度将为 O(m log m + m * (n + m))。尽管此步骤的 O(n + m) 可能是轻描淡写，因为您还必须以某种方式更新您的结构（插入一条边）。朴素的不相交集算法实际上是O(n log n)，所以更糟。

注意：在这种情况下，如果您愿意，可以写log n而不是log m，因为m = O(n^2)和log(n^2) = 2log n.

结论：是的，union-find 帮助很多。

即使您使用 union-find 的 O(log n) 变体，这会导致 O(m log m + m log n) 总复杂度，您可以将其同化为 O(m log m)，实际上您更愿意如果可以的话，让第二部分更快。由于 union-find 非常容易实现，没有理由不这样做。

在 Kruskal 算法中使用 union-find 是否真的会影响最坏情况下的运行时间？

Does using union-find in Kruskal's algorithm actually affect worst-case runtime?

algorithm

time-complexity

graph-algorithm

data-structures