使用 hash-table 检查数组中是否有重复项

Check if there's a duplicate in an array using an hash-table

Goal: Check if there's a duplicate number in an array with the size of n.

基本上，如果我们可以使用散列-table（开放式散列，带链表），那么我们可以迭代数组并将数字插入到 table 中，其中包含一些值（可以1，并不重要）。迭代时，如果单元格不为空，则我们有一个重复的数字。

因为我们知道 read/write 的预期时间是 O(1) 那么算法的预期时间是 O(n).

问题 #1： 为什么最坏情况等于 O(nlogn)？
问题 #2：你会采用与建议的解决方案不同的做法吗？

在这里，我假设作者引用了散列的变体 table，其中每个 "bin" 都有一个 BST（或其他一些确定性 DS），因此在最坏的情况下，所有元素都被重复地插入到同一个容器中——这总共需要 O(nlogn)。
然而，hash tables 很少以这种方式实现，因为这种最坏的情况不太可能发生，并且在这种实现中实现了常规链表，对于这种情况 - 最坏的情况将是 O(n^2) 对于此解决方案。

解决此问题的另一种方法是排序，并迭代查找重复项（在排序数组中很容易），这是 O(nlogn) 内存使用量显着减少。

这个问题被称为 element distinctness problem，这两个选项（可能有一些变体）是解决它的方法。
已知 Omega(nlogn) 不使用额外的内存和散列。

使用 hash-table 检查数组中是否有重复项

Check if there's a duplicate in an array using an hash-table

arrays

algorithm

hash

time-complexity

data-structures