Java: How to mitigate hash collisions during insertion into a Hash Array Mapped Trie (HAMT)?
I'm currently implementing a Hash Array Mapped Trie (HAMT) in Java and I've run into the following issue. When inserting new key-value pairs, there will obviously be collisions. In the original paper, the author suggests:
The existing key is then inserted in the new sub-hash table and the new key added. Each time 5 more bits of the hash are used the probability of a collision reduces by a factor of 1/32. Occasionally an entire 32 bit hash may be consumed and a new one must be computed to differentiate the two keys.
...and also:
The hash function was tailored to give a 32 bit hash. The algorithm requires that the hash can be extended to an arbitrary number of bits. This was accomplished by rehashing the key combined with an integer representing the trie level, zero being the root. Hence if two keys do give the same initial hash then the rehash has a probability of 1 in 2^32 of a further collision.
So I tried this in Java with strings. It is known that:
"Ea".hashCode() == "FB".hashCode(); // true
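... because of the String#hashCode() algorithm. Following the advice in the paper and extending the string with the tree depth to produce another, non-colliding hash code sadly does not work: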
"Ea1".hashCode() == "FB1".hashCode(); // :( still the same!!
The above holds for any integer you might append to the strings: their hash codes will always collide.
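The reason, as far as I can tell, is that String#hashCode() is a polynomial in 31, so hash(s + t) equals hash(s) * 31^length(t) + hash(t) in wrapping int arithmetic. Two strings that already collide therefore keep colliding under any common suffix. A small check of this, where the class and method names are made up for illustration:

```java
final class SuffixCollisionDemo {
    /** True if appending the trie level to both keys still yields equal hash codes. */
    static boolean collidesAtLevel(String a, String b, int level) {
        return (a + level).hashCode() == (b + level).hashCode();
    }

    public static void main(String[] args) {
        // "Ea" and "FB" both hash to 2236, since 69*31 + 97 == 70*31 + 66.
        System.out.println("Ea".hashCode() + " " + "FB".hashCode());
        // Appending the same level to both keys never separates them.
        for (int level = 0; level < 7; level++) {
            System.out.println(level + ": " + collidesAtLevel("Ea", "FB", level));
        }
    }
}
```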
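My question is: how do you resolve this situation? There is this answer to a very similar question, but the discussion there offers no real solution. So how do we do this...?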
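You have to implement the equals() method to compare whether the values are equal.
The hash code only sorts the data into groups within the collection, which is useful for making a lookup such as binarySearch work. But hashCode() is nothing without equals().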
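In HAMT terms, one possible reading of this is sketched below, under stated assumptions. The rehash for deeper levels uses a function other than String#hashCode() (here FNV-1a over the characters, seeded by the trie level; the seed mixing constant is an arbitrary choice), and keys that still share a hash land in a small bucket that tells them apart with equals(). `CollisionBucket` and `rehash` are hypothetical names, and the sketch handles String keys only; it is not the paper's exact scheme.

```java
import java.util.ArrayList;
import java.util.List;

final class CollisionBucket<V> {
    private final List<String> keys = new ArrayList<>();
    private final List<V> values = new ArrayList<>();

    /** Hypothetical level-seeded rehash: FNV-1a over the key's chars. */
    static int rehash(String key, int level) {
        int h = 0x811C9DC5 ^ (level * 0x9E3779B9); // level-dependent seed (assumed constant)
        for (int i = 0; i < key.length(); i++) {
            h ^= key.charAt(i);
            h *= 16777619; // FNV prime; int overflow wraps, which is intended
        }
        return h;
    }

    /** Inserts or replaces; returns the previous value, or null if the key is new. */
    V put(String key, V value) {
        int i = keys.indexOf(key); // indexOf compares with equals()
        if (i >= 0) {
            return values.set(i, value);
        }
        keys.add(key);
        values.add(value);
        return null;
    }

    /** Returns the value stored for key, or null if absent. */
    V get(String key) {
        int i = keys.indexOf(key);
        return i >= 0 ? values.get(i) : null;
    }

    int size() {
        return keys.size();
    }

    public static void main(String[] args) {
        // The seeded rehash separates the pair that String#hashCode() cannot.
        System.out.println(rehash("Ea", 1) == rehash("FB", 1)); // false
        // If keys ever do share every hash, equals() still distinguishes them.
        CollisionBucket<Integer> bucket = new CollisionBucket<>();
        bucket.put("Ea", 1);
        bucket.put("FB", 2);
        System.out.println(bucket.get("Ea") + " " + bucket.get("FB")); // 1 2
    }
}
```

A different hash function only makes a repeated collision unlikely, not impossible, so the equals() comparison in the bucket remains the final arbiter.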