Hashtable 与 HashMap 中的哈希函数?

Hashing function in Hashtable vs. HashMap?

我知道Hashtable和HashMap的区别。然而,这两个 类 似乎都在使用 哈希函数 来完成工作。 Hashtable使用的哈希函数和HashMap使用的哈希函数有区别吗?

特别是他们使用的哈希算法有区别吗?这两个类的哈希公式是什么?

也就是说,索引(哈希值)的计算方式不同吗?

java.util.Hashtable<K,V> 就像 java.util.Vector<T>。它是在开发早期添加到 SDK 的 class,已被 HashMap<K,V> 取代(因为 ArrayList<T> 取代了 Vector<T>)。

所以你根本不应该使用它,除非你需要所有操作的隐式同步,默认情况下 Hashtable,但你仍然可以为此目的使用 Collections.synchronizedMapConcurrentHashMap<K,V>.

如 Javadoc 中所述:

As of the Java 2 platform v1.2, this class was retrofitted to implement the Map interface, making it a member of the Java Collections Framework. Unlike the new collection implementations, Hashtable is synchronized. If a thread-safe implementation is not needed, it is recommended to use HashMap in place of Hashtable. If a thread-safe highly-concurrent implementation is desired, then it is recommended to use ConcurrentHashMap in place of Hashtable.

两个 class 的散列应该相同,因为它们都将使用 int Object::hashCode 来达到目的。

In particular, is there a difference between the hashing algorithm they use? What is the formula used to hash in these two classes?

将对象用作散列 table 键时使用的主要散列函数是对象的 hashCode() 方法。关键是 class 来实现一个像样的散列函数。

HashtableHashMap classes 获取键的哈希码值并将其转换为主哈希table 链数组中的索引。但是,HashtableHashMap 之间发生这种情况的方式有所不同。

  • 对于Hashtable (Java 8) 代码是这样的:

     hash = key.hashCode();
     index = (hash & 0x7FFFFFFF) % tab.length;
    
  • 对于 HashMap (Java 8) 代码(有效地)是这样的:

     // (I have restructured the code for ease of comparison.)
     int h;
     hash = (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
     index = (tab.length - 1) & hash;
    

如您所见,HashMap正在对键的哈希码函数返回的哈希码值进行加扰。这个在源码中是这样解释的:

[This method] computes key.hashCode() and spreads (XORs) higher bits of hash to lower. Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide. (Among known examples are sets of Float keys holding consecutive whole numbers in small tables.) So we apply a transform that spreads the impact of higher bits downward. There is a tradeoff between speed, utility, and quality of bit-spreading. Because many common sets of hashes are already reasonably distributed (so don't benefit from spreading), and because we use trees to handle large sets of collisions in bins, we just XOR some shifted bits in the cheapest possible way to reduce systematic lossage, as well as to incorporate impact of the highest bits that would otherwise never be used in index calculations because of table bounds.

备注:

  1. &% 的区别是因为在 Hashtable 中哈希数组大小是质数,但在 HashMap 中(Java 8) 大小是2的幂.

  2. 在Java 8 HashMap中,如果键class实现了Comparable,实现会将长散列链变成二叉树。

  3. HashMap 处理 null 键,但 Hashtable 不处理。


然而,HashMap 中所有这些额外的复杂性只有在您的密钥 class 设计/实施不当的 hashCode() 方法时才会发挥作用……或者如果有人故意试图设计哈希冲突。

换句话说,如果您的密钥 class 设计得很好,差异 应该无关紧要