Java HashSet实现设计

Question

编辑问题去掉我的类比直接提问。 JDK HashSet 实现如下：

public class HashSet {
  private HashMap map;
  public HashSet(int capacity) {
    map = new HashMap(capacity);
  }

  public HashSet(int capacity, float loadFactor) {
    map = new HashMap(capacity, loadFactor);
  }

  HashSet(int capacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<>(capacity, loadFactor);
  }
}

而LinkedHashSet是这样实现的：

public class LinkedHashSet
  extends HashSet {
  public LinkedHashSet(int capacity) {
    super(capacity, 0, true);
  }
  public LinkedHashSet(int capacity, float loadFactor) {
    super(capacity, loadFactor, true);
  }
}

class HashSet 中的第三个构造函数：HashSet(int capacity, loadFactor, boolean dummy) 的存在只是为了帮助 LinkedHashSet class 使用 LinkedHashMap 作为支持映射而不是默认 HashMap

问题：

这会被认为是一个好的设计吗？
让 subclass 指定支持 map 类型不是更好吗？
我花了30分钟才在JDK源码中定位到LinkedHashSet的双向链表到底在哪里因为我没有直观的想到上面的实现范式，什么时候这样的设计是个不错的选择？

Answer 1

你是对的，这不是最佳设计选择。

总结一下：

HashSet 有 3 个构造函数：

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * the specified initial capacity and the specified load factor.
 *
 * @param      initialCapacity   the initial capacity of the hash map
 * @param      loadFactor        the load factor of the hash map
 * @throws     IllegalArgumentException if the initial capacity is less
 *             than zero, or if the load factor is nonpositive
 */
public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * the specified initial capacity and default load factor (0.75).
 *
 * @param      initialCapacity   the initial capacity of the hash table
 * @throws     IllegalArgumentException if the initial capacity is less
 *             than zero
 */
public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}

/**
 * Constructs a new, empty linked hash set.  (This package private
 * constructor is only used by LinkedHashSet.) The backing
 * HashMap instance is a LinkedHashMap with the specified initial
 * capacity and the specified load factor.
 *
 * @param      initialCapacity   the initial capacity of the hash map
 * @param      loadFactor        the load factor of the hash map
 * @param      dummy             ignored (distinguishes this
 *             constructor from other int, float constructor.)
 * @throws     IllegalArgumentException if the initial capacity is less
 *             than zero, or if the load factor is nonpositive
 */
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<>(initialCapacity, loadFactor);
}

最后一个有一个额外的参数 dummy，它没有被使用，只是为了让编译器区分具有相同参数的 2 个构造函数。唯一的区别是更改了支持映射实现。

自从编写了这些类之后，我们在对象设计方面取得了长足的进步。

如果今天重写它，可能会有一个采用 Map 的构造函数，以便任何 Map 实现都可以用作后备存储：

HashSet(Map<K,V> backingMap);

And/or 可以有多个名称不同但参数相同的静态工厂方法

 public static HashSet create(int initialCapacity, float loadFactor)

 public static HashSet createLinked(int initialCapacity, float loadFactor)

编辑

JB Nizet 提出了一个有趣的观点。 HashSet 的 readObject() 方法在从 ObjectInputStream 重建对象以查看 "this" 实例是否属于 LinkedHashSet 时具有显式代码。

// Create backing HashMap
    map = (((HashSet<?>)this) instanceof LinkedHashSet ?
           new LinkedHashMap<E,Object>(capacity, loadFactor) :
           new HashMap<E,Object>(capacity, loadFactor));

这真的是可能的，因为构造函数的虚拟参数版本是包私有的，所以这些是目前仅有的 2 种可能的实现。如果没有这种技术，您将无法正确使用 ReadObject()。

这可能就是 Josh Bloch 在 Effective Java 中写连载章节的原因。您可能必须使用 SerializationProxy（项目 78）才能正确读取 LinkedHashSet。

Java HashSet实现设计

Java HashSet Implementation design

java

oop

hashset

linkedhashset