在 JRE 中使用什么算法将 ArrayList<T> 转换为 LinkedHashSet<T>

Question

我想从具有重复元素的 list 中获取唯一元素的 list 并且应保持元素在列表中出现的顺序。

为了实现这一点，我可以编写如下算法：

private ArrayList<T> getUnique(ArrayList<T> list)
{
    // maintain a hashmap of numbers and a uniqueList to be returned(ArrayList<T>)
    // Add element in result list and the hashmap if the element isn't already present in the hashmap, else just add in the hashmap

    HashMap<T, Boolean> map = new HashMap<>();
    ArrayList<T> uniqueList = new ArrayList<>();

    for (T t: list)
    {
        if (map.get(t) == null)
        {
            // t wasn't present so, adding them in map as well as in the list
            map.put(t, true);
            uniqueList.add(t);
        }
    }
    return uniqueList;
}

此算法将花费 O(n) 时间和 O(n) 额外的 space（对于 HashMap）。

或者简单地说，我可以使用以下语法：

Set<T> set = new LinkedHashSet<>(list);

上述Java中的语法用于从list中获取set个唯一元素，元素出现顺序与list相同。然后将此集合转换为列表。 (ArrayList<T> uniqueList = new ArrayList<>(set);)

我假设这里的时间复杂度也是O(n)。我想知道 Java 使用什么算法。

我看到 class 被命名为 LinkedHashSet，所以我认为他们可能使用了一些 LinkedList 概念来实现这一点，所以我查看了源代码，发现了这些东西：

在 LinkedHashSet.java 中，构造函数如下所示：

143: public LinkedHashSet(Collection<? extends T> c) 144: { 145: super(c); 146: } here 是来源。

所以，我查看了 parent class 构造函数，即 HashSet，我发现：

public HashSet(Collection<? extends E> c) { map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16)); addAll(c); }

接下来，我搜索了addAll方法，我在AbstractCollection class中找到了它（这是HashSet [=76的grandparent =]），函数的定义是：

public boolean addAll(Collection<? extends E> c) { boolean modified = false; for (E e : c) if (add(e)) modified = true; return modified; }

这是在调用 add，类似于：

public boolean add(E e) { throw new UnsupportedOperationException(); } here.

我无法理解这一点。他们使用什么算法来完成这项任务？

Answer 1

为了解决您的困惑，add 方法在 HashSet 中被重写如下：

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

请注意 LinkedHashSet 扩展 HashSet 扩展 AbstractSet 扩展 AbstractCollection.

综上所述，使用的算法是：

    for (E e : c)
        add(e);

对于 LinkedHashSet 是 O(N) 因为 add 对于 LinkedHashSet 的平均复杂度是 O(1).

Answer 2

这是LinkedHashSet构造函数：

public LinkedHashSet(Collection<? extends E> c) {
        super(Math.max(2*c.size(), 11), .75f, true);
        addAll(c);
    }

这是 java.util.AbstractCollection:

的 addAll 函数

public boolean addAll(Collection<? extends E> c) {
        boolean modified = false;
        for (E e : c)
            if (add(e))
                modified = true;
        return modified;
    }

这是来自 java.util.HashSet 的添加函数：

public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

如果您使用 Intellij 查找函数的源代码，那么容易。

Answer 3

对于寻找整个故事的人

基于 LinkedHashSet, HashSet, LinkedHashMap 的源代码。当构造一个 LinkedHashSet 扩展 HashSet 与其他集合（LinkedHashSet.java 第 143 行）时，

public LinkedHashSet(Collection<? extends T> c)  
{  
  super(c);  
}

哪个会调用（HashSet.java 第 136 行）：

public HashSet(Collection<? extends T> c)
{
  this(Math.max(2 * c.size(), HashMap.DEFAULT_CAPACITY));
  addAll(c);
}

然后调用（HashSet.java第122行）：

public HashSet(int initialCapacity, float loadFactor)
{
  map = init(initialCapacity, loadFactor);
}

由于 init 方法在 LinkedHashSet

中被覆盖

HashMap<T, String> init(int capacity, float load)
{
 return new LinkedHashMap<T, String>(capacity, load);
}

后盾 map 是 LinkedHashMap。

根据 LinkedHashMap

的 java 文档

This class provides all of the optional Map operations, and permits null elements. Like HashMap, it provides constant-time performance for the basic operations (add, contains and remove), assuming the hash function disperses elements properly among the buckets. Performance is likely to be just slightly below that of HashMap, due to the added expense of maintaining the linked list, with one exception: Iteration over the collection-views of a LinkedHashMap requires time proportional to the size of the map, regardless of its capacity. Iteration over a HashMap is likely to be more expensive, requiring time proportional to its capacity.

而HashSet的add方法是

public boolean add(E e) {
   return map.put(e, PRESENT)==null;
}

因此构造的平均时间复杂度为O(n)。算法方面，我想你可以去看看LinkedHashMap的代码了解一下。延伸阅读 How is the internal implementation of LinkedHashMap different from HashMap implementation?, HashSet vs LinkedHashSet

在 JRE 中使用什么算法将 ArrayList<T> 转换为 LinkedHashSet<T>

What algorithm is used for converting an ArrayList<T> to a LinkedHashSet<T> in JRE

java

algorithm

hashmap

set

linkedhashset

对于寻找整个故事的人