ConcurrentModificationException when updating stored Iterator (for LRU cache implementation)

Question

I'm trying to implement my own LRU cache. Yes, I know Java provides LinkedHashMap for this purpose, but I'm trying to implement it using basic data structures.

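From reading up on this topic, I understand that I need a HashMap for O(1) lookup of keys and a linked list to manage the "least recently used" eviction policy. I found these references, which all use a standard library hash map but implement their own linked list:

"What data structures are commonly used for LRU caches and quickly locating objects?" (stackoverflow.com)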
"What is the best way to Implement a LRU Cache?" (quora.com)
"Implement a LRU Cache in C++" (uml.edu)
"LRU Cache (Java)" (programcreek.com)

The hash table is supposed to store the linked list nodes directly. My cache should store Integer keys and String values.

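However, in Java the LinkedList collection does not expose its internal nodes, so I can't store them inside the HashMap. I could instead have the HashMap store indices into the LinkedList, but then getting to an item would take O(N) time. So I tried to store a ListIterator instead.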

import java.util.Map;
import java.util.HashMap;
import java.util.List;
import java.util.LinkedList;
import java.util.ListIterator;

public class LRUCache {

    private static final int DEFAULT_MAX_CAPACITY = 10;

    protected Map<Integer, ListIterator> _map = new HashMap<Integer, ListIterator>();
    protected LinkedList<String> _list = new LinkedList<String>();

    protected int _size = 0;
    protected int _maxCapacity = 0;

    public LRUCache(int maxCapacity) {
        _maxCapacity = maxCapacity;
    }

    // Put the key, value pair into the LRU cache.
    // The value is placed at the head of the linked list.
    public void put(int key, String value) {

        // Check to see if the key is already in the cache.
        ListIterator iter = _map.get(key);

        if (iter != null) {
            // Key already exists, so remove it from the list.
            iter.remove(); // Problem 1: ConcurrentModificationException!
        }

        // Add the new value to the front of the list.
        _list.addFirst(value);
        _map.put(key, _list.listIterator(0));

        _size++;

        // Check if we have exceeded the capacity.
        if (_size > _maxCapacity) {
            // Remove the least recently used item from the tail of the list.
            _list.removeLast();
        }
    }

    // Get the value associated with the key.
    // Move value to the head of the linked list.
    public String get(int key) {

        String result = null;
        ListIterator iter = _map.get(key);

        if (iter != null) {

            //result = iter
            // Problem 2: HOW DO I GET THE STRING FROM THE ITERATOR?

        }

        return result;
    }

    public static void main(String argv[]) throws Exception {
        LRUCache lruCache = new LRUCache(10);

        lruCache.put(10, "This");
        lruCache.put(20, "is");
        lruCache.put(30, "a");
        lruCache.put(40, "test");
        lruCache.put(30, "some"); // Causes ConcurrentModificationException
    }
}

So this leads to three problems:

Problem 1: I get a ConcurrentModificationException when I update the LinkedList using the iterator that I stored in the HashMap.

Exception in thread "main" java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:953)
    at java.util.LinkedList$ListItr.remove(LinkedList.java:919)
    at LRUCache.put(LRUCache.java:31)
    at LRUCache.main(LRUCache.java:71)

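Problem 2: How do I retrieve the value pointed to by the ListIterator? It seems I can only retrieve the next() value.

Problem 3: Is there any way to implement this LRU cache using the Java collections LinkedList, or do I really have to implement my own linked list?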

Answer 1

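1) This isn't really what iterators are for.

By contract, if you modify the list without going through the iterator, as you do here with

_list.addFirst(value);

then all open iterators on that list should throw ConcurrentModificationException. They were opened against a version of the list that no longer exists.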

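2) A LinkedList is not exactly a linked list of nodes. It is a java.util.List whose backing implementation is a doubly linked list of nodes. That List contract is why it does not expose references to the backing implementation, so operations like "remove this node, as a node, and move it to the head" are not available. This encapsulation is for your own protection (same as the concurrent modification exception): it lets your code rely on the List semantics of a LinkedList (iterability, for example) without worrying that some joker two cubicles over has been hacking away at its internals and broken the contract.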

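3) What you really need here is not a linked list. What you need is a stack that lets you move arbitrary entries to the head and dump the tail. You are implying that you want fast lookup of an arbitrary entry as well as fast removal and fast insertion, and you want to be able to find the tail at any moment in case you need to remove it.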

Fast lookup time == HashSomething

Fast add/remove of arbitrary elements == LinkedSomething

Fast addressing of the final element == SomekindaList

4) You will need to build your own linked structure... or use a LinkedHashMap.

PS LinkedHashSet is cheating; it is implemented using a LinkedHashMap.
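
For comparison, here is a minimal sketch of the LinkedHashMap route mentioned in point 4. The class name LinkedHashMapLruCache and the maxCapacity parameter are made up for illustration. The three-argument constructor (with accessOrder set to true) and the removeEldestEntry hook are part of java.util.LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name; a sketch of an LRU cache built on LinkedHashMap.
class LinkedHashMapLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxCapacity;

    LinkedHashMapLruCache(int maxCapacity) {
        // accessOrder = true: entries are ordered from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxCapacity = maxCapacity;
    }

    // Invoked by put() after a new entry is inserted; returning true
    // tells the map to drop its eldest (least recently accessed) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxCapacity;
    }
}
```

With a capacity of 2, putting keys 10 and 20, reading key 10, and then putting key 30 should evict key 20, because the read of key 10 made it more recent than 20.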

Answer 2

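I'll address question 3 first:

As you point out in your question, LinkedList (like all well-designed generic collections) hides the details of its implementation, such as the nodes that hold the links. In your case you need the hash map to reference those links directly as the values of the map. Doing otherwise (for example, adding indirection through a third class) would defeat the purpose of an LRU cache, which is to allow very low overhead on value access. But that is not possible with the standard Java collections: they don't (and shouldn't) provide direct access to their internal structures.

So the logical conclusion is that yes, you need to implement your own way of storing the order in which items in the cache were used. That doesn't have to be a doubly linked list. Those have traditionally been used for LRU caches because the most common operation is moving a node to the top of the list when it is accessed. In a doubly linked list that is a very cheap operation: only four nodes need to be relinked, with no memory allocation or freeing.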

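Questions 1 and 2:

The root cause is that you are trying to use iterators as cursors. They are designed to be created, stepped through to perform some operation, and then discarded. Even if you get past the problems you are having now, I expect more problems behind them. You're putting a square peg in a round hole.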
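
To make the failure concrete, here is a small sketch that reproduces it outside the cache. The class and method names are made up for illustration. It also touches on question 2: a ListIterator sits between elements, so there is no "current element" accessor, only next() and previous().

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.ListIterator;

// Hypothetical demo class: shows a stored iterator going stale.
class StaleIteratorDemo {
    // Returns true if the stored iterator fails once the list is modified directly.
    static boolean storedIteratorFails() {
        LinkedList<String> list = new LinkedList<>();
        list.add("This");
        list.add("is");

        ListIterator<String> stored = list.listIterator(0);
        // next() returns the element the iterator steps over ("This" here).
        stored.next();

        // Structural modification that bypasses the iterator.
        list.addFirst("test");

        try {
            // The iterator compares its expected modification count with the list's.
            stored.remove();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storedIteratorFails());
    }
}
```

No second thread is involved: the exception signals that the list changed behind the iterator's back, which is what happens to every iterator kept in the HashMap as soon as addFirst or removeLast is called on the list.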

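So my conclusion is that you need to implement your own way of holding the values in a class that keeps track of the order of access. It can be incredibly simple, though: only three operations are needed: create, get value, and remove from the tail. Both create and get value must move the node to the head of the list. No inserting into or deleting from the middle of the list. No deleting the head. No searching. Honestly, dead simple.

Hope this gets you started :-)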

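import java.util.HashMap;
import java.util.Map;

// Sketch of an LRU map: a HashMap for O(1) lookup plus a hand-rolled
// doubly linked list that tracks access order (head = most recently used).
public class LRU_Map<K, V> {
    private class Node {
        private final K key; // kept so that eviction can also remove the map entry
        private final V value;
        private Node prev = null;
        private Node next = null;

        public Node(K key, V value) {
            this.key = key;
            this.value = value;
            moveToHead();
        }

        public V getValue() {
            touch();
            return value;
        }

        // Mark this node as the most recently used one.
        private void touch() {
            if (head != this) {
                unlink();
                moveToHead();
            }
        }

        // Detach this node from the list, fixing up head and tail as needed.
        private void unlink() {
            if (head == this)
                head = next;
            if (tail == this)
                tail = prev;
            if (prev != null)
                prev.next = next;
            if (next != null)
                next.prev = prev;
            prev = null;
            next = null;
        }

        // Insert this (detached) node at the head of the list.
        private void moveToHead() {
            prev = null;
            next = head;
            if (head != null)
                head.prev = this;
            head = this;
            if (tail == null)
                tail = this;
        }
    }

    private final int maxSize; // expected to be at least 1
    private final Map<K, Node> map = new HashMap<>();
    private Node head = null;
    private Node tail = null;

    public LRU_Map(int maxSize) {
        this.maxSize = maxSize;
    }

    public void put(K key, V value) {
        Node old = map.remove(key);
        if (old != null) {
            // Replacing an existing entry: drop its old node from the list.
            old.unlink();
        } else if (map.size() >= maxSize && tail != null) {
            // Cache is full: evict the least recently used entry.
            Node eldest = tail;
            eldest.unlink();
            map.remove(eldest.key);
        }
        map.put(key, new Node(key, value));
    }

    public V get(K key) {
        Node node = map.get(key);
        return node == null ? null : node.getValue();
    }

    // and so on for other Map methods (then declare "implements Map<K,V>")
}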

Answer 3

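Another way to skin this cat would be to implement a very simple class that extends LinkedList, but runs any modification to the list (e.g. add, remove, etc.) inside a "synchronized" block. You'll need to run your HashMap pseudo-pointer through get() every time, but it should work just fine. For example: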

...
private final Object lock = new Object(); // monitor guarding list modifications

// override LinkedList's implementations (E is the list's element type)...
@Override
public E remove(int index) { synchronized (lock) { return super.remove(index); } }
...

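If you have Eclipse or IntelliJ IDEA, you should be able to auto-generate the method stubs you need almost instantly, and you can then evaluate which ones need to be locked.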
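
A slightly fuller sketch of that idea might look like the following. The class name LockedLinkedList and the choice of which methods to override are assumptions made for illustration, and only a handful of mutators are covered. Note that the lock serializes access from multiple threads; index-based get(int) on a LinkedList still walks the list, so lookups stay O(N), and iterators opened before a modification remain fail-fast.

```java
import java.util.LinkedList;

// Hypothetical class name; only a few of LinkedList's methods are overridden here.
class LockedLinkedList<E> extends LinkedList<E> {
    private final Object lock = new Object(); // monitor guarding list access

    @Override
    public void addFirst(E e) {
        synchronized (lock) { super.addFirst(e); }
    }

    @Override
    public E remove(int index) {
        synchronized (lock) { return super.remove(index); }
    }

    @Override
    public E removeLast() {
        synchronized (lock) { return super.removeLast(); }
    }

    @Override
    public E get(int index) {
        synchronized (lock) { return super.get(index); }
    }
}
```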

java

caching

linked-list

hashmap

lru