最小化 JDK8 ConcurrentHashMap 检查和设置操作的锁定范围

Question

1.

我有多个线程更新一个 ConcurrentHashMap。每个线程根据键将一个整数列表附加到映射条目的值。任何线程都没有删除操作。

这里的重点是我想尽可能的减少锁和同步的范围

我看到 computeIf...() 方法的文档说 "Some attempted update operations on this map by other threads may be blocked while computation is in progress"，这不太令人鼓舞。另一方面，当我查看它的源代码时，我没有观察到它在整个地图上的 locks/synchronizes 位置。

因此，我想知道使用 computeIf...() 和以下自制 'method 2' 的理论性能 的比较。

2.

此外，我觉得我在这里描述的问题可能是您可以在 ConcurrentHashMap 上执行的 最简化的 check-n-set（或通常 'compound'）操作之一。

然而我不是很自信，找不到太多关于如何在 ConcurrentHashMap 上进行这种简单的复合操作的指南，没有Locking/Synchronizing 在整个地图上。

因此，我们将不胜感激任何对此的一般性良好实践建议。

public void myConcurrentHashMapTest1() {

    ConcurrentHashMap<String, List<Integer>> myMap = new ConcurrentHashMap<String, List<Integer>>();

    // MAP KEY: a Word found by a thread on a page of a book 
    String myKey = "word1";

    // -- Method 1: 
    // Step 1.1 first, try to use computeIfPresent(). doc says it may lock the
    //      entire myMap. 
    myMap.computeIfPresent(myKey, (key,val) -> val.addAll(getMyVals()));
    // Step 1.2 then use computeIfAbsent(). Again, doc says it may lock the
    //      entire myMap. 
    myMap.computeIfAbsent(myKey, key -> getMyVals());    
}

public void myConcurrentHashMapTest2() {        
    // -- Method 2: home-grown lock splitting (kind of). Will it theoretically 
    //      perform better? 

    // Step 2.1: TRY to directly put an empty list for the key
    //      This may have no effect if the key is already present in the map
    List<Integer> myEmptyList = new ArrayList<Integer>();
    myMap.putIfAbsent(myKey, myEmptyList);

    // Step 2.2: By now, we should have the key present in the map
    //      ASSUMPTION: no thread does removal 
    List<Integer> listInMap = myMap.get(myKey);

    // Step 2.3: Synchronize on that list, append all the values 
    synchronized(listInMap){
        listInMap.addAll(getMyVals());
    }

}

public List<Integer> getMyVals(){
    // MAP VALUE: e.g. Page Indices where word is found (by a thread)
    List<Integer> myValList = new ArrayList<Integer>(); 
    myValList.add(1);
    myValList.add(2);

    return myValList;
}

Answer 1

您的假设（按预期使用 ConcurrentHashMap 对您来说太慢）是基于对 Java 文档的误解。 Java文档没有说明整个地图都将被锁定。它也没有说明每个 computeIfAbsent() 操作都执行悲观锁定。

实际上可以锁定的是一个 bin (a.k.a.bucket)，它对应于 ConcurrentHashMap 的内部数组支持中的单个元素。请注意，这不是 Java 7 包含多个桶的映射段。当这样的 bin 被锁定时，可能被阻止的操作只是更新散列到同一个 bin 的密钥。

另一方面，您的解决方案并不意味着避免 ConcurrentHashMap 内的所有内部锁定 - computeIfAbsent() 只是可以降级为使用 [=15= 的方法之一] 更新时阻止。即使是 putIfAbsent() 最初用于为某个键放置空列表的 putIfAbsent()，如果它没有命中空 bin，也可能会阻塞。

但更糟糕的是，您的解决方案不能保证 synchronized 批量更新的可见性。您可以保证 get() happens-before a putIfAbsent() 它观察到的值，但是没有 happens-before在您的批量更新和随后的 get().

之间

P.S。您可以在其 OpenJDK 实现中进一步阅读 ConcurrentHashMap 中的锁定：http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/concurrent/ConcurrentHashMap.java，第 313-352 行。

Answer 2

与一样，compute… 方法通常不会锁定整个地图。在最好的情况下，即不需要增加容量并且没有散列冲突，只锁定单个键的映射。

但是，还有一些事情您可以做得更好：

通常，避免执行多次查找。这适用于两种变体，使用 computeIfPresent 后跟 computeIfAbsent，以及使用 putIfAbsent 后跟 get
仍然建议尽量减少持有锁时执行的代码，即不要在持有锁时调用getMyVals()，因为它不依赖于地图的状态

放在一起，更新应该是这样的：

// compute without holding a lock
List<Integer> toAdd=getMyVals();
// update the map
myMap.compute(myKey, (key,val) -> {
    if(val==null) val=toAdd; else val.addAll(toAdd);
    return val;
});

或

// compute without holding a lock
List<Integer> toAdd=getMyVals();
// update the map
myMap.merge(myKey, toAdd, (a,b) -> { a.addAll(b); return a; });

可以简化为

myMap.merge(myKey, getMyVals(), (a,b) -> { a.addAll(b); return a; });

最小化 JDK8 ConcurrentHashMap 检查和设置操作的锁定范围

minimizing lock scope for JDK8 ConcurrentHashMap check-and-set operation

java

multithreading

atomic

concurrenthashmap

java-8