Memory Barrier vs CAS

Question

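I find that CAS will flush all CPU write caches to main memory. Is this similar to a memory barrier?

If this is true, does this mean CAS can make Java happens-before work?

In reply to the answers:

CAS is the CPU instruction. The barrier is a StoreLoad barrier, because what I care about is whether data written before the CAS can be read after the CAS.

More details: I have this question because I am writing a fork-join style construct in Java. The implementation looks like this: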

{
    //initialize result container
    Object[] result = new Object[n]; // n = number of workers (the size was omitted in the original)
    //worker finish state count
    AtomicInteger state = new AtomicInteger(result.length);
}

//worker thread i
{
  result[i] = new Object();
  //this is a CAS operation
  state.getAndDecrement();

  if(state.get() == 0){
     //do something using result array
  }
}

I want to know whether the (do something using result array) part can see all the result elements written by the other worker threads.

Answer 1

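I find that CAS will flush all CPU write caches to main memory. Is this similar to a memory barrier?

It depends on what you mean by CAS. (A specific hardware instruction? An implementation strategy used in the implementation of some Java class?)
It depends on what kind of memory barrier you are talking about. There are a number of different kinds ...
It is not necessarily true that a CAS instruction flushes all dirty cache lines. It depends on how a particular instruction set / hardware implements the CAS instruction.

It is not clear what you mean by "make happens-before work". Certainly, in some situations a CAS instruction provides the memory coherency properties needed for a specific happens-before relationship. But not necessarily for all relationships. It depends on how the hardware implements the CAS instruction.

To be honest, unless you are actually writing a Java compiler, you would do better not to try to understand what a JIT compiler needs to do to implement the complexities of the Java Memory Model. Just apply the happens-before rules.

UPDATE

Judging from your latest updates and comments, your actual question is about the behavior of AtomicInteger operations.

The memory semantics of the atomic types are specified in the package javadoc for java.util.concurrent.atomic as follows: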

The memory effects for accesses and updates of atomics generally follow the rules for volatiles, as stated in The Java Language Specification (17.4 Memory Model):

get has the memory effects of reading a volatile variable.

set has the memory effects of writing (assigning) a volatile variable.

lazySet has the memory effects of writing (assigning) a volatile variable except that it permits reorderings with subsequent (but not previous) memory actions that do not themselves impose reordering constraints with ordinary non-volatile writes. Among other usage contexts, lazySet may apply when nulling out, for the sake of garbage collection, a reference that is never accessed again.

weakCompareAndSet atomically reads and conditionally writes a variable but does not create any happens-before orderings, so provides no guarantees with respect to previous or subsequent reads and writes of any variables other than the target of the weakCompareAndSet.

compareAndSet and all other read-and-update operations such as getAndIncrement have the memory effects of both reading and writing volatile variables.

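As you can see, operations on Atomic types are specified to have memory semantics equivalent to those of volatile variables. This should be sufficient to reason about your use of Java atomic types ... without resorting to dubious analogies with CAS instructions and memory barriers.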
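
To make the volatile analogy concrete, here is a minimal sketch of publishing a plain field through an AtomicInteger. The class name AtomicPublishSketch and the fields payload and ready are made up for illustration; they are not from the question.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: all names here are hypothetical.
final class AtomicPublishSketch {
    // Plain (non-volatile) field; its visibility relies on the atomic below.
    static int payload;
    static final AtomicInteger ready = new AtomicInteger(0);

    static int runOnce() {
        payload = 0;
        ready.set(0);
        Thread writer = new Thread(() -> {
            payload = 42;  // ordinary write
            ready.set(1);  // has the memory effects of a volatile write
        });
        writer.start();
        // get() has the memory effects of a volatile read
        while (ready.get() == 0) {
            Thread.yield();
        }
        // The write of 42 happens-before this read, so 42 must be observed.
        int seen = payload;
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

The reasoning uses only the javadoc rules quoted above: set behaves like a volatile write, get like a volatile read, and program order connects each of them to the plain accesses around it.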

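Your example is incomplete, and it is difficult to understand what it is trying to do. Therefore, I can't comment on its correctness. However, you should be able to analyze it yourself using happens-before logic.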

Answer 2

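I find that CAS will flush all CPU write caches to main memory. Is this similar to a memory barrier?

On x86, a CAS in Java is implemented using a lock prefix; which instruction is actually used then depends on the type of CAS, but that isn't relevant to this discussion. A locked instruction is effectively a full barrier, so it includes all 4 fences: LoadLoad/LoadStore/StoreLoad/StoreStore. Since x86 already provides all of them except StoreLoad because of TSO, only the StoreLoad needs to be added, just as with a volatile write.

A StoreLoad doesn't force changes to be written to main memory; it only forces the CPU to wait with executing loads until the store buffer has been drained to the L1d. However, with MESI (Intel) based cache coherence protocols, it can happen that a cache line that is in MODIFIED state on a different CPU needs to be flushed to main memory before it can be returned as EXCLUSIVE. With MOESI (AMD) based cache coherence protocols, this is not an issue. If the cache line is already in MODIFIED or EXCLUSIVE state on the core doing the StoreLoad, the StoreLoad doesn't cause the cache line to be flushed to main memory. The cache is the source of truth.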

If this is true, does this mean CAS can make java Happens-Before work?

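From a memory model perspective, a successful CAS in Java is nothing more than a volatile read followed by a volatile write. So there is a happens-before relation between a volatile write of some field on some object instance and a subsequent volatile read of the same field on the same object instance.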
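
As an illustration of "volatile read, then volatile write", here is a hedged sketch that rebuilds getAndDecrement from get and compareAndSet. The class CasLoopSketch and its methods are hypothetical names; a real JDK may compile AtomicInteger.getAndDecrement to a single atomic instruction rather than a loop like this.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: not how any particular JDK implements AtomicInteger.
final class CasLoopSketch {
    // Same observable result as counter.getAndDecrement(): returns the previous value.
    static int getAndDecrementViaCas(AtomicInteger counter) {
        while (true) {
            int current = counter.get(); // volatile read
            // A successful compareAndSet acts as a volatile read plus a volatile write.
            if (counter.compareAndSet(current, current - 1)) {
                return current;
            }
            // Another thread changed the value in between; retry with a fresh read.
        }
    }

    // Starts at threads * perThread and lets every thread decrement perThread times.
    static int run(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(threads * perThread);
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    getAndDecrementViaCas(counter);
                }
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 0: no decrement was lost
    }
}
```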

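Since you are working with Java, I would focus on the Java Memory Model and not too much on how it is implemented in hardware. The JMM allows executions that can't be explained purely by thinking in terms of fences.

Regarding your example: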

result[i] = new Object();
//this is a CAS operation
state.getAndDecrement(); 

if(state.get() == 0){
   //do something using result array
}

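I'm not sure what the intended logic is. In your example, multiple threads can see the state being 0 at the same time, so all of them could start doing something with the array. If this behavior is undesirable, then it is caused by a race condition. I would use something like this: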

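result[i] = new Object();
//this is a CAS operation; decrementAndGet returns the updated value
//(getAndDecrement would return the previous value, which is 1 for the last worker)
int s = state.decrementAndGet();

if(s == 0){
   //do something using result array
}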

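Now the other question is whether there is a data race on the array content. There is a happens-before edge between the write to the array content and the write to 'state' (program order rule). There is a happens-before edge between the write and the read of state (volatile variable rule), and there is a happens-before edge between the read of state and the read of the array content (program order rule). So, because the happens-before relation is transitive, there is a happens-before edge between writing to the array and reading its content in this particular example.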
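
The chain of edges above can be exercised with a small runnable sketch. The class LastFinisherSketch, the run method, and the completeViews counter are hypothetical names added for the demo. The sketch uses decrementAndGet, which returns the updated value, so exactly one worker observes 0.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: names and structure are assumptions, not the asker's code.
final class LastFinisherSketch {
    // Returns how many workers ran the final step AND saw every slot filled.
    static int run(int workers) {
        Object[] result = new Object[workers];             // plain array
        AtomicInteger state = new AtomicInteger(workers);  // workers still running
        AtomicInteger completeViews = new AtomicInteger(); // demo-only counter

        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int index = i;
            threads[i] = new Thread(() -> {
                result[index] = new Object();              // ordinary write
                // Volatile read + write; only the last finisher gets 0 back.
                if (state.decrementAndGet() == 0) {
                    boolean allSet = true;
                    for (Object o : result) {
                        allSet &= (o != null);
                    }
                    if (allSet) {
                        completeViews.incrementAndGet();
                    }
                }
            });
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return completeViews.get();
    }

    public static void main(String[] args) {
        System.out.println(run(8)); // prints 1
    }
}
```

Every worker's array write precedes its own decrement in program order, and the last decrement reads the counter after all earlier decrements wrote it, so the last finisher is guaranteed to see every element.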

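Personally, I would not try to be too clever, and I would use something less error-prone than a plain array, such as AtomicReferenceArray; then at least you don't need to worry about a missing happens-before edge between the write to the array and the read.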
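
A minimal sketch of that suggestion, assuming String results and the hypothetical names AtomicArraySketch, run, and nonNullSeen:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Illustrative sketch only: all names here are hypothetical.
final class AtomicArraySketch {
    // Returns the number of non-null elements seen by the last finisher.
    static int run(int workers) {
        AtomicReferenceArray<String> result = new AtomicReferenceArray<>(workers);
        AtomicInteger state = new AtomicInteger(workers);
        AtomicInteger nonNullSeen = new AtomicInteger(); // demo-only

        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int index = i;
            threads[i] = new Thread(() -> {
                result.set(index, "worker-" + index); // volatile-write semantics per element
                if (state.decrementAndGet() == 0) {
                    int count = 0;
                    for (int j = 0; j < result.length(); j++) {
                        if (result.get(j) != null) { // volatile-read semantics per element
                            count++;
                        }
                    }
                    nonNullSeen.set(count);
                }
            });
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return nonNullSeen.get();
    }

    public static void main(String[] args) {
        System.out.println(run(8)); // prints 8
    }
}
```

Here each element access carries its own volatile semantics, so the visibility of the elements no longer depends on the ordering of accesses around the counter.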

java

cpu