Difference between memory_order_consume and memory_order_acquire

I have a question about a GCC-Wiki article. Under the heading "Overall Summary", the following code example is given:

Thread 1:

y.store (20);
x.store (10);

Thread 2:

if (x.load() == 10) {
  assert (y.load() == 20);
  y.store (10);
}

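It is said that if all stores are release and all loads are acquire, the assert in thread 2 cannot fail. This is clear to me (because the store to x in thread 1 synchronizes with the load from x in thread 2).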
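As a minimal sketch of the release/acquire case, the guarantee might be exercised as follows. The function name observe_y_with_acquire, the zero initial values and the spin-wait loop (used in place of the if, so that the check always runs) are assumptions added for illustration and are not part of the wiki example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the value thread 2 observes in y
// after it has observed x == 10.
int observe_y_with_acquire() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int seen_y = -1;

    std::thread t1([&] {
        y.store(20, std::memory_order_release);
        x.store(10, std::memory_order_release);
    });

    std::thread t2([&] {
        // Wait until the release store to x has been observed.
        while (x.load(std::memory_order_acquire) != 10) {
            std::this_thread::yield();
        }
        // The acquire load of x synchronizes with the release store to x,
        // so the earlier store of 20 to y happens before this load.
        seen_y = y.load(std::memory_order_acquire);
    });

    t1.join();
    t2.join();
    return seen_y;  // read after join, so there is no race on seen_y
}
```

Calling observe_y_with_acquire() should always yield 20 under these assumptions, which corresponds to the assert in the wiki example never failing.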

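But now comes the part I don't understand. It is also said that if all stores are release and all loads are consume, the result is the same. Couldn't the load from y be hoisted before the load from x (because there is no dependency between these variables)? That would mean that the assert in thread 2 could actually fail.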

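Both establish a transitive "visibility" order on atomic stores, unless they have been issued with memory_order_relaxed. If a thread reads an atomic object x with one of these modes, it can be sure that it sees all modifications to all atomic objects y that were known to be done before the write to x.

The difference between "acquire" and "consume" is in the visibility of non-atomic writes to some variable z. For acquire, all writes, atomic or not, are visible. For consume, only the atomic ones are guaranteed to be visible.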

thread 1                               thread 2
z = 5 ... store(&x, 3, release) ...... load(&x, acquire) ... z == 5 // we know that z is written
z = 5 ... store(&x, 3, release) ...... load(&x, consume) ... z == ? // we may not have last value of z
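The acquire row of the diagram above can be sketched in C++ roughly as follows. The function name read_z_after_acquire and the spin-wait are assumptions made for illustration; z is deliberately a plain, non-atomic int.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the value of the non-atomic z that the
// reader sees after an acquire load of x has observed the value 3.
int read_z_after_acquire() {
    int z = 0;              // plain, non-atomic data
    std::atomic<int> x{0};  // flag used for synchronization
    int seen_z = -1;

    std::thread writer([&] {
        z = 5;                                  // non-atomic write
        x.store(3, std::memory_order_release);  // publish
    });

    std::thread reader([&] {
        while (x.load(std::memory_order_acquire) != 3) {
            std::this_thread::yield();
        }
        // The release store synchronizes with the acquire load that read 3,
        // so the write z = 5 happens before this read and there is no race.
        seen_z = z;
    });

    writer.join();
    reader.join();
    return seen_z;
}
```

Under these assumptions the reader should always see z == 5, matching the first row of the diagram.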

The C11 Standard's ruling is as follows.

5.1.2.4 Multi-threaded executions and data races

  15. An evaluation A is dependency-ordered before an evaluation B if:

    — A performs a release operation on an atomic object M, and, in another thread, B performs a consume operation on M and reads a value written by any side effect in the release sequence headed by A, or

    — for some evaluation X, A is dependency-ordered before X and X carries a dependency to B.

  16. An evaluation A inter-thread happens before an evaluation B if A synchronizes with B, A is dependency-ordered before B, or, for some evaluation X:

    — A synchronizes with X and X is sequenced before B,

    — A is sequenced before X and X inter-thread happens before B, or

    — A inter-thread happens before X and X inter-thread happens before B.

  17. NOTE 7 The "inter-thread happens before" relation describes arbitrary concatenations of "sequenced before", "synchronizes with", and "dependency-ordered before" relationships, with two exceptions. The first exception is that a concatenation is not permitted to end with "dependency-ordered before" followed by "sequenced before". The reason for this limitation is that a consume operation participating in a "dependency-ordered before" relationship provides ordering only with respect to operations to which this consume operation actually carries a dependency. The reason that this limitation applies only to the end of such a concatenation is that any subsequent release operation will provide the required ordering for a prior consume operation. The second exception is that a concatenation is not permitted to consist entirely of "sequenced before". The reasons for this limitation are (1) to permit "inter-thread happens before" to be transitively closed and (2) the "happens before" relation, defined below, provides for relationships consisting entirely of "sequenced before".

  18. An evaluation A happens before an evaluation B if A is sequenced before B or A inter-thread happens before B.

  19. A visible side effect A on an object M with respect to a value computation B of M satisfies the conditions:

    — A happens before B, and

    — there is no other side effect X to M such that A happens before X and X happens before B.

    The value of a non-atomic scalar object M, as determined by evaluation B, shall be the value stored by the visible side effect A.

(Emphasis added)


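In the commentary below, I'll abbreviate as follows:

  • Dependency-ordered before: DOB
  • Inter-thread happens before: ITHB
  • Happens before: HB
  • Sequenced before: SeqB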

Let us review how this applies. We have 4 relevant memory operations, which we will name Evaluations A, B, C and D:

Thread 1:

y.store (20);             //    Release; Evaluation A
x.store (10);             //    Release; Evaluation B

Thread 2:

if (x.load() == 10) {     //    Consume; Evaluation C
  assert (y.load() == 20); //   Consume; Evaluation D
  y.store (10);
}

To prove that the assert never trips, we in effect seek to prove that A is always a visible side effect at D. In accordance with 5.1.2.4 (15), we have:

A SeqB B DOB C SeqB D

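which is a concatenation ending in DOB followed by SeqB. This is explicitly ruled by (17) to not be an ITHB concatenation, despite what (16) says.

We know that since A and D are not in the same thread of execution, A is not SeqB D; hence neither of the two conditions in (18) for HB is satisfied, and A does not HB D.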

It follows that A is not a visible side effect with respect to D, since one of the conditions of (19) is not met. The assert may fail.
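The chain argument above can be sketched as a small checker for the two exceptions in NOTE 7. This is only an illustration under simplifying assumptions: the relations are treated as opaque labels, carried dependencies are not modelled, and the names Rel, SW (for "synchronizes with") and is_ithb_chain are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Labels for the links of a chain: sequenced before, synchronizes with,
// dependency-ordered before.
enum class Rel { SeqB, SW, DOB };

// Returns true if the chain, read left to right, is a permitted
// "inter-thread happens before" concatenation according to NOTE 7.
bool is_ithb_chain(const std::vector<Rel>& chain) {
    if (chain.empty()) return false;

    // Second exception: a chain made up entirely of SeqB is not ITHB.
    bool all_seqb = true;
    for (Rel r : chain) {
        if (r != Rel::SeqB) all_seqb = false;
    }
    if (all_seqb) return false;

    // First exception: the chain must not end with DOB followed by SeqB.
    // SeqB is transitive, so a run of trailing SeqB links counts as one.
    std::size_t end = chain.size();
    while (end > 0 && chain[end - 1] == Rel::SeqB) --end;
    const bool has_trailing_seqb = end < chain.size();
    if (has_trailing_seqb && chain[end - 1] == Rel::DOB) return false;

    return true;
}
```

With this sketch, the consume chain {SeqB, DOB, SeqB} (A SeqB B DOB C SeqB D) is rejected, whereas the acquire chain {SeqB, SW, SeqB} is accepted, which mirrors the conclusion that A does not happen before D when the loads are consume.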


How this could play out is then described here, in the C++ standard's memory model discussion, and here, Section 4.2 Control Dependencies:

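  1. (Some time ahead) Thread 2's branch predictor guesses that the if will be taken.
  2. Thread 2 approaches the predicted-taken branch and begins speculative fetching.
  3. Thread 2 out-of-order and speculatively loads 0xGUNK from y (Evaluation D). (Maybe it was not yet evicted from cache?)
  4. Thread 1 stores 20 into y (Evaluation A).
  5. Thread 1 stores 10 into x (Evaluation B).
  6. Thread 2 loads 10 from x (Evaluation C).
  7. Thread 2 confirms that the if is taken.
  8. Thread 2's speculative load of y == 0xGUNK is committed.
  9. Thread 2 fails the assert.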
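The nine steps above can be replayed as a toy, single-threaded model. It is not a real concurrent test and will not reproduce hardware speculation; it only makes the order of events explicit. The names kStale (a stand-in for 0xGUNK), ReplayResult and replay_speculative_load are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for the stale value called 0xGUNK in the steps.
constexpr int kStale = 0xBAD;

struct ReplayResult {
    int x_seen;         // value thread 2 loads from x (Evaluation C)
    int y_seen;         // value thread 2 ends up using for y (Evaluation D)
    bool assert_holds;  // whether y_seen == 20
};

// Replays the interleaving step by step using plain variables.
ReplayResult replay_speculative_load() {
    int x = 0;
    int y = kStale;

    // Steps 1-3: thread 2 predicts the branch as taken and loads y early.
    const int speculated_y = y;

    y = 20;  // Step 4: thread 1, Evaluation A
    x = 10;  // Step 5: thread 1, Evaluation B

    const int x_seen = x;                     // Step 6: Evaluation C
    const bool branch_taken = (x_seen == 10); // Step 7: prediction confirmed

    // Step 8: the speculated value is committed instead of reloading y.
    const int y_seen = branch_taken ? speculated_y : -1;

    // Step 9: the assert compares the stale value against 20.
    return {x_seen, y_seen, y_seen == 20};
}
```

In this model thread 2 sees x == 10 but still uses the stale value of y, so assert_holds is false.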

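The reason Evaluation D may be reordered before C is that a consume does not forbid it. This is unlike an acquire load, which prevents any load or store after it in program order from being reordered before it. Again, 5.1.2.4 (17) states that a consume operation participating in a "dependency-ordered before" relationship provides ordering only with respect to operations to which this consume operation actually carries a dependency, and there is most definitely no dependency between the two loads.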
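For contrast, here is a minimal sketch of a case in which the consume load does carry a dependency to the later read: the second access goes through the pointer returned by the consume load. The names Payload, published and read_through_consumed_pointer are hypothetical. Mainstream compilers such as GCC and Clang are generally understood to implement memory_order_consume as memory_order_acquire, so the difference is mostly visible at the level of the standard's rules.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Payload {
    int value;
};

// Hypothetical helper: publishes a pointer with release and reads the
// pointed-to data through a consume load of that pointer.
int read_through_consumed_pointer() {
    Payload payload{0};
    std::atomic<Payload*> published{nullptr};
    int seen = -1;

    std::thread producer([&] {
        payload.value = 42;  // plain write before publication
        published.store(&payload, std::memory_order_release);
    });

    std::thread consumer([&] {
        Payload* p = nullptr;
        while ((p = published.load(std::memory_order_consume)) == nullptr) {
            std::this_thread::yield();
        }
        // The address of p->value is computed from the loaded pointer, so
        // the consume load carries a dependency to this read.
        seen = p->value;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Here the read of p->value is dependency-ordered after the release store, so it should observe 42. In the question's example the load of y does not depend on the value loaded from x, which is why no such ordering applies there.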


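CppMem verification

CppMem is a tool that helps explore shared data access scenarios under the C11 and C++11 memory models.

For the following code, which approximates the question's scenario: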

int main() {
  atomic_int x, y;
  y.store(30, mo_seq_cst);
  {{{  { y.store(20, mo_release);
         x.store(10, mo_release); }
  ||| { r3 = x.load(mo_consume).readsvalue(10);
        r4 = y.load(mo_consume); }
  }}};
  return 0; }

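The tool reports two consistent, race-free scenarios, namely:

One in which y=20 is successfully read,

and one in which the "stale" initialization value y=30 is read.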

By contrast, when mo_acquire is used for the loads, CppMem reports only one consistent, race-free scenario, namely the correct one:

One in which y=20 is read.