Memory Barriers: a Hardware View for Software Hackers Example 3

I am quoting the text for Table 4 from the original paper, Memory Barriers: a Hardware View for Software Hackers:

Table 4 shows three code fragments, executed concurrently by CPUs 0, 1, and 2. All variables are initially zero.

Note that neither CPU 1 nor CPU 2 can proceed to line 5 until they see CPU 0’s assignment to “b” on line 3. Once CPU 1 and 2 have executed their memory barriers on line 4, they are both guaranteed to see all assignments by CPU 0 preceding its memory barrier on line 2. Similarly, CPU 0’s memory barrier on line 8 pairs with those of CPUs 1 and 2 on line 4, so that CPU 0 will not execute the assignment to “e” on line 9 until after its assignment to “a” is visible to both of the other CPUs. Therefore, CPU 2’s assertion on line 9 is guaranteed not to fire.
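Table 4 itself is not reproduced in this post, so the following is only a sketch of the three fragments, reconstructed from the line references in the quoted text. Several things here are assumptions: the handshake through "c" and "d" on lines 5 to 7 is inferred rather than quoted, C11 fences stand in for the kernel's smp_wmb() and smp_mb(), and the names cpu0, cpu1, cpu2, and run_example3 are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Shared variables; all start at zero, as in the quoted text. */
static atomic_int a, b, c, d, e;
static int cpu2_ok;  /* CPU 2's check result, read only after pthread_join() */

static void *cpu0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&a, 1, memory_order_relaxed);          /* line 1 */
    atomic_thread_fence(memory_order_release);  /* line 2: assumed stand-in for smp_wmb() */
    atomic_store_explicit(&b, 1, memory_order_relaxed);          /* line 3 */
    while (atomic_load_explicit(&c, memory_order_relaxed) == 0)  /* line 6 (assumed) */
        ;
    while (atomic_load_explicit(&d, memory_order_relaxed) == 0)  /* line 7 (assumed) */
        ;
    atomic_thread_fence(memory_order_seq_cst);  /* line 8: assumed stand-in for smp_mb() */
    atomic_store_explicit(&e, 1, memory_order_relaxed);          /* line 9 */
    return NULL;
}

static void *cpu1(void *arg)
{
    (void)arg;
    while (atomic_load_explicit(&b, memory_order_relaxed) == 0)  /* line 3 */
        ;
    atomic_thread_fence(memory_order_seq_cst);                   /* line 4 */
    atomic_store_explicit(&c, 1, memory_order_relaxed);          /* line 5 (assumed) */
    return NULL;
}

static void *cpu2(void *arg)
{
    (void)arg;
    while (atomic_load_explicit(&b, memory_order_relaxed) == 0)  /* line 3 */
        ;
    atomic_thread_fence(memory_order_seq_cst);                   /* line 4 */
    atomic_store_explicit(&d, 1, memory_order_relaxed);          /* line 5 (assumed) */
    /* line 9: evaluate the asserted condition, reading e before a */
    int ev = atomic_load_explicit(&e, memory_order_relaxed);
    int av = atomic_load_explicit(&a, memory_order_relaxed);
    cpu2_ok = (ev == 0 || av == 1);
    return NULL;
}

/* Runs the three fragments once. Returns 1 if CPU 2's condition held,
 * 0 if it did not, and -1 if a thread could not be created (this
 * sketch does no cleanup in that case). */
int run_example3(void)
{
    pthread_t t[3];
    void *(*fn[3])(void *) = { cpu0, cpu1, cpu2 };

    atomic_store(&a, 0);
    atomic_store(&b, 0);
    atomic_store(&c, 0);
    atomic_store(&d, 0);
    atomic_store(&e, 0);
    cpu2_ok = 0;

    for (int i = 0; i < 3; i++)
        if (pthread_create(&t[i], NULL, fn[i], NULL) != 0)
            return -1;
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return cpu2_ok;
}
```

Calling run_example3() in a loop from main() and checking that it always returns 1 exercises the claim, although a passing run on one machine shows nothing about weaker hardware. Under the C11 model the release fence on line 2 pairs with the fence CPU 2 executes after reading b, which is the pairing the quoted paragraph describes.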

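To me, lines 6-9 on CPU 0 seem entirely unnecessary. The memory barrier on line 2 of CPU 0, together with the memory barriers on line 4 of CPUs 1 and 2, already guarantees that once b=1 has been observed, all earlier stores (that is, a=1) are visible as well. The assertion e == 0 || a == 1 would then always succeed.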

I don't know whether I am overlooking something. Any clarification is appreciated.

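Leaving out lines 6-9 in CPU 0 would certainly prevent the assert() from firing. Then again, given that e is initialized to zero, so would removing all code other than the assertion. However, neither modification is helpful. Instead, the key point of the assertion is the question "Is it possible for CPU 2 to see the state e==1&&a==0 at the end of its execution?" Looking at it this way should force you to think about which values propagate where, and in what order.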
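Because e and a only ever hold 0 or 1 in this example, the assertion fails exactly when CPU 2 observes e==1&&a==0. A small check of that equivalence, with hypothetical helper names, might look like this:

```c
#include <stdbool.h>

/* The condition CPU 2 asserts in the quoted text. */
static bool assertion_holds(int e, int a)
{
    return e == 0 || a == 1;
}

/* The state the answer asks about. */
static bool forbidden_state(int e, int a)
{
    return e == 1 && a == 0;
}

/* Counts (e, a) pairs over {0, 1} where the assertion and the
 * forbidden state fail to be exact opposites of each other. */
int count_disagreements(void)
{
    int n = 0;

    for (int e = 0; e <= 1; e++)
        for (int a = 0; a <= 1; a++)
            if (assertion_holds(e, a) == forbidden_state(e, a))
                n++;
    return n;
}
```

The count comes out to zero, so asking whether the assertion can fire is the same as asking whether that one state is observable by CPU 2.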

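But the main thing you are missing is that this paper is quite old, and there has been a great deal of progress in understanding and formalizing memory ordering since then. I am adding a new memory-ordering chapter to Is Parallel Programming Hard, And, If So, What Can You Do About It? In the meantime, these two LWN articles here and here might be helpful.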

Alternatively, if you would like to see the current state of the book: git clone git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git