Does MemoryBarrier really ensure refresh values?

Question

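Albahari, in his wonderful book C# in a Nutshell (free chapters are available online), talks about how memory barriers allow us to get "refreshed" values. His example is: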

    static void Main()
    {
        bool complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            while (!complete)
            {
                toggle = !toggle;
            }
        });
        t.Start();
        Thread.Sleep(1000);
        complete = true;
        t.Join();        // Blocks indefinitely
    }

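If you build this in Release mode, it blocks indefinitely, just as he says. He offers a few solutions to the problem: call Thread.MemoryBarrier inside the while loop, use lock, or make "complete" a volatile static field.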
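For reference, the first of those fixes might look roughly like the sketch below. The class name `MemoryBarrierDemo`, the `Run` helper, and the timed `Join` are assumptions added so the snippet can report whether the worker exited; they are not part of the book's example.

```csharp
using System;
using System.Threading;

static class MemoryBarrierDemo
{
    // Hypothetical helper: returns true if the worker saw the write
    // and left the loop before the timeout expired.
    public static bool Run()
    {
        bool complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            while (!complete)
            {
                toggle = !toggle;
                // Full fence: the read of 'complete' may not be moved
                // across it, so it cannot be hoisted out of the loop.
                Thread.MemoryBarrier();
            }
        });
        t.Start();
        Thread.Sleep(100);
        complete = true;
        // Timed Join so a failure shows up as 'false' instead of a hang.
        return t.Join(TimeSpan.FromSeconds(5));
    }

    static void Main() => Console.WriteLine(Run());
}
```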

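I would agree with the volatile field solution, as volatile forces the JIT to read directly from memory rather than from a register. However, I believe this optimization has nothing to do with fences and memory barriers. It is purely a matter of JIT optimization, that is, whether the JIT prefers to read the value from memory or from a register. In fact, instead of using MemoryBarrier, any method call "convinces" the JIT not to use a register at all, for example: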

    using System.Runtime.CompilerServices;
    using System.Threading;

    class Program
    {
        // NoInlining keeps this a real call inside the loop.
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static bool Toggle(bool toggle)
        {
            return !toggle;
        }
        static void Main()
        {
            bool complete = false;
            var t = new Thread(() =>
            {
                bool toggle = false;
                while (!complete)
                {
                    toggle = Toggle(toggle);
                }
            });
            t.Start();
            Thread.Sleep(1000);
            complete = true;
            t.Join();        // No longer blocks in my tests
        }
    }

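Here I make a dummy Toggle call. From the generated assembly I can clearly see that the JIT uses a direct memory access to read the "complete" local variable. So my assumption is that, at least on Intel CPUs and taking compiler optimizations into account, MemoryBarrier plays no role in terms of "freshness". MemoryBarrier just provides a full fence to preserve ordering, and that is all. Am I right to think so?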

Answer 1

I would agree with the volatile field solution as volatile enforces a direct memory read rather than a register read for JIT. However I believe this optimization has nothing to do with fences and memory barriers.

Volatile reads and writes are described in ECMA-335, §I.12.6.7. The important parts of that section:

A volatile read has “acquire semantics” meaning that the read is guaranteed to occur prior to any references to memory that occur after the read instruction in the CIL instruction sequence. A volatile write has “release semantics” meaning that the write is guaranteed to happen after any memory references prior to the write instruction in the CIL instruction sequence.

A conforming implementation of the CLI shall guarantee this semantics of volatile operations.

and

An optimizing compiler that converts CIL to native code shall not remove any volatile operation, nor shall it coalesce multiple volatile operations into a single operation.

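On the x86 and x86-64 architectures, acquire and release semantics do not require any memory barriers (because the hardware memory model is no weaker than what volatile semantics require). On the ARM architecture, however, the JIT must emit half fences (one-way memory barriers).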

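So in the example that uses volatile, everything works because of the restrictions on optimization. With MemoryBarrier it works because the compiler cannot optimize the reads of that variable into a single read outside the loop, since the read cannot move across the MemoryBarrier.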
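A minimal sketch of the volatile variant, for illustration only. Since a local variable cannot be declared volatile, the flag is moved to a static field; the names `VolatileFlagDemo`, `_complete`, and `Run` are hypothetical.

```csharp
using System;
using System.Threading;

static class VolatileFlagDemo
{
    // Every read of a volatile field is a separate volatile read;
    // the JIT may not remove or coalesce them (ECMA-335, I.12.6.7).
    private static volatile bool _complete;

    // Hypothetical helper: returns true if the worker exited in time.
    public static bool Run()
    {
        _complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            while (!_complete)   // volatile read on each iteration
            {
                toggle = !toggle;
            }
        });
        t.Start();
        Thread.Sleep(100);
        _complete = true;        // volatile write (release semantics)
        return t.Join(TimeSpan.FromSeconds(5));
    }

    static void Main() => Console.WriteLine(Run());
}
```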

But the code

    while (!complete)
    {
        toggle = Toggle(toggle);
    }

can be optimized into something like this:

    var tmp = complete;
    while (!tmp)
    {
        toggle = Toggle(toggle);
    }

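The only reason this does not happen in the method-call case is that the optimization is, for whatever reason, not applied (although it could be). Such code is therefore fragile and implementation-specific: it relies not on the standard but on implementation details that may change.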
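If the flag has to stay a captured local, one option that does not depend on the JIT's treatment of method calls is `System.Threading.Volatile` (available since .NET Framework 4.5), whose `Read` and `Write` methods are documented to have acquire and release semantics. This is a different technique from the ones discussed above, shown only as a sketch; `VolatileReadDemo` and `Run` are hypothetical names.

```csharp
using System;
using System.Threading;

static class VolatileReadDemo
{
    // Hypothetical helper: returns true if the worker exited in time.
    public static bool Run()
    {
        bool complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            // Explicit volatile read on each iteration, so the loop
            // condition cannot be replaced by a single cached read.
            while (!Volatile.Read(ref complete))
            {
                toggle = !toggle;
            }
        });
        t.Start();
        Thread.Sleep(100);
        Volatile.Write(ref complete, true);  // release write
        return t.Join(TimeSpan.FromSeconds(5));
    }

    static void Main() => Console.WriteLine(Run());
}
```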

c#

multithreading

volatile

cpu-cache