Can a read instruction after an unrelated lock statement be moved before the lock?

This question is a follow-up to comments in this thread.

Suppose we have the following code:

// (1)
lock (padlock)
{
    // (2)
}
var value = nonVolatileField; // (3)

In addition, assume that the instructions in (2) have no effect on nonVolatileField, and vice versa.

Can the read instruction (3) be reordered so that it ends up before the lock statement (1) or inside it (2)?

As far as I can tell, nothing in the C# specification (§3.10) or the CLI specification (§I.12.6.5) prohibits such a reordering.

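Note that this is different from this question. Here I am asking specifically about read instructions, because as far as I understand they are not considered side effects and carry weaker guarantees.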

I believe this is partially guaranteed by the CLI specification, although it is not as clear as it could be. From I.12.6.5:

Acquiring a lock (System.Threading.Monitor.Enter or entering a synchronized method) shall implicitly perform a volatile read operation, and releasing a lock (System.Threading.Monitor.Exit or leaving a synchronized method) shall implicitly perform a volatile write operation. See §I.12.6.7.

Then from I.12.6.7:

A volatile read has “acquire semantics” meaning that the read is guaranteed to occur prior to any references to memory that occur after the read instruction in the CIL instruction sequence. A volatile write has “release semantics” meaning that the write is guaranteed to happen after any memory references prior to the write instruction in the CIL instruction sequence.

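So entering the lock should prevent (3) from moving to (1). Reading nonVolatileField still counts as a "reference to memory", I believe. However, the read can still be performed before the volatile write that happens when the lock is released, so it can still move into (2).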
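To make that reasoning concrete, here is a minimal sketch of the question's snippet with the lock statement written out as the Monitor.Enter / Monitor.Exit pair the C# compiler generates for it. The class name ReorderingSketch and the initial value 42 are assumptions added only to make the sample compile; the comments restate the reading of I.12.6.5 and I.12.6.7 above and are not a claim about what any particular JIT actually does.

```csharp
using System;
using System.Threading;

class ReorderingSketch
{
    static readonly object padlock = new object();
    static int nonVolatileField = 42; // hypothetical initial value

    static void Main()
    {
        bool lockTaken = false;
        try
        {
            // (1) Acquire: implicit volatile read. Memory references that
            // come later in program order, including the read at (3),
            // may not be moved above this point.
            Monitor.Enter(padlock, ref lockTaken);

            // (2) body of the lock, unrelated to nonVolatileField
        }
        finally
        {
            // Release: implicit volatile write. Earlier memory references
            // may not be moved below this point, but the quoted text does
            // not stop the later read at (3) from moving above it, i.e.
            // into (2).
            if (lockTaken) Monitor.Exit(padlock);
        }

        var value = nonVolatileField; // (3)
        Console.WriteLine(value);
    }
}
```

Run on a single thread, the program cannot show any reordering; it only marks where the two one-way barriers sit relative to (3).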

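The C#/CLI memory model currently leaves a lot to be desired. I hope the whole thing can be clarified significantly (and probably tightened up, to make some "theoretically valid but practically awful" optimizations invalid).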

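As far as .NET is concerned, entering a monitor (the lock statement) has acquire semantics, because it implicitly performs a volatile read, and exiting a monitor (the end of the lock block) has release semantics, because it implicitly performs a volatile write (see §12.6.5 Locks and Threads in Common Language Infrastructure (CLI) Partition I).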

volatile bool areWeThereYet = false;

// In thread 1
// Accesses, usually writes: create objects, initialize them
areWeThereYet = true;

// In thread 2
if (areWeThereYet)
{
    // Accesses, usually reads: use created and initialized objects
}

When you write a value to areWeThereYet, all accesses before it have already been performed, and they are not reordered to after the volatile write.

When you read from areWeThereYet, subsequent accesses are not reordered to before the volatile read.

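In this case, when thread 2 observes that areWeThereYet has changed, it is guaranteed that the accesses that follow (usually reads) will observe the other thread's accesses (usually writes), assuming no other code interferes with the affected variables.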
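The two-thread outline above can be turned into a small runnable sketch. The class name Publication, the payload field and the value 42 are assumptions made for illustration; only the areWeThereYet flag comes from the outline.

```csharp
using System;
using System.Threading;

class Publication
{
    static int payload;                  // ordinary (non-volatile) field
    static volatile bool areWeThereYet;  // flag from the outline above

    static void Main()
    {
        // Thread 2: wait for the flag, then use the published data.
        var reader = new Thread(() =>
        {
            // Volatile read (acquire): the read of payload below
            // is not moved before it.
            while (!areWeThereYet)
            {
                Thread.Yield();
            }
            Console.WriteLine(payload);
        });
        reader.Start();

        // Thread 1: initialize the data, then publish it.
        payload = 42;
        // Volatile write (release): the write to payload above
        // is not moved after it.
        areWeThereYet = true;

        reader.Join();
    }
}
```

Under the acquire/release semantics described above, once the reader sees the flag set it also sees the write to payload. If the flag were an ordinary field, neither the ordering nor the termination of the loop would be covered by those guarantees.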

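As for the other synchronization primitives in .NET, such as SemaphoreSlim, this is not explicitly documented, but they would be rather useless if they did not have similar semantics. Programs based on them could not even work correctly on platforms or hardware architectures with a weaker memory model.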
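As an illustration of the kind of code that relies on this, here is a minimal hand-off through a SemaphoreSlim. The class name SemaphoreHandoff and the payload field are hypothetical, and the ordering noted in the comments is the assumption argued for above, not something this sketch proves or that the documentation spells out.

```csharp
using System;
using System.Threading;

class SemaphoreHandoff
{
    static readonly SemaphoreSlim ready = new SemaphoreSlim(0);
    static int payload; // ordinary (non-volatile) field

    static void Main()
    {
        var consumer = new Thread(() =>
        {
            // Assumed to behave like an acquire: the read of payload
            // should not be observed as happening before Wait returns.
            ready.Wait();
            Console.WriteLine(payload);
        });
        consumer.Start();

        payload = 42;
        // Assumed to behave like a release: the write to payload
        // should be visible to whoever is woken by this call.
        ready.Release();

        consumer.Join();
    }
}
```

If Release and Wait gave no such ordering, the consumer could in principle read a stale payload, which is the sense in which the primitive would be "rather useless".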


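Many people hold that Microsoft should enforce a strong memory model on such architectures, similar to that of x86/amd64, to keep existing code bases (Microsoft's own and those of its customers) compatible.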

I cannot verify this myself, since I do not have an ARM device running Microsoft Windows, let alone .NET Framework for ARM, but at least one MSDN Magazine article, "CLR - .NET Development for ARM Processors" by Andrew Pardoe, states:

The CLR is allowed to expose a stronger memory model than the ECMA CLI specification requires. On x86, for example, the memory model of the CLR is strong because the processor’s memory model is strong. The .NET team could’ve made the memory model on ARM as strong as the model on x86, but ensuring the perfect ordering whenever possible can have a notable impact on code execution performance. We’ve done targeted work to strengthen the memory model on ARM—specifically, we’ve inserted memory barriers at key points when writing to the managed heap to guarantee type safety—but we’ve made sure to only do this with a minimal impact on performance. The team went through multiple design reviews with experts to make sure that the techniques applied in the ARM CLR were correct. Moreover, performance benchmarks show that .NET code execution performance scales the same as native C++ code when compared across x86, x64 and ARM.