Is the order of writes to separate members of a volatile struct guaranteed to be preserved?
Suppose I have a struct like this:
volatile struct { int foo; int bar; } data;
data.foo = 1;
data.bar = 2;
data.foo = 3;
data.bar = 4;
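Are all of the assignments guaranteed not to be reordered?
For example, without volatile, the compiler would clearly be allowed to optimize this down to two instructions in a different order, like this: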
data.bar = 4;
data.foo = 3;
But with volatile, is the compiler required not to do something like this?
data.foo = 1;
data.foo = 3;
data.bar = 2;
data.bar = 4;
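(That is, treating the members as separate, unrelated volatile entities and reordering them. I can imagine it doing so to improve locality of reference if foo and bar straddle a page boundary, for example.)
Also, is the answer consistent for the current versions of both the C and C++ standards?
They will not be reordered.
C17 6.5.2.3(3) says: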
A postfix expression followed by the . operator and an identifier designates a member of a structure
or union object. The value is that of the named member, 97) and is an lvalue if the first expression is
an lvalue. If the first expression has qualified type, the result has the so-qualified version of the type
of the designated member.
Since data has volatile-qualified type, so do data.bar and data.foo. Thus each statement is an assignment to a volatile int object. And by 6.7.3 footnote 136,
Actions on objects so declared [as volatile] shall not be “optimized out” by an implementation or reordered except as permitted by the rules for evaluating expressions.
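A more subtle question is whether the compiler could assign both members with a single instruction. For example, if they are contiguous 32-bit values, could it use one 64-bit store to set both? I would think not, and at least GCC and Clang don't attempt to.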
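The qualifier propagation described in 6.5.2.3(3) can be checked at compile time. The following is a minimal sketch, assuming a C11 compiler; the macro POINTS_TO_VOLATILE_INT and the function write_sequence are hypothetical names introduced only for this illustration. It inspects the pointer type of each member rather than the member itself, because a generic selection drops qualifiers from its controlling expression.

```c
#include <assert.h>

/* Same declaration as in the question. */
volatile struct { int foo; int bar; } data;

/* Hypothetical helper: evaluates to 1 if p has type `volatile int *`.
   The controlling expression of _Generic undergoes lvalue conversion,
   which drops top-level qualifiers, so we test the pointer type instead. */
#define POINTS_TO_VOLATILE_INT(p) \
    _Generic((p), volatile int *: 1, default: 0)

/* Each member inherits the volatile qualifier from the struct object. */
static_assert(POINTS_TO_VOLATILE_INT(&data.foo), "data.foo should be volatile int");
static_assert(POINTS_TO_VOLATILE_INT(&data.bar), "data.bar should be volatile int");

/* Hypothetical helper: performs the four stores from the question.
   Each statement is a separate volatile access; the function then reads
   both members back and packs them into one value for easy checking. */
int write_sequence(void)
{
    data.foo = 1;
    data.bar = 2;
    data.foo = 3;
    data.bar = 4;
    return data.foo * 10 + data.bar;
}
```

This only confirms the types involved. Whether a given compiler actually emits four separate stores is something to verify by inspecting the generated assembly.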
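If you want to use this in multiple threads, there is one significant gotcha.
While the compiler will not reorder the writes to volatile variables (as described in the answer above), there is one more place where write reordering can occur, and that is the CPU itself. This depends on the CPU architecture; a few examples follow:
Intel 64
See the Intel® 64 Architecture Memory Ordering White Paper.
While the store instructions themselves are not reordered (2.2):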
- Stores are not reordered with other stores.
they may become visible to different CPUs in a different order (2.4):
Intel 64 memory ordering allows stores by two processors to be seen in different orders by those two processors
AMD 64
AMD 64 (which is the common x64) has similar behaviour in the specification:
Generally, out-of-order writes are not allowed. Write instructions executed out of order cannot commit (write) their result to memory until all previous instructions have completed in program order. The processor can, however, hold the result of an out-of-order write instruction in a private buffer (not visible to software) until that result can be committed to memory.
PowerPC
I remember having to be careful about this on the Xbox 360, which used a PowerPC CPU:
While the Xbox 360 CPU does not reorder instructions, it does rearrange write operations, which complete after the instructions themselves. This rearranging of writes is specifically allowed by the PowerPC memory model
To avoid CPU reordering in a portable way, you need to use memory fences like C++11 std::atomic_thread_fence or C11 atomic_thread_fence. Without them, the order of the writes as seen from another thread may differ.
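As a rough illustration of the fence approach, here is a minimal sketch using C11 atomic_thread_fence. The flag ready and the functions publish and try_consume are hypothetical names that do not appear in the original; the sketch assumes one writer thread and one reader thread.

```c
#include <stdatomic.h>

/* Same declaration as in the question. */
volatile struct { int foo; int bar; } data;

/* Hypothetical flag (not in the original): set once data has been published. */
static atomic_int ready;

/* Writer side: store both members, then publish them. */
void publish(int foo, int bar)
{
    data.foo = foo;
    data.bar = bar;
    /* Release fence: a thread that observes the flag store below and then
       executes an acquire fence is guaranteed to see the writes above. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side: returns 1 and fills the outputs if data has been published,
   otherwise returns 0 and leaves the outputs untouched. */
int try_consume(int *foo, int *bar)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;
    /* Acquire fence: pairs with the release fence in publish(). */
    atomic_thread_fence(memory_order_acquire);
    *foo = data.foo;
    *bar = data.bar;
    return 1;
}
```

In use, one thread would call publish(3, 4) while another polls try_consume until it returns 1. The ordering guarantee between the threads comes from the pair of fences and the atomic flag, not from volatile.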
See also C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming?
This is also noted in the Wikipedia Memory barrier article:
Moreover, it is not guaranteed that volatile reads and writes will be seen in the same order by other processors or cores due to caching, cache coherence protocol and relaxed memory ordering, meaning volatile variables alone may not even work as inter-thread flags or mutexes.