Re-ordering Atomic Reads
I am working on a multithreaded algorithm that reads two shared atomic variables:
std::atomic<int> a(10);
std::atomic<int> b(20);
void func(int key) {
    int b_local = b;
    int a_local = a;
    /* Some operations on a & b */
}
The invariant of the algorithm is that b is read before a.
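The question is: can the compiler (say GCC) reorder the instructions so that a is read before b? An explicit memory fence would enforce the order, but what I want to understand is whether two atomic loads can be reordered at all.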
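Further, after going through the Acquire/Release semantics in Herb Sutter's talk (http://herbsutter.com/2013/02/11/atomic-weapons-the-c-memory-model-and-modern-hardware/), I understand that a sequentially consistent system ensures an ordering between an acquire (like a load) and a release (like a store). What about the ordering between two acquires (like two loads)?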
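The explicit-fence approach mentioned in the question can be sketched as follows. This is only an illustration: the function name read_b_then_a is hypothetical, and the relaxed loads are chosen so that the fence alone carries the ordering. In practice an acquire fence keeps loads that follow it from being moved ahead of loads that precede it; the standard itself states the guarantee through synchronizes-with relations rather than in terms of reordering.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

std::atomic<int> a(10);
std::atomic<int> b(20);

// Hypothetical helper: reads b first, then a.
// The acquire fence sits between the two relaxed loads, so the load of a
// is not moved ahead of the load of b.
std::pair<int, int> read_b_then_a() {
    int b_local = b.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    int a_local = a.load(std::memory_order_relaxed);
    return {b_local, a_local};
}
```

Calling read_b_then_a() with no concurrent writer simply returns the initial values of b and a.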
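Edit: adding more information about the code.
Consider two threads T1 and T2 executing:
T1: reads the value of b, sleeps
T2: changes the value of a, returns
T1: wakes up and reads the new value of a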
Now, consider the reordered scenario:
int a_local =a;
int b_local = b;
T1: reads the value of a, sleeps
T2: changes the value of a, returns
T1: knows nothing about the change in the value of a
The question is: "Can a compiler like GCC reorder two atomic loads?"
When the assignment is made, __atomic_base does the following:
operator __pointer_type() const noexcept
{ return load(); }
_GLIBCXX_ALWAYS_INLINE __pointer_type
load(memory_order __m = memory_order_seq_cst) const noexcept
{
memory_order __b = __m & __memory_order_mask;
__glibcxx_assert(__b != memory_order_release);
__glibcxx_assert(__b != memory_order_acq_rel);
return __atomic_load_n(&_M_p, __m);
}
According to the GCC documentation for builtins such as __atomic_load_n:
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
"An atomic operation can both constrain code motion and be mapped to hardware instructions for synchronization between threads (e.g., a fence). To which extent this happens is controlled by the memory orders, which are listed here in approximately ascending order of strength. The description of each memory order is only meant to roughly illustrate the effects and is not a specification; see the C++11 memory model for precise semantics.
__ATOMIC_RELAXED
Implies no inter-thread ordering constraints.
__ATOMIC_CONSUME
This is currently implemented using the stronger __ATOMIC_ACQUIRE memory order because of a deficiency in C++11's semantics for memory_order_consume.
__ATOMIC_ACQUIRE
Creates an inter-thread happens-before constraint from the release (or stronger) semantic store to this acquire load. Can prevent hoisting of code to before the operation.
__ATOMIC_RELEASE
Creates an inter-thread happens-before constraint to acquire (or stronger) semantic loads that read from this release store. Can prevent sinking of code to after the operation.
__ATOMIC_ACQ_REL
Combines the effects of both __ATOMIC_ACQUIRE and __ATOMIC_RELEASE.
__ATOMIC_SEQ_CST
Enforces total ordering with all other __ATOMIC_SEQ_CST operations. "
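So, if I am reading this right, it does "constrain code motion", which I take to mean that it prevents reordering. But I could be misinterpreting the documentation.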
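To make the point about the default argument concrete, here is a small sketch showing that reading an atomic through its conversion operator is the same as calling load() with memory_order_seq_cst. The function names read_implicit and read_explicit are illustrative only.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> b(20);

// Implicit form: the conversion operator calls load() with its default
// argument, which is std::memory_order_seq_cst.
int read_implicit() {
    int b_local = b;
    return b_local;
}

// Explicit form: the same load with the memory order spelled out.
int read_explicit() {
    return b.load(std::memory_order_seq_cst);
}
```

Both functions perform a sequentially consistent load, so whatever ordering guarantee applies to one applies to the other.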
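Yes, I think they can be reordered, along with several other optimizations.
Please check the following resource:
Atomic vs. Non-Atomic Operations
If you are still concerned about this, try using a mutex, which will certainly prevent memory reordering.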
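A minimal sketch of the mutex suggestion, assuming a and b can become plain ints that are only ever touched under one lock. The names m, read_b_then_a and update are hypothetical. Note that this changes the design: a reader gets a consistent snapshot of both values, so the order of the two reads inside the critical section no longer matters to other threads.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

std::mutex m;
int a = 10;  // plain ints: every access goes through the mutex m
int b = 20;

// Reads both values while holding the lock; no writer can run in between.
std::pair<int, int> read_b_then_a() {
    std::lock_guard<std::mutex> lock(m);
    int b_local = b;
    int a_local = a;
    return {b_local, a_local};
}

// Updates both values while holding the same lock.
void update(int new_a, int new_b) {
    std::lock_guard<std::mutex> lock(m);
    a = new_a;
    b = new_b;
}
```

If the algorithm really depends on another thread changing a between the two reads, a single lock around both reads is not a drop-in replacement.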
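Yes, they can be reordered, because one order is no different from the other and you have placed no constraint that forces a particular order. There is only one relation between these lines of code: int b_local = b; is sequenced before int a_local = a;. But since your code has only one thread and the two lines are independent, it is completely irrelevant to the third line of code (whatever that line may be) which of them completes first, so the compiler may reorder them.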
So, if you need to force a specific order, you need:
2+ threads of execution
a happens-before relation established between two operations in those threads.
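One common way to establish such a happens-before relation between two threads is a release store paired with an acquire load. The sketch below is an assumption-laden illustration, not the question's code: payload, ready, producer, consumer and run_once are all hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready(false);  // flag used to publish the payload

void producer() {
    payload = 42;                                  // (1) write the data
    ready.store(true, std::memory_order_release);  // (2) publish it
}

int consumer() {
    // (3) Spin until the flag is set. The acquire load that reads true
    // synchronizes with the release store in (2).
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // (4) happens after (1), so this reads 42
}

// Runs one producer and one consumer and returns what the consumer saw.
int run_once() {
    int seen = 0;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

Because (1) is sequenced before (2), (2) synchronizes with (3), and (3) is sequenced before (4), the write to payload happens before the read, even though payload itself is not atomic.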
The description of memory_order_acquire:
no memory accesses in the current thread can be reordered before this load.
Since the default memory order when loading b is memory_order_seq_cst, which is the strongest one, the read from a cannot be reordered before the read from b.
Even weaker memory orders, as in the code below, provide the same guarantee:
int b_local = b.load(std::memory_order_acquire);
int a_local = a.load(std::memory_order_relaxed);
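To show what this ordering buys across threads, here is a small sketch that pairs the two loads above with a writer. The writer is an assumption (the question does not show one): it updates a first and then publishes with a release store to b. The names writer, reader and run_once and the values 11 and 21 are illustrative.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> a(10);
std::atomic<int> b(20);

// Assumed writer: updates a, then publishes by storing to b with release.
void writer() {
    a.store(11, std::memory_order_relaxed);
    b.store(21, std::memory_order_release);
}

// Reader from the answer: acquire load of b, then relaxed load of a.
// Returns true when the observation is consistent: if the new b was seen,
// the new a must have been seen as well.
bool reader() {
    int b_local = b.load(std::memory_order_acquire);
    int a_local = a.load(std::memory_order_relaxed);
    return b_local != 21 || a_local == 11;
}

// Runs the writer and the reader concurrently and reports the reader's check.
bool run_once() {
    bool ok = false;
    std::thread t1(writer);
    std::thread t2([&ok] { ok = reader(); });
    t1.join();
    t2.join();
    return ok;
}
```

The reader may still see the old value of b. The guarantee is only that seeing the new b implies seeing the new a, which is exactly what would break if the load of a could move ahead of the load of b.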