How to use Intel TSX with C++ memory model?

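As far as I know, C++ does not yet cover any kind of transactional memory, but TSX can still be fitted, via the "as if" rule, into something governed by the C++ memory model.

So, what happens on a successful HLE operation or a successful RTM transaction?

Saying "there is data race, but it is ok" is not very helpful, because it does not clarify what "ok" means.

With HLE it can probably be viewed as "previous operation happens before subsequent operation. As if the section was still guarded by the lock that was elided".

What about RTM? There is not even an elided lock, only (possibly non-atomic) memory operations, which may be loads, stores, both, or nothing at all. What synchronizes with what? What happens before what?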

Apparently I should have read the "overview" pages thoroughly before diving into the specs or asking:

Hardware Lock Elision Overview

The hardware ensures program order of operations on the lock, even though the eliding processor did not perform external write operations to the lock. If the eliding processor itself reads the value of the lock in the critical section, it will appear as if the processor had acquired the lock (the read will return the non-elided value). This behavior makes an HLE execution functionally equivalent to an execution without the HLE prefixes.

Restricted Transactional Memory Overview

RTM Memory Ordering

A successful RTM commit causes all memory operations in the RTM region to appear to execute atomically. A successfully committed RTM region consisting of an XBEGIN followed by an XEND, even with no memory operations in the RTM region, has the same ordering semantics as a LOCK prefixed instruction. The XBEGIN instruction does not have fencing semantics. However, if an RTM execution aborts, all memory updates from within the RTM region are discarded and never made visible to any other logical processor.
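To make the quoted RTM rules concrete, here is a minimal sketch of the usual "transaction with a fallback lock" pattern. It assumes x86 with GCC or Clang (`<immintrin.h>`, `<cpuid.h>`, and the `target("rtm")` function attribute). All names (`fallback_lock`, `counter`, `try_transactional_increment`, `run_rtm_demo`) are hypothetical. The shared data is accessed through relaxed atomics so that the program stays race-free in C++ terms even on the fallback path, and the `std::atomic_signal_fence` calls are only compiler barriers added out of caution. Treat it as an illustration, not a tuned implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>
#include <cpuid.h>      // __get_cpuid_count (GCC/Clang, x86 only)
#include <immintrin.h>  // _xbegin, _xend, _xabort, _XBEGIN_STARTED

// Hypothetical globals for the sketch.
static std::atomic<int> fallback_lock{0};
static std::atomic<long> counter{0};

// CPUID.(EAX=7,ECX=0):EBX bit 11 reports RTM support.
static bool cpu_has_rtm() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return false;
    return ((ebx >> 11) & 1u) != 0;
}

// The target attribute lets the RTM intrinsics compile without -mrtm.
__attribute__((target("rtm")))
static bool try_transactional_increment() {
    if (_xbegin() != _XBEGIN_STARTED) return false;  // aborted: no update became visible
    std::atomic_signal_fence(std::memory_order_seq_cst);  // compiler barrier only
    // Reading the lock adds it to the read set, so a fallback writer aborts us.
    if (fallback_lock.load(std::memory_order_relaxed) != 0) _xabort(0xff);
    // Separate load and store: atomic as a unit only because the region commits atomically.
    counter.store(counter.load(std::memory_order_relaxed) + 1,
                  std::memory_order_relaxed);
    std::atomic_signal_fence(std::memory_order_seq_cst);
    _xend();  // successful commit
    return true;
}

static void increment(bool use_rtm) {
    for (int attempt = 0; use_rtm && attempt < 8; ++attempt) {
        if (try_transactional_increment()) return;
        // Wait until the fallback lock is free before retrying.
        while (fallback_lock.load(std::memory_order_relaxed) != 0)
            std::this_thread::yield();
    }
    // Fallback path: an ordinary spin lock around the same update.
    while (fallback_lock.exchange(1, std::memory_order_acquire) != 0)
        std::this_thread::yield();
    counter.store(counter.load(std::memory_order_relaxed) + 1,
                  std::memory_order_relaxed);
    fallback_lock.store(0, std::memory_order_release);
}

// Runs `threads` workers doing `iters` increments each; returns the final count.
static long run_rtm_demo(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    counter.store(0);
    const bool use_rtm = cpu_has_rtm();
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([=] {
            for (int i = 0; i < iters; ++i) increment(use_rtm);
        });
    for (auto& th : pool) th.join();
    return counter.load();
}
```

On a CPU without RTM (or with RTM forced to abort) every increment simply takes the fallback lock, so the result is the same either way; only the path taken differs.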

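To complete the answer:

A LOCK-prefixed instruction maps to C++ std::memory_order::seq_cst. This covers all successful RTM transactions (each behaves like a single LOCK-prefixed instruction). It also covers most HLE cases. Specifically: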

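  • LOCK-prefixed instructions behave as if they were actually executed, which implies seq_cst as well
  • XACQUIRE XCHG / XRELEASE XCHG likewise behave as if executed, which also implies seq_cst
  • Finally, XRELEASE MOV [mem], op behaves like MOV [mem], op, so it is just a release (under the usual implementation of the C++ memory model, where the sequentially consistent store carries the memory fence, not the load)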
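The mapping in the list above can be sketched in portable C++ as a spin lock. On x86, `exchange` is normally compiled to XCHG (implicitly locked, hence seq_cst) and a release store to a plain MOV. The sketch deliberately does not emit the XACQUIRE/XRELEASE prefixes; GCC documents `__ATOMIC_HLE_ACQUIRE` and `__ATOMIC_HLE_RELEASE` flags for its `__atomic` builtins for that purpose, and by the "as if" argument the ordering below should be the same with them. `SpinLock` and `run_lock_demo` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin lock showing the orderings the answer assigns to the
// (possibly elided) lock instructions.
class SpinLock {
    std::atomic<int> state_{0};
public:
    void lock() {
        // XCHG on x86: implicitly locked, so it acts as a seq_cst operation.
        while (state_.exchange(1, std::memory_order_seq_cst) != 0) {
            // Spin on a plain load until the lock looks free.
            while (state_.load(std::memory_order_relaxed) != 0)
                std::this_thread::yield();
        }
    }
    void unlock() {
        // Plain MOV on x86: a release store, no fence.
        state_.store(0, std::memory_order_release);
    }
};

// Runs `threads` workers doing `iters` locked increments each; returns the total.
long run_lock_demo(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    SpinLock lock;
    long counter = 0;  // plain variable, protected only by the lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;  // happens-after the previous holder's unlock()
                lock.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

Acquire on `lock()` would be enough for a correct lock; seq_cst is used here only to mirror what the XCHG instruction actually provides.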

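(The documentation links are for the Intel compiler. However, they document hardware behavior, so the information should apply to other compilers too. The only variable a compiler might introduce is compile-time reordering. I expect, though, that a compiler that implements the intrinsics also implements the appropriate reordering restrictions; if still unsure, place compiler barriers. With direct assembly, the assembly code should simply be marked accordingly.)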