gcc, __atomic_exchange seems to produce non-atomic asm, why?

Question

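I am working on a tool that needs to atomically exchange two different 64-bit values. On the amd64 architecture this is possible with the XCHGQ instruction (see here in the doc; warning: it is a long PDF).

Correspondingly, gcc has some atomic builtins which ideally would do the same, as can be seen, for example, here.

Using these 2 docs, I produced the following simple C function for the atomic exchange of two 64-bit values: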

void theExchange(u64* a, u64* b) {
  __atomic_exchange(a, b, b, __ATOMIC_SEQ_CST);
};

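(By the way, it wasn't really clear to me why an "atomic exchange" needs 3 operands.)

It looked a little suspicious to me that the gcc __atomic_exchange macro takes 3 operands, so I checked its asm output. I compiled the function with gcc -O6 -masm=intel -S and got the following output: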

.LHOTB0:
        .p2align 4,,15
        .globl  theExchange
        .type   theExchange, @function
theExchange:
.LFB16:
        .cfi_startproc
        mov     rax, QWORD PTR [rsi]
        xchg    rax, QWORD PTR [rdi] /* WTF? */
        mov     QWORD PTR [rsi], rax
        ret
        .cfi_endproc
.LFE16:
        .size   theExchange, .-theExchange
        .section        .text.unlikely

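As we can see, the resulting function contains not a single data movement but three separate ones. So, as far as I understand this asm code, the function is not really atomic.

How is this possible? Did I misunderstand some of the docs? I admit that the gcc builtin documentation wasn't really clear to me.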

Answer 1

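This is the generic version of __atomic_exchange_n (type *ptr, type val, int memorder), in which only the exchange operation on ptr is atomic; the read of val is not. The generic form is __atomic_exchange (type *ptr, type *val, type *ret, int memorder): it atomically stores *val into *ptr and writes the previous contents of *ptr to *ret, which is why it takes three operands. In the generic version val is accessed through a pointer, but atomicity still does not apply to it. The pointers are there so that the builtin can work with multiple sizes when the compiler has to call an external helper: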

The four non-arithmetic functions (load, store, exchange, and compare_exchange) all have a generic version as well. This generic version works on any data type. It uses the lock-free built-in function if the specific data type size makes that possible; otherwise, an external call is left to be resolved at run time. This external call is the same format with the addition of a ‘size_t’ parameter inserted as the first parameter indicating the size of the object being pointed to. All objects must be the same size.
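
To make the point concrete, here is a rough sketch of how the question's call could be read, not a statement of what gcc does internally. It assumes u64 is a typedef for uint64_t (the question does not show its definition), and exchange_expanded is a hypothetical name introduced only for illustration. The hand-expanded version lines up with the three instructions in the asm listing: a plain load of *b, an atomic exchange on *a, and a plain store to *b.

```c
#include <stdint.h>

typedef uint64_t u64; /* assumption: the question's u64 is a 64-bit unsigned type */

/* Generic builtin, as in the question: val and ret are passed by pointer. */
void exchange_generic(u64 *a, u64 *b) {
    __atomic_exchange(a, b, b, __ATOMIC_SEQ_CST);
}

/* Hypothetical hand-expanded equivalent using the _n variant.
 * Only the middle step is atomic, and only with respect to *a. */
void exchange_expanded(u64 *a, u64 *b) {
    u64 val = *b;                                             /* plain load of *b  */
    u64 old = __atomic_exchange_n(a, val, __ATOMIC_SEQ_CST);  /* atomic swap on *a */
    *b = old;                                                 /* plain store to *b */
}
```

In a single thread both functions leave the two values swapped. Neither makes the swap of two memory locations atomic as a whole: xchg accepts at most one memory operand, so another thread can still observe or modify *b between the load and the store.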

Tags: assembly, gcc, atomic, gnu-assembler, reentrancy