Is it sufficient to use std::memory_order_acq_rel with one atomic variable for add/sub/inc/dec?

As is well known, Release-Acquire ordering (std::memory_order_acq_rel) is sufficient when we use only one atomic variable and just store to or load from it: https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

But is the same true for the other basic wait-free operations, such as addition, subtraction, increment, and decrement?

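That is, is the next() function in the following C++ code thread-safe on both weak (ARM, ...) and strong (x86, ...) memory models, or does it need a different memory ordering (weaker or stronger)?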

#include <iostream>
#include <atomic>
using namespace std;

class progression_lf {
 public:
 progression_lf() : n(0) {}

 int next() {
    // memory_order_acq_rel - enough, and increases performance for the weak memory models: arm, ...
    int const current_n = n.fetch_add(1, std::memory_order_acq_rel);
    int result = 2 + (current_n - 1)*3;
    return result;
 }

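 // Ask the object itself: ATOMIC_INT_LOCK_FREE is a 0/1/2 macro, and 1 only means "sometimes lock-free".
 bool is_lock_free() const { return n.is_lock_free(); }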

 private:
 std::atomic<int> n;
};

int main() {

    // reference (single thread)
    for(int n = 0; n < 10; ++n) {
        std::cout << (2+(n-1)*3) << ", ";
    }
    std::cout << std::endl;

    // wait-free (multi-thread safety)
    progression_lf p;
    for(int n = 0; n < 10; ++n) {
        std::cout << (p.next()) << ", ";
    }
    std::cout << std::endl; 

    std::cout << "lock-free & wait-free: " << 
        std::boolalpha << p.is_lock_free() << 
        std::endl;

    return 0;
}

If your threads only need a unique number, I suspect you don't need any C++ memory ordering stronger than relaxed. Atomicity is enough, and std::memory_order_relaxed guarantees it:

Relaxed operation: there are no synchronization or ordering constraints, only atomicity is required of this operation.
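
To make the point concrete, here is a minimal sketch of several threads drawing numbers from one shared counter with a relaxed fetch_add. The helper names collect_ids and is_dense_sequence are made up for this illustration and do not appear in the question's code. The sketch assumes the counter is used only to hand out unique values and not to publish other data between threads.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: each of `threads` workers draws `per_thread` numbers
// from one shared counter using a relaxed fetch_add.
std::vector<int> collect_ids(int threads, int per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::vector<int>> drawn(threads);  // one private slot per worker

    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, &drawn, t, per_thread] {
            drawn[t].reserve(per_thread);
            for (int i = 0; i < per_thread; ++i) {
                // Atomicity alone ensures no two calls return the same value.
                drawn[t].push_back(counter.fetch_add(1, std::memory_order_relaxed));
            }
        });
    }
    // join() synchronizes with each worker, so reading `drawn` afterwards is safe.
    for (auto& w : workers) w.join();

    std::vector<int> all;
    for (auto const& v : drawn) all.insert(all.end(), v.begin(), v.end());
    std::sort(all.begin(), all.end());
    return all;
}

// True if the sorted values are exactly 0, 1, ..., size-1 (no duplicates, no gaps).
bool is_dense_sequence(std::vector<int> const& sorted) {
    for (std::size_t i = 0; i < sorted.size(); ++i)
        if (sorted[i] != static_cast<int>(i)) return false;
    return true;
}
```

Calling collect_ids(4, 1000) from main and passing the result to is_dense_sequence should confirm that every value from 0 to 3999 was handed out exactly once. If the counter were also used to signal that some other data is ready, relaxed would no longer be enough and acquire/release ordering would be needed.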

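In practice, though, an atomic read-modify-write operation still compiles to a hardware instruction on x86 (a lock-prefixed instruction such as lock xadd) that implies a full memory barrier, so on that platform relaxed and acq_rel cost the same.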
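
To compare the generated instructions yourself, a small sketch like the one below can be compiled with optimizations for each target and the assembly inspected. The global counter and the two function names are hypothetical and exist only to make the codegen easy to read. On x86-64 I would expect both functions to produce the same locked instruction, while on ARM the chosen ordering may show up as different instruction variants, depending on compiler and flags.

```cpp
#include <atomic>

// Hypothetical global counter, used only so the generated code is easy to inspect.
std::atomic<int> counter{0};

// Relaxed: atomicity only, no ordering constraints on surrounding memory accesses.
int next_relaxed() { return counter.fetch_add(1, std::memory_order_relaxed); }

// Acquire-release: additionally orders surrounding loads and stores around the RMW.
int next_acq_rel() { return counter.fetch_add(1, std::memory_order_acq_rel); }
```

Both functions return the value the counter held before the increment, so they behave identically as far as uniqueness goes; only the ordering guarantees toward other memory differ.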

You can see what different compilers generate for different platforms here.