std::shared_ptr 的 use_count() 周围的完整内存屏障是否会使其成为可靠的计数器？

Question

我正在实现一个线程安全的 "lazy-synchronized" 集作为由 shared_ptr 连接的节点的链表。算法来自"The Art of Multiprocessor Programming"。我正在添加一个需要与现有函数线性化的 is_empty() 函数：contains(), add(), remove()。在下面的代码中，您可以看到 remove 是一个两步过程。首先它"lazy"通过设置marked = nullptr标记节点，然后它物理移动链表next指针。

已修改类以支持 is_empty()

template <class T>
class LazySet : public Set<T> {
    public:
      LazySet ();
      bool contains (const T&) const;
      bool is_empty ()         const;
      bool add      (const T&);
      bool remove   (const T&);
    private:
      bool validate(const std::shared_ptr<Node>&, const std::shared_ptr<Node>&);
      class Node;
      std::shared_ptr<Node> head;
      std::shared_ptr<bool> counter; //note: type is unimportant, will never change true/fase
};

template <class T>
class LazySet<T>::Node {
    public:
      Node ();
      Node (const T&);
      T key;
      std::shared_ptr<bool> marked; //assume initialized to = LazySet.counter
                                    // nullptr means it's marked; otherwise unmarked
      std::shared_ptr<Node> next;
      std::mutex mtx;
};

相关修改方法支持is_empty

template <class T>
bool LazySet<T>::remove(const T& k) {
    std::shared_ptr<Node> pred;
    std::shared_ptr<Node> curr;
    while (true) {
        pred = head;
        curr = atomic_load(&(head->next));
        //Find window where key should be in sorted list
        while ((curr) && (curr->key < k)) {
            pred = atomic_load(&curr);
            curr = atomic_load(&(curr->next));
        }
        //Aquire locks on the window, left to right locking prevents deadlock
        (pred->mtx).lock();
        if (curr) { //only lock if not nullptr
            (curr->mtx).lock();
        }
        //Ensure window didn't change before locking, and then remove
        if (validate(pred, curr)) {
            if (!curr) { //key doesn't exist, do nothing
                //## unimportant ##
            } else { //key exists, remove it
                atomic_store(&(curr->marked), nullptr); //logical "lazy" remove
                atomic_store(&(pred->next), curr->next) //physically remove
                (curr->mtx).unlock();
                (pred->mtx).unlock();
                return true;
            }
        } else {
            //## unlock and loop again ##
        }
    }
}

template <class T>
bool LazySet<T>::contains(const T& k) const {
    std::shared_ptr<Node> curr;
    curr = atomic_load(&(head->next));
    //Find window where key should be in sorted list
    while ((curr) && (curr->key < k)) {
        curr = atomic_load(&(curr->next));
    }
    //Check if key exists in window
    if (curr) {
        if (curr->key == k) { //key exists, unless marked
            return (atomic_load(&(curr->marked)) != nullptr);
        } else { //doesn't exist
            return false;
        }
    } else { //doesn't exist
        return false;
    }
}

Node.marked 最初是一个普通的布尔值， LazySet.counter 不存在。使它们成为 shared_ptrs 的选择是能够以原子方式修改节点数量的计数器和节点上的延迟删除标记。在 remove() 中同时修改两者对于 is_empty() 与 contains() 可线性化是必要的。（它不能是一个单独的 bool 标记和 int 计数器，没有双宽 CAS 或其他东西。）我希望用 shared_ptr 的 use_count() 函数实现计数器，但在多线程上下文中它只是一个由于 relaxed_memory_order 的近似值。

我知道独立的围栏通常是不好的做法，而且我不太熟悉使用它们。 但是如果我像下面那样实现 is_empty，围栏是否会确保它不再是近似值，而是可靠计数器的精确值？

template <class T>
bool LazySet<T>::is_empty() const {
    // ## SOME FULL MEMORY BARRIER
    if (counter.use_count() == 1) {
        // ## SOME FULL MEMORY BARRIER
        return true
    }
    // ## SOME FULL MEMORY BARRIER
    return false
}

我问是因为 LWG Issue 2776 说：

We can't make use_count() reliable without adding substantially more fencing.

Answer 1

宽松的内存顺序不是这里的问题。 use_count 不是 "reliable"，因为在返回值时，它可能已经 更改了 。获取值本身没有数据竞争，但是没有什么可以阻止在基于该值的任何条件语句之前修改该值。

所以你不能用它做任何依赖于它的值仍然有意义的事情（除了，如果你仍然持有一个 shared_ptr 实例，那么使用计数不会去到 0）。使其可靠的唯一方法是防止它被更改。所以你需要有一个互斥锁。

而且那个互斥量必须锁定，不仅围绕 use_count 调用和使用，而且每次你分发其中一个 shared_ptr 时你都会得到 use_count 来自.

Answer 2

// ## SOME FULL MEMORY BARRIER
if (counter.use_count() == 1) {
    // ## SOME FULL MEMORY BARRIER

有了之前的 acquire fence，您可以确保您可以确保 "see" 其他线程中所有所有者的所有重置（包括赋值和销毁期间）的结果。 acquire fence 为所有后续操作提供了 acquire 语义，防止它们 "fetching values in the future"（这在任何情况下都是语义错乱，可能使所有程序正式成为 UB）。

（调用后没有有意义的围栏。）

std::shared_ptr 的 use_count() 周围的完整内存屏障是否会使其成为可靠的计数器？

Will full memory barriers around std::shared_ptr's use_count() make it a reliable counter?

c++

multithreading

reference-counting

shared-ptr

memory-barriers