在无锁单链表的开头插入节点时使用的正确内存顺序是什么?
What are the correct memory orders to use when inserting a node at the beginning of a lock free singly linked list?
我有一个简单的链表。没有 ABA 问题的危险,我对阻塞类别很满意,我不在乎我的列表是 FIFO、LIFO 还是随机的。只要插入成功不让其他人失败。
该代码看起来像这样:
class Class {
std::atomic<Node*> m_list;
...
};
void Class::add(Node* node)
{
node->next = m_list.load(std::memory_order_acquire);
while (!m_list.compare_exchange_weak(node->next, node, std::memory_order_acq_rel, std::memory_order_acquire));
}
我或多或少随机填写了用过的 memory_order。
此处使用的正确内存顺序是什么?
我见过人们在所有地方都使用 std::memory_order_relaxed
,SO 上的一个人也使用过它,但是 std::memory_order_release
compare_exchange_weak 的成功案例——以及 genmc项目在类似情况下使用 memory_order_acquire / 两次 memory_order_acq_rel,但我无法让 genmc 为测试用例工作:(.
使用来自 Michalis Kokologiannakis genmc 的 excellent 工具,我能够使用以下测试代码验证所需的内存顺序。不幸的是,genmc 目前需要 C 代码,但这对于确定内存顺序当然是无关紧要的。
// Install https://github.com/MPI-SWS/genmc
//
// Then test with:
//
// genmc -unroll 5 -- genmc_sll_test.c
// These header files are replaced by genmc (see /usr/local/include/genmc):
#include <pthread.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#define PRODUCER_THREADS 3
#define CONSUMER_THREADS 2
struct Node
{
struct Node* next;
};
struct Node* const deleted = (struct Node*)0xd31373d;
_Atomic(struct Node*) list;
void* producer_thread(void* node_)
{
struct Node* node = (struct Node*)node_;
// Insert node at beginning of the list.
node->next = atomic_load_explicit(&list, memory_order_relaxed);
while (!atomic_compare_exchange_weak_explicit(&list, &node->next,
node, memory_order_release, memory_order_relaxed))
;
return NULL;
}
void* consumer_thread(void* param)
{
// Replace the whole list with an empty list.
struct Node* head = atomic_exchange_explicit(&list, NULL, memory_order_acquire);
// Delete each node that was in the list.
while (head)
{
struct Node* orphan = head;
head = orphan->next;
// Mark the node as deleted.
assert(orphan->next != deleted);
orphan->next = deleted;
}
return NULL;
}
pthread_t t[PRODUCER_THREADS + CONSUMER_THREADS];
struct Node n[PRODUCER_THREADS]; // Initially filled with zeroes -->
// none of the Node's is marked as deleted.
int main()
{
// Start PRODUCER_THREADS threads that each append one node to the queue.
for (int i = 0; i < PRODUCER_THREADS; ++i)
if (pthread_create(&t[i], NULL, producer_thread, &n[i]))
abort();
// Start CONSUMER_THREAD threads that each delete all nodes that were added so far.
for (int i = 0; i < CONSUMER_THREADS; ++i)
if (pthread_create(&t[PRODUCER_THREADS + i], NULL, consumer_thread, NULL))
abort();
// Wait till all threads finished.
for (int i = 0; i < PRODUCER_THREADS + CONSUMER_THREADS; ++i)
if (pthread_join(t[i], NULL))
abort();
// Count number of elements still in the list.
struct Node* l = list;
int count = 0;
while (l)
{
++count;
l = l->next;
}
// Count the number of deleted elements.
int del_count = 0;
for (int i = 0; i < PRODUCER_THREADS; ++i)
if (n[i].next == deleted)
++del_count;
assert(count + del_count == PRODUCER_THREADS);
//printf("count = %d; deleted = %d\n", count, del_count);
return 0;
}
其输出为
$ genmc -unroll 5 -- genmc_sll_test.c
Number of complete executions explored: 6384
Total wall-clock time: 1.26s
将 memory_order_release
或 memory_order_acquire
替换为 memory_order_relaxed
会导致断言。
事实上,可以检查当仅插入节点时使用独占 memory_order_relaxed
足以将它们全部干净地放入列表中(尽管在 'random' 顺序中 - 没有任何顺序一致,因此如果由于其他原因存在此类相关性,则添加它们的顺序不一定与线程尝试添加它们的顺序相同。
然而,memory_order_release
是必需的,这样当用 memory_order_acquire
读取 head
时,我们可以确定所有非原子 next
指针在"consumer" 线程。
请注意,这里没有 ABA 问题,因为 head
和 next
的值在被 'consumer_thread' 函数删除之前不能是 "reused",这是only place where these node are allowed to deleted (therefore), 暗示只能有一个消费者线程(此测试代码不检查 ABA 问题,因此它也适用于使用 2 CONSUMER_THREADS)。
实际代码是一种垃圾回收机制,其中多个 "producer" 线程在可以删除时将指针添加到单向链表,但只有在一个特定线程中实际这样做才是安全的(在在这种情况下,只有一个 "consumer" 线程,它在主循环中的一个众所周知的地方执行垃圾收集。
我有一个简单的链表。没有 ABA 问题的危险,我对阻塞类别很满意,我不在乎我的列表是 FIFO、LIFO 还是随机的。只要插入成功不让其他人失败。
该代码看起来像这样:
class Class {
std::atomic<Node*> m_list;
...
};
void Class::add(Node* node)
{
node->next = m_list.load(std::memory_order_acquire);
while (!m_list.compare_exchange_weak(node->next, node, std::memory_order_acq_rel, std::memory_order_acquire));
}
我或多或少随机填写了用过的 memory_order。 此处使用的正确内存顺序是什么?
我见过人们在所有地方都使用 std::memory_order_relaxed
,SO 上的一个人也使用过它,但是 std::memory_order_release
compare_exchange_weak 的成功案例——以及 genmc项目在类似情况下使用 memory_order_acquire / 两次 memory_order_acq_rel,但我无法让 genmc 为测试用例工作:(.
使用来自 Michalis Kokologiannakis genmc 的 excellent 工具,我能够使用以下测试代码验证所需的内存顺序。不幸的是,genmc 目前需要 C 代码,但这对于确定内存顺序当然是无关紧要的。
// Install https://github.com/MPI-SWS/genmc
//
// Then test with:
//
// genmc -unroll 5 -- genmc_sll_test.c
// These header files are replaced by genmc (see /usr/local/include/genmc):
#include <pthread.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#define PRODUCER_THREADS 3
#define CONSUMER_THREADS 2
struct Node
{
struct Node* next;
};
struct Node* const deleted = (struct Node*)0xd31373d;
_Atomic(struct Node*) list;
void* producer_thread(void* node_)
{
struct Node* node = (struct Node*)node_;
// Insert node at beginning of the list.
node->next = atomic_load_explicit(&list, memory_order_relaxed);
while (!atomic_compare_exchange_weak_explicit(&list, &node->next,
node, memory_order_release, memory_order_relaxed))
;
return NULL;
}
void* consumer_thread(void* param)
{
// Replace the whole list with an empty list.
struct Node* head = atomic_exchange_explicit(&list, NULL, memory_order_acquire);
// Delete each node that was in the list.
while (head)
{
struct Node* orphan = head;
head = orphan->next;
// Mark the node as deleted.
assert(orphan->next != deleted);
orphan->next = deleted;
}
return NULL;
}
pthread_t t[PRODUCER_THREADS + CONSUMER_THREADS];
struct Node n[PRODUCER_THREADS]; // Initially filled with zeroes -->
// none of the Node's is marked as deleted.
int main()
{
// Start PRODUCER_THREADS threads that each append one node to the queue.
for (int i = 0; i < PRODUCER_THREADS; ++i)
if (pthread_create(&t[i], NULL, producer_thread, &n[i]))
abort();
// Start CONSUMER_THREAD threads that each delete all nodes that were added so far.
for (int i = 0; i < CONSUMER_THREADS; ++i)
if (pthread_create(&t[PRODUCER_THREADS + i], NULL, consumer_thread, NULL))
abort();
// Wait till all threads finished.
for (int i = 0; i < PRODUCER_THREADS + CONSUMER_THREADS; ++i)
if (pthread_join(t[i], NULL))
abort();
// Count number of elements still in the list.
struct Node* l = list;
int count = 0;
while (l)
{
++count;
l = l->next;
}
// Count the number of deleted elements.
int del_count = 0;
for (int i = 0; i < PRODUCER_THREADS; ++i)
if (n[i].next == deleted)
++del_count;
assert(count + del_count == PRODUCER_THREADS);
//printf("count = %d; deleted = %d\n", count, del_count);
return 0;
}
其输出为
$ genmc -unroll 5 -- genmc_sll_test.c
Number of complete executions explored: 6384
Total wall-clock time: 1.26s
将 memory_order_release
或 memory_order_acquire
替换为 memory_order_relaxed
会导致断言。
事实上,可以检查当仅插入节点时使用独占 memory_order_relaxed
足以将它们全部干净地放入列表中(尽管在 'random' 顺序中 - 没有任何顺序一致,因此如果由于其他原因存在此类相关性,则添加它们的顺序不一定与线程尝试添加它们的顺序相同。
然而,memory_order_release
是必需的,这样当用 memory_order_acquire
读取 head
时,我们可以确定所有非原子 next
指针在"consumer" 线程。
请注意,这里没有 ABA 问题,因为 head
和 next
的值在被 'consumer_thread' 函数删除之前不能是 "reused",这是only place where these node are allowed to deleted (therefore), 暗示只能有一个消费者线程(此测试代码不检查 ABA 问题,因此它也适用于使用 2 CONSUMER_THREADS)。
实际代码是一种垃圾回收机制,其中多个 "producer" 线程在可以删除时将指针添加到单向链表,但只有在一个特定线程中实际这样做才是安全的(在在这种情况下,只有一个 "consumer" 线程,它在主循环中的一个众所周知的地方执行垃圾收集。