无锁链表的性能比有锁链表差

Question

我正在尝试比较锁定和无锁定链表数据结构的性能。我已经为无锁链表实现了 this 算法。这两个程序都是用 C 实现的。

我正在测试 4 个线程。每个线程有 1000 个插入操作。

我正在使用英特尔 PCM 工具来测量性能。

以下是我的发现：无锁：

Median of L3 Misses=1012699
Median of L2 Misses=1479741
Median of L3 Hits=484128
Median of L2 Hits=1797537
Median of Time=1.80696
Median of Cycles=5296042019
Median of IPC=1.536135
Median of Bytes Read=444423232
Median of Bytes Written=25414144

已锁定：

Median of L3 Misses=711796.5
Median of L2 Misses=1517899
Median of L3 Hits=819408.5
Median of L2 Hits=2282527
Median of Time=0.244517
Median of Cycles=894265192
Median of IPC=0.8495695
Median of Bytes Read=174872576
Median of Bytes Written=24722912

除了 IPC 之外，锁定版本在所有方面都表现得更好。这是应该发生的事情吗？或者 lock Free 数据结构应该表现更好？

如果是，那么使用无锁数据结构有什么好处？任何帮助将不胜感激。

Answer 1

The locked version performs better on every count except IPC. Is this what is supposed to happen? Or the lock Free data structure should perform better?

一般来说，哪个性能更好取决于工作负载细节和实施细节。你引用的论文说

Lock-free data structures also have the potential to have better performance

（强调），但它并不保证在每种情况下都有更好的性能。虽然多个线程可以同时修改无锁数据结构，但即使没有冲突，每次修改也会涉及更多的操作。当存在冲突时，性能会下降。

我还观察到您的无锁代码的缓存未命中率高于锁定代码。尽管我不能自信地解释这一点，但我至少可以想到两个看似合理的原因，说明这将是无锁实现的预期结果。自然地，低效的缓存使用会显着降低性能。

If yes, then what is the benefit of using lock free data structures?

论文说主要好处是：

If an implementation is lock-free, delays or failures of individual processes do not block the progress of other processes in the system.

无锁链表的性能比有锁链表差

Lock free linked list is performing worse than the locked counterpart

c

multithreading

profiling

linked-list

lock-free