boost::shared_mutex slower than a coarse std::mutex on Linux

Question

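I have a std::unordered_map that is subjected to a very heavy read workload from multiple threads. I could synchronize it with a std::mutex, but since concurrent reads should be safe, I wanted to use a boost::shared_mutex instead. To measure the performance improvement, I first pre-populate a map with a bunch of values and then have a bunch of threads run a read test: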

for (int i = 0; i < iters; ++i) map.count(random_uint(0, key_max));

I run this for my coarse-lock implementation, where count is protected by std::lock_guard<std::mutex>, and for my shared-lock implementation, where it is protected by boost::shared_lock<boost::shared_mutex>.
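For readers who want to reproduce the comparison without cloning the repository, the read test described above could be sketched roughly as follows. This is not the code from the linked repository: the names time_reads, coarse_reads, shared_reads and ReadResult are hypothetical, and std::shared_timed_mutex with std::shared_lock (C++14) is used as a stand-in so the sketch builds without Boost. To test the case in the question, swap in boost::shared_mutex and boost::shared_lock.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <random>
#include <shared_mutex>
#include <thread>
#include <unordered_map>
#include <vector>

using Map = std::unordered_map<std::uint32_t, int>;

struct ReadResult {
    double seconds;    // wall-clock time until all reader threads finished
    std::size_t hits;  // total successful lookups, keeps the loop observable
};

// Hypothetical helper: every thread performs `iters` lookups of random keys
// in [0, key_max], taking a lock of type Lock on a Mutex for each lookup.
template <typename Mutex, typename Lock>
ReadResult time_reads(const Map& map, int threads, int iters,
                      std::uint32_t key_max) {
    assert(threads > 0 && iters >= 0);
    Mutex mtx;
    std::atomic<std::size_t> total{0};

    auto worker = [&](unsigned seed) {
        std::mt19937 rng(seed);
        std::uniform_int_distribution<std::uint32_t> dist(0, key_max);
        std::size_t hits = 0;
        for (int i = 0; i < iters; ++i) {
            Lock guard(mtx);  // held only for the duration of one lookup
            hits += map.count(dist(rng));
        }
        total += hits;
    };

    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back(worker, 1234u + static_cast<unsigned>(t));
    for (auto& th : pool) th.join();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return {elapsed.count(), total.load()};
}

// Coarse lock: every reader excludes every other reader.
inline ReadResult coarse_reads(const Map& m, int threads, int iters,
                               std::uint32_t key_max) {
    return time_reads<std::mutex, std::lock_guard<std::mutex>>(
        m, threads, iters, key_max);
}

// Shared lock: readers are allowed to overlap (stand-in for boost::shared_mutex).
inline ReadResult shared_reads(const Map& m, int threads, int iters,
                               std::uint32_t key_max) {
    return time_reads<std::shared_timed_mutex,
                      std::shared_lock<std::shared_timed_mutex>>(
        m, threads, iters, key_max);
}
```

Comparing coarse_reads(...).seconds with shared_reads(...).seconds on the same pre-populated map gives the positive or negative difference discussed here. Note that the critical section is a single hash lookup, so the cost of acquiring and releasing the lock is likely to dominate the measurement.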

On my Arch Linux x86_64 system with GCC 6.1.1, the boost::shared_lock version is always slower! On my friend's Windows 10 system with MSVC 2013, the boost::shared_lock version is always faster! The complete, compilable code is on GitHub: https://github.com/silverhammermba/sanity

Edit

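This appears to be a platform-specific issue. See above. I would appreciate it if others could build and run this code and report whether they see a positive output (shared_lock is faster) or a negative output (the coarse mutex is faster), and which platform they are using.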

Answer 1

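A few notes:

If your data structure is under high contention, it is strongly recommended to use a lock-free implementation of the same data structure.
Reader-writer locks usually bring a performance gain when reads are common and writes are rare. Conceptually, a lock that has to determine whether other threads hold it in read mode or in write mode is slower than one that simply waits to be released. So if reads are common and writes are rare, other threads are not blocked and the extra bookkeeping pays off. If writes are common, threads are not only blocked, they also have to execute the additional logic that decides how the lock may be taken.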
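The read-mostly pattern from the second note could look roughly like the sketch below. The class name ReadMostlyMap is hypothetical and not taken from the question's code, and std::shared_timed_mutex is again used as a stand-in for boost::shared_mutex so that the sketch needs only the standard library.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical wrapper illustrating shared (read) versus exclusive (write) locking.
class ReadMostlyMap {
public:
    // Readers take the lock in shared mode, so lookups may overlap each other.
    bool contains(std::uint32_t key) const {
        std::shared_lock<std::shared_timed_mutex> lock(mtx_);
        return map_.count(key) != 0;
    }

    // Writers take the lock exclusively, blocking readers and other writers.
    void insert(std::uint32_t key, int value) {
        std::unique_lock<std::shared_timed_mutex> lock(mtx_);
        map_[key] = value;
    }

private:
    mutable std::shared_timed_mutex mtx_;  // mutable so const readers can lock it
    std::unordered_map<std::uint32_t, int> map_;
};
```

Whether this beats a plain std::mutex depends on how long each read holds the lock and on how the shared mutex is implemented on the platform; with very short reads, the extra bookkeeping can cost more than the concurrency gains.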

So, to summarize, your example uses the lock in the wrong way. And if you care about performance, use lock-free programming.

Answer 2

It turns out that boost::shared_mutex is "suboptimal" on Linux.

The current (as of boost 1.59) implementation of boost::shared_mutex for 'pthread' is pretty suboptimal as it's using a heavyweight mutex to guard the internal mutex state... [when access concurrency is high] the shared mutex is effectively exclusive.
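To make the quoted description more concrete, here is a deliberately simplified shared mutex whose internal state is guarded by an ordinary mutex. It is an illustration of the general design the quote describes, not Boost's actual source; the class NaiveSharedMutex and all of its members are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Simplified illustration only; this is NOT how Boost is actually written.
class NaiveSharedMutex {
public:
    void lock_shared() {
        // Every reader must first win the internal mutex, so readers
        // serialize here even though they are "sharing" the outer lock.
        std::unique_lock<std::mutex> guard(state_mtx_);
        readers_ok_.wait(guard, [this] { return !writer_; });
        ++readers_;
    }

    void unlock_shared() {
        // Releasing a shared lock takes the internal mutex a second time.
        std::unique_lock<std::mutex> guard(state_mtx_);
        if (--readers_ == 0) writer_ok_.notify_one();
    }

    void lock() {
        std::unique_lock<std::mutex> guard(state_mtx_);
        writer_ok_.wait(guard, [this] { return !writer_ && readers_ == 0; });
        writer_ = true;
    }

    void unlock() {
        {
            std::lock_guard<std::mutex> guard(state_mtx_);
            writer_ = false;
        }
        readers_ok_.notify_all();  // let all waiting readers in
        writer_ok_.notify_one();   // or hand over to one waiting writer
    }

    // Number of threads currently holding the lock in shared mode.
    int reader_count() {
        std::lock_guard<std::mutex> guard(state_mtx_);
        return readers_;
    }

private:
    std::mutex state_mtx_;  // guards readers_ and writer_
    std::condition_variable readers_ok_;
    std::condition_variable writer_ok_;
    int readers_ = 0;
    bool writer_ = false;
};
```

In a design like this, each shared acquisition costs two trips through state_mtx_ (one to lock, one to unlock), compared with one lock and unlock for a plain mutex. When the protected work is as short as a single map.count call and many threads read at once, they mostly contend on the internal mutex, which is consistent with the quote's remark that the shared mutex becomes effectively exclusive.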

Hooray, that stole many hours of my life.


c++

multithreading

boost

mutex

boost-mutex
