Why are multiple shared_future objects needed to synchronize data?

Question

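A pointer to a data structure is shared with multiple threads via std::promise and std::shared_future. From Anthony Williams's book "C++ Concurrency in Action" (pp. 85-86), it seems that the data is only correctly synchronized when each receiving thread uses its own copy of the std::shared_future object, as opposed to every thread accessing a single global std::shared_future.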

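To illustrate, consider a thread that creates bigdata and passes a pointer to it to multiple threads that have read-only access. If data synchronization between the threads is not handled correctly, memory reordering may lead to undefined behavior (e.g. a worker_thread reading incomplete data).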

This (incorrect?) implementation uses a single global std::shared_future:

#include <future>
#include <thread>

struct bigdata { ... };

std::shared_future<bigdata *> global_sf;

void worker_thread()
{
    const bigdata *ptr = global_sf.get();
    ...  // ptr read-only access
}

int main()
{
    std::promise<bigdata *> pr;
    global_sf = pr.get_future().share();

    std::thread t1{worker_thread};
    std::thread t2{worker_thread};

    pr.set_value(new bigdata);
    ...
}

And in this (correct) implementation, each worker_thread gets its own copy of the std::shared_future:

void worker_thread(std::shared_future<bigdata *> sf)
{
    const bigdata *ptr = sf.get();
    ...
}

int main()
{
    std::promise<bigdata *> pr;
    auto sf = pr.get_future().share();

    std::thread t1{worker_thread, sf};
    std::thread t2{worker_thread, sf};

    pr.set_value(new bigdata);
    ...
}

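I am wondering why the first version is incorrect.

If std::shared_future::get() were a non-const member function, it would make sense, since accessing a single std::shared_future from multiple threads would then be a data race in itself. But since this member function is declared const, and the global_sf object is synchronized with the threads, it should be safe to access it concurrently from multiple threads.

My question is: why is this only guaranteed to work correctly if each worker_thread receives its own copy of the std::shared_future?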

Answer 1

Your implementation using a single global shared_future is completely fine, if a little unusual, and the book appears to be in error.

[futures.shared_future] ¶2

[ Note: Member functions of shared_future do not synchronize with themselves, but they synchronize with the shared state. — end note ]

Notes are non-normative, so the above merely makes explicit, redundantly, a fact that is already implicit in the normative wording.
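As a rough illustration of "synchronize with the shared state", here is a minimal sketch in which a plain write made before set_value() is visible to a reader once get() returns. The payload member and the read_after_publish function are hypothetical names invented for this example; they do not come from the book or from the question.

```cpp
#include <future>
#include <thread>

// Hypothetical stand-in for the question's bigdata type.
struct bigdata {
    int payload = 0;
};

// Returns the payload as observed by a reader thread that waits on the future.
int read_after_publish()
{
    std::promise<const bigdata *> pr;
    std::shared_future<const bigdata *> sf = pr.get_future().share();

    bigdata data;   // outlives the reader because we join before returning
    int seen = -1;

    std::thread reader{[sf, &seen] {
        // get() blocks until the shared state is ready. The set_value() call
        // below synchronizes with this return, so the earlier plain write to
        // data.payload happens before this read.
        seen = sf.get()->payload;
    }};

    data.payload = 42;    // ordinary, non-atomic write
    pr.set_value(&data);  // publishes the pointer by making the state ready

    reader.join();        // the write to seen happens before this read
    return seen;
}
```

Calling read_after_publish() from main should yield 42. The ordering guarantee here comes from the promise and the shared state, not from how many shared_future objects refer to that state.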

[intro.races] ¶2

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

¶6

Certain library calls synchronize with other library calls performed by another thread.

[...Additional paragraphs defining happens before in terms of synchronizes with...]

¶19

Two actions are potentially concurrent if they are performed by different threads... The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other...

[res.on.data.races] ¶3

A C++ standard library function shall not directly or indirectly modify objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.

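So we know that calls to global_sf.get() in different threads are potentially concurrent unless you accompany them with additional synchronization (e.g. a mutex). But we also know that calls to global_sf.get() in different threads do not conflict, because get() is a const member function and is therefore forbidden from modifying objects accessible from multiple threads, including *this. So the definition of a data race (unordered, potentially concurrent, conflicting actions) is not satisfied, and the program does not contain a data race.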
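To make that concrete, the following is a minimal sketch, under the reading of the standard given above, of several workers calling get() on one global shared_future without any extra locking. The names payload and sum_seen_by_workers are hypothetical and exist only for this illustration. Note that the only write to global_sf happens before any worker thread is created.

```cpp
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for the question's bigdata type.
struct bigdata {
    int payload = 0;
};

// Written only by the launching thread, before any worker exists.
std::shared_future<const bigdata *> global_sf;

// Starts worker_count threads that all read through the same global
// shared_future, then returns the sum of the payloads they observed.
int sum_seen_by_workers(int worker_count)
{
    std::promise<const bigdata *> pr;
    global_sf = pr.get_future().share();  // the only modification of global_sf

    std::vector<int> seen(worker_count, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < worker_count; ++i) {
        workers.emplace_back([&seen, i] {
            // Concurrent calls to the const member get() on the same object;
            // each worker writes only to its own element of seen.
            seen[i] = global_sf.get()->payload;
        });
    }

    bigdata data;
    data.payload = 7;
    pr.set_value(&data);  // makes the shared state ready for all workers

    for (auto &t : workers) {
        t.join();
    }

    int sum = 0;
    for (int v : seen) {
        sum += v;
    }
    return sum;
}
```

With four workers, sum_seen_by_workers(4) should return 28. Assigning to global_sf while workers are running would be a different matter, since assignment is a non-const operation on the shared object.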

One would generally want to avoid global variables anyway, but that is a separate issue.

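Note that if the book were correct, it would contain a contradiction. The code that it claims is correct still contains a global shared_future, which is accessed from multiple threads when they create their local copies: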

void worker_thread()
{
    auto local_sf = global_sf; // <-- unsynchronized access of global_sf here

    const bigdata *ptr = local_sf.get();
    ...
}
