thread.join does not return when called in global var destructor

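I am implementing an asynchronous printing class with the C++11 standard library and VS2013. I cannot get thread.join() to return without a deadlock. While debugging, I found that the problem seems to depend on whether the class instance is declared as a global or a local variable. The details are below; I don't know why this happens.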

#include <iostream>
#include <string>
#include <chrono>
#include <cstdio>      // sprintf_s (MSVC)
#include <cstdlib>     // system
#include <functional>  // bind
#include <mutex>
#include <thread>
#include <condition_variable>
#include "tbb/concurrent_queue.h"
using namespace std;

class logger
{
public:
    ~logger()
    {
        fin();
    }

    void init()
    {
        m_quit = false;
        m_thd = thread(bind(&logger::printer, this));
        //thread printer(bind(&logger::printer, this));
        //m_thd.swap(printer);
    }

    void fin()
    {
        //not needed
        //unique_lock<mutex> locker(m_mtx);
        if (m_thd.joinable())
        {
            m_quit = true;
            write("fin");
            //locker.unlock();

            m_thd.join();
        }
    }

    void write(const char *msg)
    {
        m_queue.push(msg);
        m_cond.notify_one();
    }

    void printer()
    {
        string msgstr;
        unique_lock<mutex> locker(m_mtx);
        while (1)
        {
            if (m_queue.try_pop(msgstr))
                cout << msgstr << endl;
            else if (m_quit)
                break;
            else
                m_cond.wait(locker);
        }
        cout << "printer quit" <<endl;
    }

    bool m_quit;
    mutex m_mtx;
    condition_variable m_cond;
    thread m_thd;
    tbb::concurrent_queue<string> m_queue;
};

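For convenience, I put thread.join in the class destructor to make sure m_thd exits properly. I tested the whole class and ran into a problem: m_thd.join() never returns when the logger instance is declared as a global variable, like this: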

logger lgg;

int main()
{
    lgg.init();
    for (int i = 0; i < 100; ++i)
    {
        char s[8];
        sprintf_s(s, 8, "%d", i);
        lgg.write(s);
    }

    //if lgg.fin() is called here first, m_thd can be joined normally
    //lgg.fin();

    system("pause");
    //hangs here; I observed that printer() finished successfully
}

If the logger instance is declared as a local variable, everything seems to work fine.

int main()
{
    logger lgg;
    lgg.init();
    for (int i = 0; i < 100; ++i)
    {
        char s[8];
        sprintf_s(s, 8, "%d", i);
        lgg.write(s);
    }

    system("pause");
}

Update 2015/02/27

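I wonder how you are using m_mtx. The normal pattern is that both threads lock it and both threads unlock it. But fin() never locks it.

Equally unexpected is m_cond.wait(m_mtx). That would release the mutex, except it was never locked in the first place!

Finally, since m_mtx is not locked, I don't see how m_quit = true is supposed to become visible in m_thd.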
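
The pattern the comments above describe can be sketched as follows. This is a minimal illustration, not the original logger: the function name quit_flag_handshake and its local variables are made up for the example. Both threads take the same mutex around the shared flag, which is what makes the write visible to the waiting thread.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper for illustration; not part of the original logger.
// Returns true once the worker has observed the quit flag and been joined.
bool quit_flag_handshake()
{
    std::mutex mtx;
    std::condition_variable cond;
    bool quit = false;      // shared state, only touched while mtx is held
    bool observed = false;  // written by the worker, read after join()

    std::thread worker([&] {
        std::unique_lock<std::mutex> lock(mtx);
        // The predicate form checks quit before blocking and after every
        // wakeup, so a notification sent before the wait starts is not lost.
        cond.wait(lock, [&] { return quit; });
        observed = true;
    });

    {
        std::lock_guard<std::mutex> lock(mtx);  // the writer locks too
        quit = true;
    }
    cond.notify_one();

    worker.join();  // join() synchronizes, so reading observed is safe
    return observed;
}
```

Calling quit_flag_handshake() from main should return true regardless of which thread reaches the mutex first.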

One problem you have is that std::condition_variable::notify_one is called while the same std::mutex that the waiting thread holds is held (this happens when logger::write is called).

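This causes the notified thread to block again immediately, so the printer thread may block indefinitely on destruction (or until a spurious wakeup).

You should not notify while holding the same mutex as the waiting thread.

Quoting en.cppreference.com: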

The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.
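
As a rough sketch of the advice quoted above, the logger could release the mutex before notifying. This is a rewrite under stated assumptions, not the original poster's code: queue_logger and queue_logger_demo are hypothetical names, a std::queue guarded by m_mtx replaces tbb::concurrent_queue, and output goes to a std::ostream passed in by the caller.

```cpp
#include <condition_variable>
#include <mutex>
#include <ostream>
#include <queue>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical rewrite for illustration; std::queue stands in for
// tbb::concurrent_queue so the example needs only the standard library.
class queue_logger
{
public:
    explicit queue_logger(std::ostream &out)
        : m_out(out), m_thd(&queue_logger::printer, this) {}

    ~queue_logger() { fin(); }

    void write(const std::string &msg)
    {
        {
            std::lock_guard<std::mutex> lock(m_mtx);
            m_queue.push(msg);
        }                     // mutex released here...
        m_cond.notify_one();  // ...before the printer is woken
    }

    void fin()
    {
        if (!m_thd.joinable())
            return;
        {
            std::lock_guard<std::mutex> lock(m_mtx);
            m_quit = true;
        }
        m_cond.notify_one();
        m_thd.join();
    }

private:
    void printer()
    {
        std::unique_lock<std::mutex> lock(m_mtx);
        for (;;)
        {
            m_cond.wait(lock, [this] { return m_quit || !m_queue.empty(); });
            // Drain pending messages first so nothing is lost on quit.
            // Output happens under the lock here only for brevity.
            while (!m_queue.empty())
            {
                m_out << m_queue.front() << '\n';
                m_queue.pop();
            }
            if (m_quit)
                break;
        }
    }

    std::ostream &m_out;
    bool m_quit = false;
    std::mutex m_mtx;
    std::condition_variable m_cond;
    std::queue<std::string> m_queue;
    std::thread m_thd;  // declared last so the other members exist first
};

// Usage: messages written before fin() are printed in order.
std::string queue_logger_demo()
{
    std::ostringstream out;
    queue_logger lg(out);
    lg.write("a");
    lg.write("b");
    lg.fin();
    return out.str();
}
```

Here both the queue and m_quit are only touched while m_mtx is held, so the predicate passed to wait cannot miss a message or the quit request.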

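Global and static variables are constructed just before DllMain is called for DLL_PROCESS_ATTACH and destroyed just after it is called for DLL_PROCESS_DETACH. The problem with this is that it occurs inside the loader lock, which is the most dangerous place on the planet to be if dealing with kernel objects, because it can cause deadlocks or make the application crash randomly. For that reason, you should never EVER use thread primitives as statics on Windows. Handling threads in the destructor of a global object is therefore basically doing what we are warned not to do in DllMain.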

To quote Raymond Chen:

The building is being demolished. Don't bother sweeping the floor and emptying the trash cans and erasing the whiteboards. And don't line up at the exit to the building so everybody can move their in/out magnet to out. All you're doing is making the demolition team wait for you to finish these pointless housecleaning tasks.

and again:

If your DllMain function creates a thread and then waits for the thread to do something (e.g., waits for the thread to signal an event that says that it has finished initializing), then you've created a deadlock. The DLL_PROCESS_ATTACH notification handler inside DllMain is waiting for the new thread to run, but the new thread can't run until the DllMain function returns so that it can send a new DLL_THREAD_ATTACH notification.

This deadlock is much more commonly seen in DLL_PROCESS_DETACH, where a DLL wants to shut down its worker threads and wait for them to clean up before it unloads itself. You can't wait for a thread inside DLL_PROCESS_DETACH because that thread needs to send out the DLL_THREAD_DETACH notifications before it exits, which it can't do until your DLL_PROCESS_DETACH handler returns.

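This happens even when you are building an EXE, because the Visual C++ runtime cheats and registers its constructors and destructors with the C runtime, to be run when the runtime is loaded or unloaded, so you end up with the same problem: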

The answer is that the C runtime library hires a lackey. The hired lackey is the C runtime library DLL (for example, MSVCR80.DLL). The C runtime startup code in the EXE registers all the destructors with the C runtime library DLL, and when the C runtime library DLL gets its DLL_PROCESS_DETACH, it calls all the destructors requested by the EXE.
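
The question itself notes that calling lgg.fin() before main returns lets the thread join normally. One way to make that call hard to forget is a small scope guard, sketched here under assumptions: worker, shutdown_guard, g_worker, and run_app are hypothetical names, and worker is a stripped-down stand-in for the logger.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical worker for illustration; it stands in for the logger.
// It has no joining destructor: fin() must be called explicitly.
class worker
{
public:
    void init()
    {
        m_quit = false;
        m_thd = std::thread([this] {
            while (!m_quit)
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
        });
    }

    void fin()
    {
        if (m_thd.joinable())
        {
            m_quit = true;
            m_thd.join();
        }
    }

    bool running() const { return m_thd.joinable(); }

private:
    std::atomic<bool> m_quit{false};
    std::thread m_thd;
};

// Calls fin() when the enclosing scope ends, including on early return
// or exception, so the join happens before main() returns.
struct shutdown_guard
{
    explicit shutdown_guard(worker &w) : m_w(w) {}
    ~shutdown_guard() { m_w.fin(); }
    shutdown_guard(const shutdown_guard &) = delete;
    shutdown_guard &operator=(const shutdown_guard &) = delete;
    worker &m_w;
};

worker g_worker;  // still a global, but nothing is left to join at exit

// Body that main() would run; returns whether the worker is stopped
// by the time control would leave main().
bool run_app()
{
    {
        shutdown_guard guard(g_worker);
        g_worker.init();
        // ... application work would go here ...
    }
    return !g_worker.running();
}
```

With this arrangement the join runs on the main thread while the process is still fully alive, instead of inside the runtime's teardown of global objects.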