boost::function 线程池中的解除分配分段错误

boost::function deallocation segmentation fault in thread pool

我正在尝试创建一个线程池来阻塞主线程,直到它的所有子线程都完成。真实世界的用例是一个 "Controller" 进程,它生成独立的进程供用户交互。

不幸的是,当main退出时,遇到了段错误。我无法弄清楚这个分段错误的原因。

我编写了一个 Process class,它只不过是打开一个 shell 脚本(名为 waiter.sh,其中包含一个 sleep 5)和等待 pid 退出。 Process class 被初始化,然后 Wait() 方法被放置在线程池中的一个线程中。

调用~thread_pool()时出现问题。 std::queue 无法正确释放传递给它的 boost::function,即使对 Process 的引用仍然有效。

#include <sys/types.h>
#include <sys/wait.h>
#include <spawn.h>

#include <queue>
#include <boost/bind.hpp>
#include <boost/thread.hpp>

extern char **environ;

class Process {
private:
    pid_t pid;
    int status;
public:

    Process() : status(0), pid(-1) {
    }

    ~Process() {
        std::cout << "calling ~Process" << std::endl;
    }

    void Spawn(char **argv) {
        // want spawn posix and wait for th epid to return
        status = posix_spawn(&pid, "waiter.sh", NULL, NULL, argv, environ);
        if (status != 0) {
            perror("unable to spawn");
            return;
        }
    }

    void Wait() {
        std::cout << "spawned proc with " << pid << std::endl;
        waitpid(pid, &status, 0);
        //        wait(&pid);
        std::cout << "wait complete" << std::endl;
    }

};

下面是thread_poolclass。这是根据这个 question

的公认答案粗略改编的
class thread_pool {
private:
    std::queue<boost::function<void() >> tasks;
    boost::thread_group threads;
    std::size_t available;
    boost::mutex mutex;
    boost::condition_variable condition;
    bool running;
public:

thread_pool(std::size_t pool_size) : available(pool_size), running(true) {
    std::cout << "creating " << pool_size << " threads" << std::endl;
    for (std::size_t i = 0; i < available; ++i) {
        threads.create_thread(boost::bind(&thread_pool::pool_main, this));
    }
}

~thread_pool() {
    std::cout << "~thread_pool" << std::endl;
    {
        boost::unique_lock<boost::mutex> lock(mutex);
        running = false;
        condition.notify_all();
    }

    try {
        threads.join_all();
    } catch (const std::exception &) {
        // supress exceptions
    }
}

template <typename Task>
void run_task(Task task) {

    boost::unique_lock<boost::mutex> lock(mutex);
    if (0 == available) {
        return; //\todo err
    }

    --available;

    tasks.push(boost::function<void()>(task));
    condition.notify_one();
    return;
}

private:

void pool_main() {

    // wait on condition variable while the task is empty and the pool is still 
    // running
    boost::unique_lock<boost::mutex> lock(mutex);
    while (tasks.empty() && running) {
        condition.wait(lock);
    }

    // copy task locally and remove from the queue. this is
    // done within it's own scope so that the task object is destructed 
    // immediately after running the task. This is useful in the
    // event that the function contains shared_ptr arguments
    // bound via 'bind'
    {
        auto task = tasks.front();
        tasks.pop();

        lock.unlock();

        // run the task
        try {
            std::cout << "running task" << std::endl;
            task();
        } catch (const std::exception &) {
            // supress
        }
    }

    // task has finished so increment count of availabe threads
    lock.lock();
    ++available;

    }
};

这里是主要内容:

int main() {

    // input arguments are not required
    char *argv[] = {NULL};
    Process process;
    process.Spawn(argv);

    thread_pool pool(5);

    pool.run_task(boost::bind(&Process::Wait, &process));

    return 0;
}

这个输出是

creating 5 threads
~thread_pool
I am waiting... (from waiting.sh)
running task
spawned proc with 2573
running task
running task
running task
running task
wait complete
Segmentation fault (core dumped)

这里是堆栈跟踪:

Starting program: /home/jandreau/NetBeansProjects/Controller/dist/Debug/GNU-    Linux/controller 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
creating 5 threads
[New Thread 0x7ffff691d700 (LWP 2600)]
[New Thread 0x7ffff611c700 (LWP 2601)]
[New Thread 0x7ffff591b700 (LWP 2602)]
[New Thread 0x7ffff511a700 (LWP 2603)]
[New Thread 0x7ffff4919700 (LWP 2604)]
~thread_pool
running task
running task
spawned proc with 2599
[Thread 0x7ffff611c700 (LWP 2601) exited]
running task
[Thread 0x7ffff591b700 (LWP 2602) exited]
running task
[Thread 0x7ffff511a700 (LWP 2603) exited]
running task
[Thread 0x7ffff4919700 (LWP 2604) exited]
I am waiting...
wait complete
[Thread 0x7ffff691d700 (LWP 2600) exited]

Thread 1 "controller" received signal SIGSEGV, Segmentation fault.
0x000000000040f482 in boost::detail::function::basic_vtable0<void>::clear (
this=0xa393935322068, functor=...)
at /usr/include/boost/function/function_template.hpp:509
509           if (base.manager)
(gdb) where
#0  0x000000000040f482 in boost::detail::function::basic_vtable0<void>::clear (
this=0xa393935322068, functor=...)
at /usr/include/boost/function/function_template.hpp:509
#1  0x000000000040e263 in boost::function0<void>::clear (this=0x62ef50)
at /usr/include/boost/function/function_template.hpp:883
#2  0x000000000040cf20 in boost::function0<void>::~function0 (this=0x62ef50, 
__in_chrg=<optimized out>)
at /usr/include/boost/function/function_template.hpp:765
#3  0x000000000040b28e in boost::function<void ()>::~function() (
this=0x62ef50, __in_chrg=<optimized out>)
at /usr/include/boost/function/function_template.hpp:1056
#4  0x000000000041193a in std::_Destroy<boost::function<void ()> >(boost::function<void ()>*) (__pointer=0x62ef50)
at /usr/include/c++/5/bits/stl_construct.h:93
#5  0x00000000004112df in  std::_Destroy_aux<false>::__destroy<boost::function<void ()>*>(boost::function<void ()>*, boost::function<void ()>*) (
__first=0x62ef50, __last=0x62ed50)
at /usr/include/c++/5/bits/stl_construct.h:103
#6  0x0000000000410d16 in std::_Destroy<boost::function<void ()>*>(boost::function<void ()>*, boost::function<void ()>*) (__first=0x62edd0, __last=0x62ed50)
at /usr/include/c++/5/bits/stl_construct.h:126
#7  0x0000000000410608 in std::_Destroy<boost::function<void ()>*, boost::function<void ()> >(boost::function<void ()>*, boost::function<void ()>*, std::allocat---Type <return> to continue, or q <return> to quit---
or<boost::function<void ()> >&) (__first=0x62edd0, __last=0x62ed50)
at /usr/include/c++/5/bits/stl_construct.h:151
#8  0x000000000040fac5 in std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > >::_M_destroy_data_aux(std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>) (this=0x7fffffffdaf0, __first=..., __last=...)
at /usr/include/c++/5/bits/deque.tcc:845
#9  0x000000000040e6e4 in std::deque<boost::function<void ()>,  std::allocator<boost::function<void ()> > >::_M_destroy_data(std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::_Deque_iterator<boost::function<void ()>, boost::function<void ()>&, boost::function<void ()>*>, std::allocator<boost::function<void ()> > const&) (
this=0x7fffffffdaf0, __first=..., __last=...)
at /usr/include/c++/5/bits/stl_deque.h:2037
#10 0x000000000040d0c8 in std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > >::~deque() (this=0x7fffffffdaf0, 
__in_chrg=<optimized out>) at /usr/include/c++/5/bits/stl_deque.h:1039
#11 0x000000000040b3ce in std::queue<boost::function<void ()>, std::deque<boost::function<void ()>, std::allocator<boost::function<void ()> > > >::~queue() (
this=0x7fffffffdaf0, __in_chrg=<optimized out>)
at /usr/include/c++/5/bits/stl_queue.h:96
#12 0x000000000040b6c0 in thread_pool::~thread_pool (this=0x7fffffffdaf0, 
---Type <return> to continue, or q <return> to quit---
__in_chrg=<optimized out>) at main.cpp:63  
#13 0x0000000000408b60 in main () at main.cpp:140

我对此感到困惑,因为 Process 尚未超出范围,我正在将 boost::function<void()> 的副本传递给线程池进行处理。

有什么想法吗?

您似乎没有为

分配内存
extern char **environ;

任何地方。虽然那不是 link 错误吗?

将其缩减为最小的复制案例会有很大帮助。这里有很多代码可能不需要重现问题。

还有,这是什么:

    // supress exceptions

如果您在加入线程时遇到异常,那么您可能没有加入所有线程,并且在未加入线程的情况下清理线程将在 main 退出后导致错误。

堆栈跟踪表明您正在销毁未正确初始化的 std::function(例如,一些随机内存位置被视为 std::function)或者您正在销毁 std::function两次。

问题是你的程序只推送到 tasks 一次,但弹出五次,因此你从一个空的双端队列中删除元素,这是未定义的行为。

如果running为假,while中的while循环终止,即使deque为空,running也可能为假。那你pop无条件。您可以考虑更正 pool_main 如下:

void pool_main() {

    // wait on condition variable
    // while the task is empty and the pool is still 
    // running
    boost::unique_lock<boost::mutex> lock(mutex);
    while (tasks.empty() && running) {
        condition.wait(lock);
    }

    // copy task locally and remove from the queue. this is
    // done within it's own scope so that the task object is destructed 
    // immediately after running the task. This is useful in the
    // event that the function contains shared_ptr arguments
    // bound via 'bind'
    if (!tasks.empty ()) {  // <--- !!!!!!!!!!!!!!!!!!!!!!!!
        auto task = tasks.front();
        tasks.pop();

        lock.unlock();

        // run the task
        try {
            std::cout << "running task" << std::endl;
            task();
        } catch (const std::exception &) {
            // supress
        }
    }

    // task has finished so increment count of availabe threads
    lock.lock();
    ++available;
};

但是,我不确定关于available的逻辑是否正确。 available 不应该在开始处理任务时递减并在完成时递增(因此仅在 pool_main 内且仅在新引入的 if 子句内更改)?