Segmentation fault on class destruction with boost::timer as a member of the class with periodic invocation

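I am developing a simple class that, on creation, schedules a periodic timer to invoke one of its methods. The method is virtual, so derived classes can override it with whatever periodic work they need.

However, while testing this class I randomly get a segmentation fault and cannot figure out why. Here is the code, followed by examples of good and bad output: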

#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/lock_guard.hpp>
#include <boost/asio/steady_timer.hpp>
#include <boost/chrono.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/function.hpp>
#include <boost/atomic.hpp>
#include <boost/make_shared.hpp>
#include <boost/bind.hpp>

//******************************************************************************
class PeriodicImpl;
class Periodic {
public:
    Periodic(boost::asio::io_service& io, unsigned int periodMs);
    ~Periodic();

    virtual unsigned int periodicInvocation() = 0;

private:
    boost::shared_ptr<PeriodicImpl> pimpl_;
};

//******************************************************************************
class PeriodicImpl : public boost::enable_shared_from_this<PeriodicImpl> 
{
public:
    PeriodicImpl(boost::asio::io_service& io, unsigned int periodMs,
        boost::function<unsigned int(void)> workFunc);
    ~PeriodicImpl();

    void setupTimer(unsigned int intervalMs);

    boost::atomic<bool> isRunning_;
    unsigned int periodMs_;
    boost::asio::io_service& io_;
    boost::function<unsigned int(void)> workFunc_;
    boost::asio::steady_timer timer_;
};

//******************************************************************************
Periodic::Periodic(boost::asio::io_service& io, unsigned int periodMs):
pimpl_(boost::make_shared<PeriodicImpl>(io, periodMs, boost::bind(&Periodic::periodicInvocation, this)))
{
    std::cout << "periodic ctor " << pimpl_.use_count() << std::endl;
    pimpl_->setupTimer(periodMs);
}

Periodic::~Periodic()
{
    std::cout << "periodic dtor " << pimpl_.use_count() << std::endl;
    pimpl_->isRunning_ = false;
    pimpl_->timer_.cancel();
    std::cout << "periodic dtor end " << pimpl_.use_count() << std::endl;
}

//******************************************************************************
PeriodicImpl::PeriodicImpl(boost::asio::io_service& io, unsigned int periodMs,
    boost::function<unsigned int(void)> workFunc):
isRunning_(true),
periodMs_(periodMs), io_(io), workFunc_(workFunc), timer_(io_)
{
}

PeriodicImpl::~PeriodicImpl()
{
    std::cout << "periodic impl dtor" << std::endl;
}

void
PeriodicImpl::setupTimer(unsigned int intervalMs)
{
    std::cout << "schedule new " << intervalMs << std::endl;
    boost::shared_ptr<PeriodicImpl> self(shared_from_this());

    timer_.expires_from_now(boost::chrono::milliseconds(intervalMs));
    timer_.async_wait([self, this](const boost::system::error_code& e){
        std::cout << "hello invoke" << std::endl;
        if (!e)
        {
            if (isRunning_)
            {
                std::cout << "invoking" << std::endl;
                unsigned int nextIntervalMs = workFunc_();
                if (nextIntervalMs)
                    setupTimer(nextIntervalMs);
            }
            else
                std::cout << "invoke not running" << std::endl;
        }
        else
            std::cout << "invoke cancel" << std::endl;
    });

    std::cout << "scheduled " << self.use_count() << std::endl;
}

//******************************************************************************
class PeriodicTest : public Periodic
{
public:
    PeriodicTest(boost::asio::io_service& io, unsigned int periodMs):
        Periodic(io, periodMs), periodMs_(periodMs), workCounter_(0){}
    ~PeriodicTest(){
        std::cout << "periodic test dtor" << std::endl;
    }

    unsigned int periodicInvocation() {
        std::cout << "invocation " << workCounter_ << std::endl;
        workCounter_++;
        return periodMs_;
    }

    unsigned int periodMs_;
    unsigned int workCounter_;
};

//******************************************************************************
int main()
{
    boost::asio::io_service io;
    boost::shared_ptr<boost::asio::io_service::work> work(new boost::asio::io_service::work(io));
    boost::thread t([&io](){
        io.run();
    });
    unsigned int workCounter = 0;

    {
        PeriodicTest p(io, 50);
        boost::this_thread::sleep_for(boost::chrono::milliseconds(550));
        workCounter = p.workCounter_;
    }
    work.reset();
    //EXPECT_EQ(10, workCounter);
}

Good output:

hello invoke
invoking
invocation 9
schedule new 50
scheduled 5
periodic test dtor
periodic dtor 2
periodic dtor end 2
hello invoke
invoke cancel
periodic impl dtor

Bad output:

hello invoke
invoking
invocation 9
schedule new 50
scheduled 5
periodic test dtor
periodic dtor 2
periodic dtor end 2
periodic impl dtor
Segmentation fault: 11

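Apparently, the segmentation fault happens because PeriodicImpl is destroyed, and with it its timer timer_, while the timer is still scheduled. That leads to the SEGFAULT. I don't understand why the PeriodicImpl destructor is called in this case: a shared_ptr<PeriodicImpl> was copied during the setupTimer call into the lambda passed as the timer's handler, and that copy should keep PeriodicImpl alive and prevent the destructor from running.

Any ideas?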
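The ownership reasoning in the question can be checked in isolation. Below is a minimal sketch that uses std::shared_ptr and std::function instead of the Boost types; Impl, makeHandler and handlerKeepsObjectAlive are hypothetical names introduced only for this illustration. It shows that a handler holding a captured shared_ptr copy keeps the object alive for as long as the handler itself exists, and that the object goes away once the handler is destroyed, whether or not the handler ever ran.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for PeriodicImpl.
struct Impl {
    int value = 42;
};

// Builds a handler that captures a shared_ptr copy, the same way the
// timer lambda in setupTimer captures `self`.
std::function<int()> makeHandler(const std::shared_ptr<Impl>& impl) {
    std::shared_ptr<Impl> self = impl;
    return [self]() { return self->value; };
}

// Returns true if the object outlives its original owner while the
// handler exists, and is destroyed once the handler is destroyed.
bool handlerKeepsObjectAlive() {
    auto owner = std::make_shared<Impl>();
    std::weak_ptr<Impl> watcher = owner;   // observes lifetime without owning

    std::function<int()> handler = makeHandler(owner);
    owner.reset();                         // the original owner lets go

    bool aliveWhileHandlerExists = !watcher.expired();
    int result = handler();                // still safe to use the object

    handler = nullptr;                     // drop the handler and its captured copy
    bool destroyedWithHandler = watcher.expired();

    return aliveWhileHandlerExists && destroyedWithHandler && result == 42;
}
```

So the capture does what the question expects. The destructor can still run if whatever stores the pending handler discards it, which is worth keeping in mind when reading the "periodic impl dtor" line in the bad output.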

I ran your program. Regrettably, it failed to compile. I changed the following line in your program:

timer_.expires_from_now(boost::chrono::milliseconds(intervalMs));

to:

timer_.expires_from_now(std::chrono::milliseconds(intervalMs));

With that change, I get the same result as your "Good output", but I could not reproduce your "Bad output".

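It turned out that the problem was not in the code in question at all, but in the code that tests it.

I enabled core dump files by running ulimit -c unlimited, and then read the dump with lldb: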

$ lldb bin/tests/test-segment-controller -c /cores/core.75876
(lldb) bt all
* thread #1: tid = 0x0000, 0x00007fff8eb800f9 libsystem_malloc.dylib`szone_malloc_should_clear + 2642, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff8eb800f9 libsystem_malloc.dylib`szone_malloc_should_clear + 2642
    frame #1: 0x00007fff8eb7f667 libsystem_malloc.dylib`malloc_zone_malloc + 71
    frame #2: 0x00007fff8eb7e187 libsystem_malloc.dylib`malloc + 42
    frame #3: 0x00007fff9569923e libc++abi.dylib`operator new(unsigned long) + 30
    frame #4: 0x000000010da4b516 test-periodic`testing::Message::Message(this=0x00007fff521e8450) + 38 at gtest.cc:946
    frame #5: 0x000000010da4a645 test-periodic`testing::Message::Message(this=0x00007fff521e8450) + 21 at gtest.cc:946
    frame #6: 0x000000010da6c027 test-periodic`std::string testing::internal::StreamableToString<long long>(streamable=0x00007fff521e84b0) + 39 at gtest-message.h:244
    frame #7: 0x000000010da558e8 test-periodic`testing::internal::PrettyUnitTestResultPrinter::OnTestEnd(this=0x00007fe733421570, test_info=0x00007fe7334211c0) + 216 at gtest.cc:3141
    frame #8: 0x000000010da56a28 test-periodic`testing::internal::TestEventRepeater::OnTestEnd(this=0x00007fe733421520, parameter=0x00007fe7334211c0) + 136 at gtest.cc:3321
    frame #9: 0x000000010da53957 test-periodic`testing::TestInfo::Run(this=0x00007fe7334211c0) + 343 at gtest.cc:2667
    frame #10: 0x000000010da540c7 test-periodic`testing::TestCase::Run(this=0x00007fe733421660) + 231 at gtest.cc:2774
    frame #11: 0x000000010da5b5d6 test-periodic`testing::internal::UnitTestImpl::RunAllTests(this=0x00007fe733421310) + 726 at gtest.cc:4649
    frame #12: 0x000000010da83263 test-periodic`bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(object=0x00007fe733421310, method=0x000000010da5b300, location="auxiliary test code (environments or event listeners)")(), char const*) + 131 at gtest.cc:2402
    frame #13: 0x000000010da6cde1 test-periodic`bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(object=0x00007fe733421310, method=0x000000010da5b300, location="auxiliary test code (environments or event listeners)")(), char const*) + 113 at gtest.cc:2438
    frame #14: 0x000000010da5b2a2 test-periodic`testing::UnitTest::Run(this=0x000000010dab18e8) + 210 at gtest.cc:4257
    frame #15: 0x000000010da19541 test-periodic`RUN_ALL_TESTS() + 17 at gtest.h:2233
    frame #16: 0x000000010da1818b test-periodic`main(argc=1, argv=0x00007fff521e88b8) + 43 at test-periodic.cc:57
    frame #17: 0x00007fff9557b5c9 libdyld.dylib`start + 1
    frame #18: 0x00007fff9557b5c9 libdyld.dylib`start + 1

  thread #2: tid = 0x0001, 0x00007fff8ab404cd libsystem_pthread.dylib`_pthread_mutex_lock + 23, stop reason = signal SIGSTOP
    frame #0: 0x00007fff8ab404cd libsystem_pthread.dylib`_pthread_mutex_lock + 23
    frame #1: 0x000000010da1c8d5 test-periodic`boost::asio::detail::posix_mutex::lock(this=0x0000000000000030) + 21 at posix_mutex.hpp:52
    frame #2: 0x000000010da1c883 test-periodic`boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>::scoped_lock(this=0x000000010e4fac38, m=0x0000000000000030) + 51 at scoped_lock.hpp:46
    frame #3: 0x000000010da1c79d test-periodic`boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>::scoped_lock(this=0x000000010e4fac38, m=0x0000000000000030) + 29 at scoped_lock.hpp:45
    frame #4: 0x000000010da227a7 test-periodic`boost::asio::detail::kqueue_reactor::run(this=0x0000000000000000, block=true, ops=0x000000010e4fbda8) + 103 at kqueue_reactor.ipp:355
    frame #5: 0x000000010da2223c test-periodic`boost::asio::detail::task_io_service::do_run_one(this=0x00007fe733421900, lock=0x000000010e4fbd60, this_thread=0x000000010e4fbd98, ec=0x000000010e4fbe58) + 348 at task_io_service.ipp:368
    frame #6: 0x000000010da21e9f test-periodic`boost::asio::detail::task_io_service::run(this=0x00007fe733421900, ec=0x000000010e4fbe58) + 303 at task_io_service.ipp:153
    frame #7: 0x000000010da21d51 test-periodic`boost::asio::io_service::run(this=0x00007fff521e8338) + 49 at io_service.ipp:59
    frame #8: 0x000000010da184b8 test-periodic`TestPeriodic_TestDestructionDifferentThread_Test::TestBody(this=0x00007fe733421e28)::$_0::operator()() const + 24 at test-periodic.cc:41
    frame #9: 0x000000010da1846c test-periodic`boost::detail::thread_data<TestPeriodic_TestDestructionDifferentThread_Test::TestBody()::$_0>::run(this=0x00007fe733421c10) + 28 at thread.hpp:117
    frame #10: 0x000000010da8849c test-periodic`boost::(anonymous namespace)::thread_proxy(param=<unavailable>) + 124 at thread.cpp:164
    frame #11: 0x00007fff8ab4305a libsystem_pthread.dylib`_pthread_body + 131
    frame #12: 0x00007fff8ab42fd7 libsystem_pthread.dylib`_pthread_start + 176
    frame #13: 0x00007fff8ab403ed libsystem_pthread.dylib`thread_start + 13

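Apparently, thread 2 crashed while trying to lock a mutex that had already been destroyed. I am not using any mutexes myself, so this must be something internal to io_service. That can happen if io_service is still in use after it has been destroyed. Looking closely at my main() function, I noticed that the thread t I created is left dangling, i.e. join() is never called on it. As a result, the io object is sometimes destroyed (at the end of main) while thread t is still trying to use it.

So the problem was solved by adding a t.join() call at the end of main():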

int main()
{
    boost::asio::io_service io;
    boost::shared_ptr<boost::asio::io_service::work> work(new boost::asio::io_service::work(io));
    boost::thread t([&io](){
        io.run();
    });
    unsigned int workCounter = 0;

    {
        PeriodicTest p(io, 50);
        boost::this_thread::sleep_for(boost::chrono::milliseconds(550));
        workCounter = p.workCounter_;
    }
    work.reset();
    t.join();
    //EXPECT_EQ(10, workCounter);
}
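
The same ordering guarantee can also be enforced with a small RAII guard, so the join cannot be forgotten on an early return. This is only a sketch under assumptions: it uses std::thread rather than boost::thread, and Service, JoinGuard and runService are hypothetical names, with Service standing in for io_service and the stop flag standing in for work.reset().

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for io_service: an object the worker thread
// keeps touching until it is told to stop.
struct Service {
    std::atomic<bool> stop{false};
    std::atomic<int> ticks{0};

    void run() {
        do {
            ++ticks;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        } while (!stop);
    }
};

// Hypothetical RAII guard: joins the referenced thread on scope exit.
class JoinGuard {
public:
    explicit JoinGuard(std::thread& t) : t_(t) {}
    ~JoinGuard() {
        if (t_.joinable())
            t_.join();
    }
    JoinGuard(const JoinGuard&) = delete;
    JoinGuard& operator=(const JoinGuard&) = delete;

private:
    std::thread& t_;
};

// Runs the worker briefly and returns how many ticks it performed.
// The worker is always joined before `service` is destroyed.
int runService() {
    Service service;  // declared first, so it is destroyed last
    {
        std::thread worker([&service] { service.run(); });
        JoinGuard guard(worker);  // destroyed before `worker`, so it joins first

        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        service.stop = true;      // plays the role of work.reset()
    }                             // guard joins here; the worker has finished
    return service.ticks.load();
}
```

The point of the design is declaration order: the object the thread uses is declared before the thread and the guard, so destruction runs guard, then thread, then the object, and the worker can never observe a destroyed Service.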