AddressSanitizer identifies std::vector<T>::push_back as the cause of a heap-use-after-free error

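I am trying to debug a program that frequently crashes on startup (it eventually starts after a few attempts). After compiling with ASAN, I get the following trace, which indicates that 9 out of 10 crashes are triggered by std::vector<T>::push_back (note frames #9 and #15 in the two traces below):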

==35520== ERROR: AddressSanitizer: heap-use-after-free on address 0x60520005c37f at pc 0x51cc5c bp 0x7f257ebfc050 sp 0x7f257ebfc048
READ of size 1 at 0x60520005c37f thread T8 (CELOXICA)
    #0 0x51cc5b in MappingData** std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<MappingData*>(MappingData* const*, MappingData* const*, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:372
    #1 0x51c9f5 in MappingData** std::__copy_move_a<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:390
    #2 0x51c502 in MappingData** std::__copy_move_a2<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:428
    #3 0x51b916 in MappingData** std::copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:460
    #4 0x519852 in MappingData** std::__uninitialized_copy<true>::__uninit_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:93
    #5 0x5158d6 in MappingData** std::uninitialized_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:117
    #6 0x511aeb in MappingData** std::__uninitialized_copy_a<MappingData**, MappingData**, MappingData*>(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:258
    #7 0x50df13 in MappingData** std::__uninitialized_move_if_noexcept_a<MappingData**, MappingData**, std::allocator<MappingData*> >(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:281
    #8 0x509202 in std::vector<MappingData*, std::allocator<MappingData*> >::_M_insert_aux(__gnu_cxx::__normal_iterator<MappingData**, std::vector<MappingData*, std::allocator<MappingData*> > >, MappingData* const&) /home/olumide/4.8.5/include/c++/4.8.5/bits/vector.tcc:362
    #9 0x5060ac in std::vector<MappingData*, std::allocator<MappingData*> >::push_back(MappingData* const&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_vector.h:913
    #10 0x4f1270 in Queue::publishMappingData(MappingData*) /home/olumide/repo/source/app/src/framework/queue.cpp:149
    #11 0x7f258c449cd7 in Manager::communicationThread() (/home/fmeprod/apps/current/celoxica.so+0x2bcd7)
    #12 0x7f25995f9e82 in thread_proxy (/home/repo/boost/boost_1_56_x64/lib/libboost_thread.so.1.56.0+0x10e82)
    #13 0x7f2596248b87 in __asan::AsanThread::ThreadStart() /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_thread.cc:99
    #14 0x331ac07aa0 in start_thread (/lib64/libpthread.so.0+0x331ac07aa0)
    #15 0x331a8e8bcc in clone (/lib64/libc.so.6+0x331a8e8bcc)
0x60520005c37f is located 2047 bytes inside of 2048-byte region [0x60520005bb80,0x60520005c380)
==35520== AddressSanitizer CHECK failed: ../../../../libsanitizer/asan/asan_allocator2.cc:216 "((id)) != (0)" (0x0, 0x0)
    #0 0x7f25962423dd in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_rtl.cc:60
    #1 0x7f2596249123 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/sanitizer_common/../../../../libsanitizer/sanitizer_common/sanitizer_common.cc:57
    #2 0x7f25962356ab in __asan::GetStackTraceFromId(unsigned int, __sanitizer::StackTrace*) /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_allocator2.cc:216
    #3 0x7f2596246e7a in __asan::DescribeHeapAddress(unsigned long, unsigned long) /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_report.cc:342
    #4 0x7f2596247f61 in __asan_report_error /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_report.cc:693
    #5 0x7f2596242763 in __asan_report_load1 /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_rtl.cc:226
    #6 0x51cc5b in MappingData** std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<MappingData*>(MappingData* const*, MappingData* const*, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:372
    #7 0x51c9f5 in MappingData** std::__copy_move_a<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:390
    #8 0x51c502 in MappingData** std::__copy_move_a2<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:428
    #9 0x51b916 in MappingData** std::copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:460
    #10 0x519852 in MappingData** std::__uninitialized_copy<true>::__uninit_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:93
    #11 0x5158d6 in MappingData** std::uninitialized_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:117
    #12 0x511aeb in MappingData** std::__uninitialized_copy_a<MappingData**, MappingData**, MappingData*>(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:258
    #13 0x50df13 in MappingData** std::__uninitialized_move_if_noexcept_a<MappingData**, MappingData**, std::allocator<MappingData*> >(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:281
    #14 0x509202 in std::vector<MappingData*, std::allocator<MappingData*> >::_M_insert_aux(__gnu_cxx::__normal_iterator<MappingData**, std::vector<MappingData*, std::allocator<MappingData*> > >, MappingData* const&) /home/olumide/4.8.5/include/c++/4.8.5/bits/vector.tcc:362
    #15 0x5060ac in std::vector<MappingData*, std::allocator<MappingData*> >::push_back(MappingData* const&) /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_vector.h:913
    #16 0x4f1270 in Queue::publishMappingData(MappingData*) /home/olumide/repo/source/app/src/framework/queue.cpp:149
    #17 0x7f258c449cd7 in Manager::communicationThread() (/home/fmeprod/apps/current/celoxica.so+0x2bcd7)
    #18 0x7f25995f9e82 in thread_proxy (/home/repo/boost/boost_1_56_x64/lib/libboost_thread.so.1.56.0+0x10e82)
    #19 0x7f2596248b87 in __asan::AsanThread::ThreadStart() /home/olumide/tmp/build/gcc-4.8.5/gcc-build-4.8.5/x86_64-unknown-linux-gnu/libsanitizer/asan/../../../../libsanitizer/asan/asan_thread.cc:99
    #20 0x331ac07aa0 in start_thread (/lib64/libpthread.so.0+0x331ac07aa0)
    #21 0x331a8e8bcc in clone (/lib64/libc.so.6+0x331a8e8bcc)

I cannot post the code because it is too large and is my employer's property. Nevertheless, the essence of the relevant part of the code is:

// Thread 1
void Manager::communicationThread()
{
    MappingData* data = new MappingData(...);
    ...
    m_queue.publishMappingData( data ); // m_queue is available to all threads
    // data is not referenced or deallocated
}

void Queue::publishMappingData(MappingData* data)
{
    ...
    // m_buffer is a member of type std::vector<MappingData*>
    m_buffer.push_back( data );
    // contents of m_buffer are ONLY deallocated on shutdown
}

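What is odd is that:

  1. The pointer created by thread 1 is neither deleted nor accessed by thread 1 after it is created.
  2. The pointers stored in m_buffer by thread 2 are not deleted until the application shuts down.

The remaining 1 in 10 crashes occurs while iterating over the contents of the same m_buffer object, as shown below: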

// Thread 3
void Transaction::completion()
{
    ...
    m_queue.publishStatus();  // m_queue is available to all threads
    ...
}

void Queue::publishStatus()
{
    ...
    for( int i = 0; i < m_buffer.size(); ++i )
    {
        ... new StatusCode( m_buffer[i]->m_id ); // crashes here
        ...                                      // m_id is a member of MappingData
    }
}

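I know the likelihood of a bug in the standard library is essentially 0%, but I don't know how to proceed. The only thing I could think of was the difference in pointer widths from frame #10 of the trace onward. I thought this was due to an incompatibility between libraries, but I used the Linux file utility to check that the application and all shared objects are 64-bit. (They are.) (The apparent difference in pointer widths from frame #10 onward is simply because the stack trace omits leading zeros in hexadecimal addresses.)

Update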

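Fearing that AddressSanitizer might be crying wolf, I decided to go back to compiling without ASAN and to debug with gdb. I also built the application with both gcc 4.4.7 and 4.8.5 (I know they are ancient, but these are the only compilers we can use right now, and they have worked well until now). Both binaries produce traces similar to the ASAN build: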

gcc 4.4.7

#0  _wordcopy_fwd_aligned (dstp=140736214036472, srcp=140735609323312, len=75535088) at wordcopy.c:101
#1  0x000000331a8839d2 in memmove (dest=0x7fffb40b5fc0, src=<value optimized out>, len=604284864) at memmove.c:73
#2  0x0000000000481d86 in __copy_m<MappingData*> (this=0x7116e0, __position=, __x=<value optimized out>) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algobase.h:378
#3  __copy_move_a<false, MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algobase.h:397
#4  __copy_move_a2<false, MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algobase.h:436
#5  copy<MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_algobase.h:468
#6  uninitialized_copy<MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_uninitialized.h:92
#7  uninitialized_copy<MappingData**, MappingData**> (this=0x7116e0, __position=, __x=<value optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_uninitialized.h:116
#8  __uninitialized_copy_a<MappingData**, MappingData**, MappingData*> (this=0x7116e0, __position=, __x=<value optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_uninitialized.h:256
#9  __uninitialized_move_a<MappingData**, MappingData**, std::allocator<MappingData*> > (this=0x7116e0, __position=, __x=<value optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_uninitialized.h:266
#10 std::vector<MappingData*, std::allocator<MappingData*> >::_M_insert_aux (this=0x7116e0, __position=, __x=<value optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/vector.tcc:338
#11 0x0000000000472e91 in push_back (this=0x711590, data=0x7fffb40b5d50) at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/bits/stl_vector.h:741
#12 Queue::publishMappingData (this=0x711590, data=0x7fffb40b5d50) at src/framework/queue.cpp:149

gcc 4.8.5

#0  0x000000331a889e1a in _wordcopy_bwd_aligned (dstp=140736349684976, srcp=140736348970864, len=84044416) at wordcopy.c:293
#1  0x000000331a8839ba in memmove (dest=0x7fff940df128, src=<value optimized out>, len=672355336) at memmove.c:99
#2  0x00000000004bd9b8 in MappingData** std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<MappingData*>(MappingData* const*, MappingData* const*, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:372
#3  0x00000000004bd87e in MappingData** std::__copy_move_a<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:390
#4  0x00000000004bd5ac in MappingData** std::__copy_move_a2<false, MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:428
#5  0x00000000004bcf91 in MappingData** std::copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) ()
    at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_algobase.h:460
#6  0x00000000004bbeb7 in MappingData** std::__uninitialized_copy<true>::__uninit_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:93
#7  0x00000000004ba28d in MappingData** std::uninitialized_copy<MappingData**, MappingData**>(MappingData**, MappingData**, MappingData**) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:117
#8  0x00000000004b80dd in MappingData** std::__uninitialized_copy_a<MappingData**, MappingData**, MappingData*>(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:258
#9  0x00000000004b5cfc in MappingData** std::__uninitialized_move_if_noexcept_a<MappingData**, MappingData**, std::allocator<MappingData*> >(MappingData**, MappingData**, MappingData**, std::allocator<MappingData*>&) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_uninitialized.h:281
#10 0x00000000004b3239 in std::vector<MappingData*, std::allocator<MappingData*> >::_M_insert_aux(__gnu_cxx::__normal_iterator<MappingData**, std::vector<MappingData*, std::allocator<MappingData*> > >, MappingData* const&) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/vector.tcc:369
#11 0x00000000004b1398 in std::vector<MappingData*, std::allocator<MappingData*> >::push_back(MappingData* const&) () at /home/olumide/4.8.5/include/c++/4.8.5/bits/stl_vector.h:913
#12 0x00000000004a4e43 in Queue::publishMappingData(MappingData*) () at src/framework/queue.cpp:149

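What strikes me is that the len passed to _wordcopy_bwd_aligned (gcc 4.8.5) and _wordcopy_fwd_aligned (gcc 4.4.7) is in the region of 75 to 84 million, and the len passed to memmove is over 600 million in both cases! (Recall that the vector stores pointers.)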

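The length passed to the glibc functions is computed by std::__copy_move as the difference between the __last and __first pointers, which are ultimately passed to it by std::vector<_Tp,_Alloc>::_M_insert_aux. A quick inspection of that member function template shows that the crashing branch is taken because this->_M_impl._M_finish == this->_M_impl._M_end_of_storage, i.e. the vector has run out of capacity and has to reallocate and relocate its contents. But that is what vectors are designed to do. I don't know whether the reasoning above is correct, but this is as far as I have got.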

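Some tips to start tackling this:

  • Use unique_ptr instead of raw new.

  • Give the threads IDs and assert that these functions are really called from the threads you expect. If one thread pushes back onto m_buffer and the new item would exceed the vector's current capacity, the vector may reallocate all of its items into different memory, which invalidates any iterators currently held.

  • This doesn't make much sense to me; shouldn't it be the vector's size? It is incrementing an int towards a pointer here and could go out of bounds:

    for(int i = 0; i < m_buffer; ++i)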
    

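I don't see any thread synchronization primitives. What is probably happening is that m_buffer.push_back triggers a reallocation in one thread while another thread that is reading the vector still accesses the old memory region from before the reallocation and copy. In other words, this is not about the MappingData* pointers themselves but about the memory region inside the vector that stores those pointers. When the vector reaches its current capacity, thread A allocates a new region and frees the old one. Thread B accesses m_buffer[i] while thread A is inside m_buffer.push_back() and crashes, because the region it is reading has already been freed.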