Tracing a segmentation fault in a 3rd party library: cv::ImageCodecInitializer destructor crashes
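We are developing a framework that directly uses mrpt-1.9, which in turn uses OpenCV 2.4.
We are writing unit tests, and when a test exits (i.e. during cleanup) it segfaults with an OpenCV error in cv::String::deallocate().
What I have tried:
Running valgrind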
==26159== Conditional jump or move depends on uninitialised value(s)
==26159== at 0x7DB7F5: cv::String::deallocate() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0xAF9FB0: cv::BmpEncoder::~BmpEncoder() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0xAF9FF8: cv::BmpEncoder::~BmpEncoder() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0x935AF65: cv::ImageCodecInitializer::~ImageCodecInitializer() (in /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9)
==26159== by 0x807A369: __cxa_finalize (cxa_finalize.c:56)
==26159== by 0x9355B52: ??? (in /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9)
==26159== by 0x4010DE6: _dl_fini (dl-fini.c:235)
==26159== by 0x8079FF7: __run_exit_handlers (exit.c:82)
==26159== by 0x807A044: exit (exit.c:104)
==26159== by 0x8060836: (below main) (libc-start.c:325)
==26159==
==26159== Invalid read of size 4
==26159== at 0x7DB7FB: cv::String::deallocate() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0xAF9FB9: cv::BmpEncoder::~BmpEncoder() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0xAF9FF8: cv::BmpEncoder::~BmpEncoder() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0x935AF65: cv::ImageCodecInitializer::~ImageCodecInitializer() (in /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9)
==26159== by 0x807A369: __cxa_finalize (cxa_finalize.c:56)
==26159== by 0x9355B52: ??? (in /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9)
==26159== by 0x4010DE6: _dl_fini (dl-fini.c:235)
==26159== by 0x8079FF7: __run_exit_handlers (exit.c:82)
==26159== by 0x807A044: exit (exit.c:104)
==26159== by 0x8060836: (below main) (libc-start.c:325)
==26159== Address 0x1a is not stack'd, malloc'd or (recently) free'd
==26159==
==26159==
==26159== Process terminating with default action of signal 11 (SIGSEGV)
==26159== Access not within mapped region at address 0x1A
==26159== at 0x7DB7FB: cv::String::deallocate() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0xAF9FB9: cv::BmpEncoder::~BmpEncoder() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0xAF9FF8: cv::BmpEncoder::~BmpEncoder() (in /home/alex/codez/robot_platform/build/test_slam)
==26159== by 0x935AF65: cv::ImageCodecInitializer::~ImageCodecInitializer() (in /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9)
==26159== by 0x807A369: __cxa_finalize (cxa_finalize.c:56)
==26159== by 0x9355B52: ??? (in /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9)
==26159== by 0x4010DE6: _dl_fini (dl-fini.c:235)
==26159== by 0x8079FF7: __run_exit_handlers (exit.c:82)
==26159== by 0x807A044: exit (exit.c:104)
==26159== by 0x8060836: (below main) (libc-start.c:325)
==26159== If you believe this happened as a result of a stack
==26159== overflow in your program's main thread (unlikely but
==26159== possible), you can try to increase the size of the
==26159== main thread stack using the --main-stacksize= flag.
==26159== The main thread stack size used in this run was 8388608.
==26159==
==26159== HEAP SUMMARY:
==26159== in use at exit: 286,067 bytes in 1,147 blocks
==26159== total heap usage: 7,469 allocs, 6,322 frees, 1,912,969 bytes allocated
==26159==
==26159== LEAK SUMMARY:
==26159== definitely lost: 0 bytes in 0 blocks
==26159== indirectly lost: 0 bytes in 0 blocks
==26159== possibly lost: 2,299 bytes in 27 blocks
==26159== still reachable: 283,768 bytes in 1,120 blocks
==26159== of which reachable via heuristic:
==26159== newarray : 1,536 bytes in 16 blocks
==26159== suppressed: 0 bytes in 0 blocks
==26159== Rerun with --leak-check=full to see details of leaked memory
==26159==
==26159== For counts of detected and suppressed errors, rerun with: -v
==26159== Use --track-origins=yes to see where uninitialised values come from
==26159== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
As far as I can tell, either we are calling an MRPT function incorrectly, or MRPT itself has a bug.
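One detail in the trace is easy to miss: valgrind attributes each frame to a binary via the "(in ...)" suffix, and here cv::BmpEncoder::~BmpEncoder() is reported inside test_slam while its caller, cv::ImageCodecInitializer::~ImageCodecInitializer(), is reported inside libopencv_highgui.so.2.4.9. The following is a minimal sketch for pulling that information out of a frame line. The helper names frame_object, crosses_binary and check_trace_frames are made up for illustration and are not part of valgrind or any library.

```cpp
#include <cassert>
#include <string>

// Extract the binary that valgrind attributes a frame to, i.e. the path
// inside "(in /path/to/object)". Returns an empty string when the frame
// carries source information instead, e.g. "(exit.c:104)".
std::string frame_object(const std::string& frame) {
    const std::string marker = "(in ";
    const std::string::size_type start = frame.rfind(marker);
    if (start == std::string::npos) return "";
    const std::string::size_type end = frame.rfind(')');
    if (end == std::string::npos || end < start) return "";
    const std::string::size_type begin = start + marker.size();
    return frame.substr(begin, end - begin);
}

// True when two adjacent frames are attributed to different binaries.
bool crosses_binary(const std::string& callee, const std::string& caller) {
    const std::string a = frame_object(callee);
    const std::string b = frame_object(caller);
    return !a.empty() && !b.empty() && a != b;
}

// Self-check on two adjacent frames copied from the trace above.
void check_trace_frames() {
    const std::string callee =
        "==26159== by 0xAF9FF8: cv::BmpEncoder::~BmpEncoder() "
        "(in /home/alex/codez/robot_platform/build/test_slam)";
    const std::string caller =
        "==26159== by 0x935AF65: cv::ImageCodecInitializer::~ImageCodecInitializer() "
        "(in /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9)";
    assert(frame_object(callee) ==
           "/home/alex/codez/robot_platform/build/test_slam");
    assert(crosses_binary(callee, caller));
}
```

Calling check_trace_frames() from a main() passes silently. A destructor in the shared library calling into code that lives in the executable suggests that two copies of the same OpenCV symbols may be in play, although the trace alone does not prove it.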
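Running it with gdb:
I have been trying to debug this in gdb, but all I get is a backtrace, with no indication of which part of our code is responsible. Since it seems to happen after main exits, it is really confusing.
What is worse, the class we are constructing (which doesn't actually do anything) does not contain any MRPT classes or objects, so I am guessing this is in the MRPT libraries and not in our framework.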
Thread 1 "debug" received signal SIGSEGV, Segmentation fault.
0x00000000005b569b in cv::String::deallocate() ()
(gdb) bt
#0 0x00000000005b569b in cv::String::deallocate() ()
#1 0x000000000089969a in cv::BmpEncoder::~BmpEncoder() ()
#2 0x00000000008996d9 in cv::BmpEncoder::~BmpEncoder() [clone .localalias.25] ()
#3 0x00007ffff36a4f66 in cv::ImageCodecInitializer::~ImageCodecInitializer() () from /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4
#4 0x00007ffff484136a in __cxa_finalize (d=0x7ffff38d1000) at cxa_finalize.c:56
#5 0x00007ffff369fb53 in ?? () from /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4
#6 0x00007fffffffd8b0 in ?? ()
#7 0x00007ffff7de7de7 in _dl_fini () at dl-fini.c:235
Backtrace stopped: frame did not save the PC
I set a breakpoint with break cv::ImageCodecInitializer::~ImageCodecInitializer
and got:
Thread 1 "debug" hit Breakpoint 3, 0x0000000000888ad0 in cv::ImageCodecInitializer::~ImageCodecInitializer() ()
(gdb) bt
#0 0x0000000000888ad0 in cv::ImageCodecInitializer::~ImageCodecInitializer() ()
#1 0x00007ffff4840ff8 in __run_exit_handlers (status=0, listp=0x7ffff4bcb5f8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#2 0x00007ffff4841045 in __GI_exit (status=<optimised out>) at exit.c:104
#3 0x00007ffff4827837 in __libc_start_main (main=0x5a4536 <main()>, argc=1, argv=0x7fffffffd9d8, init=<optimised out>, fini=<optimised out>, rtld_fini=<optimised out>, stack_end=0x7fffffffd9c8) at ../csu/libc-start.c:325
#4 0x00000000005a4469 in _start ()
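Searching for opencv-2.4 debug symbols
The application is built with debug symbols, but the system does not seem to have an opencv-2.4 package with debug symbols, so I keep getting "optimised out" warnings.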
libopencv-apps-dev - opencv_apps Robot OS package - development files
libopencv-apps0d - opencv_apps Robot OS package - runtime files
libopencv-calib3d2.4v5 - computer vision Camera Calibration library
libopencv-contrib-dev - development files for libopencv-contrib
libopencv-contrib2.4v5 - computer vision contrib library
libopencv-core2.4v5 - computer vision core library
libopencv-dev - development files for opencv
libopencv-features2d2.4v5 - computer vision Feature Detection and Descriptor Extraction library
libopencv-flann2.4v5 - computer vision Clustering and Search in Multi-Dimensional spaces library
libopencv-gpu-dev - development files for libopencv-gpu2.4v5
libopencv-gpu2.4v5 - computer vision GPU library
libopencv-highgui2.4v5 - computer vision High-level GUI and Media I/O library
libopencv-imgproc2.4v5 - computer vision Image Processing library
libopencv-legacy-dev - development files for libopencv-legacy
libopencv-legacy2.4v5 - computer vision legacy library
libopencv-ml2.4v5 - computer vision Machine Learning library
libopencv-objdetect2.4v5 - computer vision Object Detection library
libopencv-ocl-dev - development files for libopencv-ocl2.4v5
libopencv-ocl2.4v5 - computer vision OpenCL support library
libopencv-photo2.4v5 - computer vision computational photography library
libopencv-stitching2.4v5 - computer vision image stitching library
libopencv-superres2.4v5 - computer vision Super Resolution library
libopencv-ts2.4v5 - computer vision ts library
libopencv-video2.4v5 - computer vision Video analysis library
libopencv-videostab2.4v5 - computer vision video stabilization library
libopencv2.4-java - Java bindings for the computer vision library
libopencv2.4-jni - Java jni library for the computer vision library
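Searching for the actual offending function
I looked at the minimal debug executable we built to try to pinpoint the problem, then searched it for the actual function: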
nm -Ca debug | grep "ImageCodecInitializer"
0000000000889290 W cv::ImageCodecInitializer::ImageCodecInitializer()
0000000000889290 W cv::ImageCodecInitializer::ImageCodecInitializer()
0000000000888ad0 W cv::ImageCodecInitializer::~ImageCodecInitializer()
0000000000888ad0 W cv::ImageCodecInitializer::~ImageCodecInitializer()
Then I tried to find out what GDB knows about these addresses:
(gdb) info line *0x0000000000889290
No line number information available for address 0x889290 <_ZN2cv21ImageCodecInitializerC2Ev>
But I couldn't get anywhere from there, so I looked in GDB for who constructs this object:
#0 0x00007ffff36a6240 in cv::ImageCodecInitializer::ImageCodecInitializer() () from /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4
#1 0x00007ffff369f8f6 in ?? () from /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4
#2 0x00007ffff7de76ba in call_init (l=<optimised out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffd9d8, env=env@entry=0x7fffffffd9e8) at dl-init.c:72
#3 0x00007ffff7de77cb in call_init (env=0x7fffffffd9e8, argv=0x7fffffffd9d8, argc=1, l=<optimised out>) at dl-init.c:30
#4 _dl_init (main_map=0x7ffff7ffe168, argc=1, argv=0x7fffffffd9d8, env=0x7fffffffd9e8) at dl-init.c:120
#5 0x00007ffff7dd7c6a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#6 0x0000000000000001 in ?? ()
#7 0x00007fffffffdda0 in ?? ()
#8 0x0000000000000000 in ?? ()
Again, optimised out.
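The two backtraces fit the usual lifetime of an object with static storage duration: it is constructed while the loader runs initialisers (_dl_init, before main) and destroyed by the exit handlers (exit, __cxa_finalize, after main returns), which is why none of our own frames show up. Below is a minimal sketch of that lifetime. CodecRegistry, lifetime_log, registry_constructed and check_lifetime are hypothetical names used only for illustration; this is not OpenCV code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Function-local static: constructed on first use, so it is ready
// whenever the global object below needs it.
std::vector<std::string>& lifetime_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for a library-level registry object such as
// cv::ImageCodecInitializer.
struct CodecRegistry {
    CodecRegistry() { lifetime_log().push_back("ctor"); }
    // Runs from the exit handlers, after main() has already returned.
    ~CodecRegistry() { std::puts("CodecRegistry destroyed during exit"); }
};

// Static storage duration: constructed before main() starts executing.
static CodecRegistry g_registry;

// True if the registry has been constructed exactly once so far.
bool registry_constructed() {
    const std::vector<std::string>& log = lifetime_log();
    return log.size() == 1 && log[0] == "ctor";
}

// Intended to be called from main(): the constructor has already run,
// even though main() never mentions g_registry.
void check_lifetime() {
    assert(registry_constructed());
}
```

Calling check_lifetime() from an otherwise empty main() passes, and the destructor's message is printed only once main() has returned. A crash in such a destructor therefore says little about what the test body itself did.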
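Searching for the library that uses the offending function
The function is in libopencv_highgui.so.2.4, so I guessed that one of the MRPT libraries is using it. I went through the MRPT libraries we link against and found it: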
readelf -d debug
Dynamic section at offset 0x2b49bb0 contains 41 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libboost_system.so.1.58.0]
0x0000000000000001 (NEEDED) Shared library: [libboost_filesystem.so.1.58.0]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libdl.so.2]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libmrpt-base.so.1.9]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libjpeg.so.8]
0x0000000000000001 (NEEDED) Shared library: [libpng12.so.0]
0x0000000000000001 (NEEDED) Shared library: [libtiff.so.5]
0x0000000000000001 (NEEDED) Shared library: [libjasper.so.1]
0x0000000000000001 (NEEDED) Shared library: [libz.so.1]
0x0000000000000001 (NEEDED) Shared library: [libIlmImf-2_2.so.22]
0x0000000000000001 (NEEDED) Shared library: [libHalf.so.12]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
So I located the library:
sudo ldconfig -p | grep "libmrpt-base.so.1.9"
libmrpt-base.so.1.9 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libmrpt-base.so.1.9
Then:
readelf -d /usr/lib/x86_64-linux-gnu/libmrpt-base.so.1.9
Dynamic section at offset 0xa5aea8 contains 37 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libcxsparse.so.3.1.4]
0x0000000000000001 (NEEDED) Shared library: [libwx_baseu-3.0.so.0]
0x0000000000000001 (NEEDED) Shared library: [libwx_gtk2u_core-3.0.so.0]
0x0000000000000001 (NEEDED) Shared library: [libz.so.1]
0x0000000000000001 (NEEDED) Shared library: [libjpeg.so.8]
0x0000000000000001 (NEEDED) Shared library: [libopencv_highgui.so.2.4]
0x0000000000000001 (NEEDED) Shared library: [libopencv_imgproc.so.2.4]
0x0000000000000001 (NEEDED) Shared library: [libopencv_core.so.2.4]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000e (SONAME) Library soname: [libmrpt-base.so.1.9]
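I know this is the library causing the problem, because in our own project we use opencv-3.3 and link it statically.
Sadly, the repository we use has no debug symbols for MRPT either: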
libmrpt-base1.9 - Mobile Robot Programming Toolkit - base library
libmrpt-detectors1.9 - Mobile Robot Programming Toolkit - detectors library
libmrpt-graphs1.9 - Mobile Robot Programming Toolkit - graphs library
libmrpt-graphslam1.9 - Mobile Robot Programming Toolkit - graphslam library
libmrpt-gui1.9 - Mobile Robot Programming Toolkit - gui library
libmrpt-hmtslam1.9 - Mobile Robot Programming Toolkit - hmtslam library
libmrpt-hwdrivers1.9 - Mobile Robot Programming Toolkit - hwdrivers library
libmrpt-kinematics1.9 - Mobile Robot Programming Toolkit - kinematics library
libmrpt-maps1.9 - Mobile Robot Programming Toolkit - maps library
libmrpt-nav1.9 - Mobile Robot Programming Toolkit - nav library
libmrpt-obs1.9 - Mobile Robot Programming Toolkit - obs library
libmrpt-opengl1.9 - Mobile Robot Programming Toolkit - opengl library
libmrpt-slam1.9 - Mobile Robot Programming Toolkit - slam library
libmrpt-tfest1.9 - Mobile Robot Programming Toolkit - tfest library
libmrpt-topography1.9 - Mobile Robot Programming Toolkit - topography library
libmrpt-vision1.9 - Mobile Robot Programming Toolkit - vision library
libmrpt-comms1.9 - Mobile Robot Programming Toolkit - comms library
Even worse:
nm -C libmrpt-base.so
nm: libmrpt-base.so: no symbols
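This is where the journey ends.
What are my options?
- Use another version of mrpt?
- Compile mrpt with debug symbols?
- Compile opencv-2.4 with debug symbols?
Any help, hints, or tips are much appreciated.
If this question is too localised to meet SO standards, please leave a comment and I will update it.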
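My first guess is that you are running into this because you are using two OpenCV versions at the same time...
Try building mrpt from source, telling CMake to use the same OpenCV version as your main project.
mrpt-base does not directly use anything from highgui (though... it does link against it! This should be fixed, for sure), so I suspect the error is related to the initialisation of static variables in the OpenCV modules and the linker getting confused...
Cheers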
Not really an answer, but comments are no good for formatting code. The latest OpenCV on GitHub has the following source
void cv::String::deallocate()
{
    int* data = (int*)cstr_;
    len_ = 0;
    cstr_ = 0;
    if(data && 1 == CV_XADD(data-1, -1))
    {
        cv::fastFree(data-1);
    }
}
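(possibly newer than your version).
It looks like this stores the string with a reference count in the 4 bytes immediately before the characters, followed by the nul-terminated string. The if condition checks that the pointer is not NULL, then atomically decrements the reference count and frees the memory when the value before the decrement was 1, i.e. when the count drops to zero.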
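To make that layout concrete, here is a minimal sketch of a reference-counted buffer with the counter stored just before the characters. It is an illustration under my own assumptions, not OpenCV's implementation: shared_str_create, shared_str_count, shared_str_addref, shared_str_release and check_release_semantics are hypothetical names, and std::atomic<int> stands in for CV_XADD.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Layout of one block: [RefCount][characters...]['\0'].
// Callers hold a pointer to the characters, not to the block start.
using RefCount = std::atomic<int>;

// Allocate a block holding a copy of `text`, with one initial owner.
char* shared_str_create(const char* text) {
    const std::size_t len = std::strlen(text);
    void* block = std::malloc(sizeof(RefCount) + len + 1);
    if (!block) throw std::bad_alloc();
    new (block) RefCount(1);
    char* chars = static_cast<char*>(block) + sizeof(RefCount);
    std::memcpy(chars, text, len + 1);  // copies the trailing '\0' too
    return chars;
}

// The counter lives immediately before the characters (like data-1 above).
RefCount* shared_str_count(char* chars) {
    return reinterpret_cast<RefCount*>(chars - sizeof(RefCount));
}

// Register one more owner of the buffer.
void shared_str_addref(char* chars) {
    shared_str_count(chars)->fetch_add(1);
}

// Drop one owner. Returns true if this call freed the buffer.
bool shared_str_release(char* chars) {
    if (!chars) return false;
    RefCount* count = shared_str_count(chars);
    // fetch_sub returns the value held *before* the decrement.
    if (count->fetch_sub(1) == 1) {
        std::free(count);  // the counter sits at the start of the block
        return true;
    }
    return false;
}

// Two owners: only the second release frees the buffer.
void check_release_semantics() {
    char* s = shared_str_create("bmp");
    shared_str_addref(s);
    const bool freed_first = shared_str_release(s);   // 2 -> 1
    const bool still_readable = std::strcmp(s, "bmp") == 0;
    const bool freed_second = shared_str_release(s);  // 1 -> 0
    assert(!freed_first && still_readable && freed_second);
}
```

The sketch also shows why a bad pointer fails the way it does: the counter is looked up at the character pointer minus 4 bytes, so a garbage pointer of, say, 0x1E would produce an access at 0x1A. That would be consistent with the faulting address in the valgrind output, but it is only a guess.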