Boost Property Tree fails to retrieve simple JSON in multi-threaded context

I'm trying to parse a simple JSON string with Boost.PropertyTree in my C/C++ application.

{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}

Here is how I've set it up in my multi-threaded C/C++ application (the JSON string is hard-coded to demonstrate the problem).

ParseJson.cpp

#ifdef __cplusplus
extern "C"
{
#endif

#include "ParseJson.hpp"

#ifdef __cplusplus
}
#endif

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

extern "C" MyStruct * const parseJsonMessage(char * jsonMessage, unsigned int const messageLength) {
    MyStruct * myStruct = new MyStruct();
    // Create empty property tree object.
    ptree tree;

    if (myStruct != nullptr) {
        try {
            // Create an istringstream from the JSON message.
            std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");   // doesn't work
            std::istringstream isStreamJson(jsonMessageString);

            // Parse the JSON into the property tree.
            std::cout << "Reading JSON ..." << jsonMessageString << "...";
            read_json(isStreamJson, tree);
            std::cout << " Done!" << std::endl;

            // Get the values from the property tree.
            printf("version: %d\n", tree.get<int>("header.version"));
            printf("source: %d\n", tree.get<int>("header.source"));
            printf("coffee: %s\n", tree.get<std::string>("coffee").c_str());
        }
        catch (boost::property_tree::ptree_bad_path badPathException) {
            std::cout << "Exception caught for bad path: " << badPathException.what() << std::endl;
            return nullptr;
        }
        catch (boost::property_tree::ptree_bad_data badDataException) {
            std::cout << "Exception caught for bad data: " << badDataException.what() << std::endl;
            return nullptr;
        }
        catch (std::exception exception) {
            std::cout << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
            return nullptr;
        }
    }
    return myStruct;
}

The read_json() call appears to complete, but the get() calls that retrieve the parsed data from the property tree fail:

Reading JSON ...{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}... Done!
Exception caught for bad path: No such node (header.version)

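I'm using Boost 1.53 on RHEL 7 (the compiler is gcc/g++ version 4.8.5). I've tried the two suggestions mentioned in this post related to Boost.PropertyTree and multi-threading. I've defined BOOST_SPIRIT_THREADSAFE in the project's global compile definitions. I also tried the atomic swap solution suggested in that post. Neither had any effect on the symptoms.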

Strangely, I can use other public methods of Boost.PropertyTree to fetch the values manually:

std::cout << "front.key: " << tree.front().first << std::endl;
std::cout << "front.front.key: " << tree.front().second.front().first << std::endl;
std::cout << "front.front.value: " << tree.front().second.front().second.get_value_optional<std::string>() << std::endl;

This shows that the JSON was actually parsed:

front.key: header
front.front.key: version
front.front.value:  42

Note that I had to use std::string to get the header.version value, because calling get_value_optional<int>() also crashes.

However, this manual approach does not scale; my application needs to accept several more complex JSON structures.

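When I try more complex JSON strings, they are also parsed successfully, but accessing the values with get() fails as well, this time crashing the program. Here is one of the GDB backtraces I pulled from a crash, but I'm not familiar enough with Boost to get anything useful out of it: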

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffebfff700 (LWP 7176)]
0x00007ffff5aa8200 in std::locale::locale(std::locale const&) () from /lib64/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install boost-system-1.53.0-28.el7.x86_64 boost-thread-1.53.0-28.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_7.2.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-10.el7.x86_64 libcom_err-1.42.9-16.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 systemd-libs-219-67.el7_7.2.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x00007ffff5aa8200 in std::locale::locale(std::locale const&) () from /lib64/libstdc++.so.6
#1  0x00007ffff5ab6051 in std::basic_ios<char, std::char_traits<char> >::imbue(std::locale const&) () from /lib64/libstdc++.so.6
#2  0x000000000041e322 in boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int>::get_value(std::string const&) ()
#3  0x000000000041c5b2 in boost::optional<int> boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::get_value_optional<int, boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int> >(boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int>) const ()
#4  0x000000000041aa61 in boost::enable_if<boost::property_tree::detail::is_translator<boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int> >, int>::type boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::get_value<int, boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int> >(boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int>) const ()
#5  0x000000000041985d in int boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::get_value<int>() const ()
#6  0x0000000000418673 in int boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::get<int>(boost::property_tree::string_path<std::string, boost::property_tree::id_translator<std::string> > const&) const ()
#7  0x0000000000414f4a in parseJsonMessage ()
#8  0x000000000040d8cd in ProcessThread () at ../../src/Processing.c:906
#9  0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff55538cd in clone () from /lib64/libc.so.6
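
For readers unfamiliar with frames #1 and #2: get<int>() converts the stored string through a stream translator, which roughly means imbuing a string stream with a locale and extracting the value from it. The following is a minimal sketch of that idea only; toInt is a hypothetical name and this is not Boost's actual implementation.

```cpp
#include <istream>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical stand-in for what a stream-based translator does:
// parse `text` as an int using locale `loc`; return false on any failure.
bool toInt(std::string const& text, int& out, std::locale const& loc = std::locale()) {
    std::istringstream iss(text);
    iss.imbue(loc);                 // the step that appears as frame #1 in the backtrace
    int value = 0;
    iss >> value;
    if (!iss.eof())
        iss >> std::ws;             // tolerate trailing whitespace
    // Reject the input if extraction failed or unparsed characters remain.
    if (iss.fail() || iss.get() != std::char_traits<char>::eof())
        return false;
    out = value;
    return true;
}
```

In a healthy process none of these steps should crash, so a segfault while copying a std::locale hints that some object involved was already corrupted before the call.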

FWIW, I tried putting this code into a simple (single-threaded) main.cpp:

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

int main(int numArgs, char * const * const args) {
    
    ptree tree;

    try {
        // Create an istringstream from the JSON message.
        std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");
        std::istringstream isStreamJson(jsonMessageString);

        // Parse the JSON into the property tree.
        std::cout << "Reading JSON..." << jsonMessageString << "...";
        read_json(isStreamJson, tree);
        std::cout << " Done!" << std::endl;
        // Print what we parsed.
        std::cout << "version: " << tree.get<int>("header.version") << std::endl;
        std::cout << "source: " << tree.get<int>("header.source") << std::endl;
        std::cout << "coffee: " << tree.get<std::string>("coffee") << std::endl;
    }
    catch (boost::property_tree::ptree_bad_path badPathException) {
        std::cout << "Exception caught for bad path: " << badPathException.what() << std::endl;
        return -1;
    }
    catch (boost::property_tree::ptree_bad_data badDataException) {
        std::cout << "Exception caught for bad data: " << badDataException.what() << std::endl;
        return -1;
    }
    catch (std::exception exception) {
        std::cout << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
        return -1;
    }
    std::cout << "Program completed!" << std::endl;
    return 0;
}

This code works fine:

bash-4.2$ g++ -std=c++11 main.cpp -o main.exe
bash-4.2$ ./main.exe 
Reading JSON...{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}... Done!
version: 42
source: 1
coffee: colombian
Program completed!

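So why doesn't the Boost.PropertyTree get() method work in the multi-threaded application? Could the mix of C and C++ code in the application be causing the problem? I see that my particular compiler version (GCC 4.8.5) has not been explicitly verified for use with this Boost library... could this be a compiler problem? Or is there a problem with Boost 1.53 itself?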


Update based on the answer provided:

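Admittedly, the original code of my parseJsonMessage method was messy (the product of dozens of debugging iterations and of stripping out code unrelated to the problem). Here is a more concise version without the distractions (and possible red herrings):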

#ifdef __cplusplus
extern "C"
{
#endif

#include "DirectIpRev3.hpp"

#ifdef __cplusplus
}
#endif

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

extern "C" void parseJsonMessage2() {
    // Create empty property tree object.
    ptree tree;
    std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");   //doesn't work
    std::istringstream isStreamJson(jsonMessageString);
    try {
        read_json(isStreamJson, tree);
        std::cout << tree.get<int>("header.version") << std::endl;
        std::cout << tree.get<int>("header.source") << std::endl;
        std::cout << tree.get<std::string>("coffee") << std::endl;
    }
    catch (boost::property_tree::ptree_bad_path const & badPathException) {
        std::cerr << "Exception caught for bad path: " << badPathException.what() << std::endl;
    }
    catch (boost::property_tree::ptree_bad_data const & badDataException) {
        std::cerr << "Exception caught for bad data: " << badDataException.what() << std::endl;
    }
    catch (std::exception const & exception) {
        std::cerr << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
    }
}

Running this condensed function in my multi-threaded program produces an exception:

Exception caught when parsing message into Boost.Property tree: <unspecified file>(1): expected object or array

Without the exception handling, it prints a bit more information:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::property_tree::json_parser::json_parser_error> >'
  what():  <unspecified file>(1): expected object or array

I'm still not quite sure what causes the failure here, but I'm inclined to switch to nlohmann as suggested.

Please don't use Property Tree to "parse" "JSON". See nlohmann or Boost.JSON.

Further:

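  • You use raw new and delete for no apparent good reason
  • You have unused parameters
  • You are catching polymorphic exceptions by value
  • Any exception leads to a memory leak, and a null pointer is returned on allocation failure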
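
To see why catching by value matters, here is a minimal sketch of the slicing it causes. JsonError is a hypothetical type made up for this illustration; it is not part of Boost or of the code in the question.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type, used only for this illustration.
struct JsonError : std::runtime_error {
    JsonError() : std::runtime_error("base message") {}
    // Override what() so that slicing becomes observable.
    char const* what() const noexcept override { return "derived message"; }
};

// Catching by value copies only the std::runtime_error part of the object
// (slicing), so the override in JsonError is lost.
std::string caughtByValue() {
    try {
        throw JsonError();
    } catch (std::runtime_error e) {
        return e.what();
    }
}

// Catching by const reference keeps the dynamic type intact.
std::string caughtByConstRef() {
    try {
        throw JsonError();
    } catch (std::runtime_error const& e) {
        return e.what();
    }
}
```

The by-value handler reports the base-class message, while the by-reference handler reports the message of the type that was actually thrown.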

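Taken together, I'm 99% sure your crash is caused by something else: after memory corruption, Undefined Behaviour tends to show up in an unrelated place (e.g. stack smashing, use-after-delete, out-of-bounds access, etc.).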

Using my crystal ball

  1. A guess: you don't show it, but the struct probably looks like

    typedef struct MyStructT {
        int version;
        int source;
        char const* coffee;
    } MyStruct;
    

    A naive mistake would be to assign coffee the same way you print it:

    myStruct->coffee = tree.get<std::string>("coffee").c_str();
    

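    The "obvious" (?) problem here is that c_str() points to memory owned by the value node and, transitively, by the ptree. When the function returns, that pointer is stale. Oops. UB.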

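  2. You allocate the struct with new (even though, given the extern "C", it is probably POD, so it gives you a false sense of security, because all the members have indeterminate values anyway).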

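    Another naive mistake would be to hand the struct to C code that deallocates it with ::free (as it does with everything that was malloc-ed, right?). That is another potential source of UB.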

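  3. If you "fixed" the first issue, e.g. with strdup, you may run into further problems with memory leaks. Even if you correctly use delete myStruct (or switch to malloc), you have to remember to ::free the string allocated by strdup.
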
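  4. Your API is typical C style (which may be intentional), but it leaves the door wide open for passing a wrong messageLength, which leads to out-of-bounds reads. That becomes more likely given that you don't even use the parameters in your own sample code above.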
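
One way to sidestep points 1 to 3 is to copy the string into memory the struct owns and to expose a matching destroy function, so that allocation and deallocation never get mixed up across the C boundary. This is only a sketch under assumptions: createMyStruct, destroyMyStruct and FreeDeleter are hypothetical names, and the MyStruct layout is the guess from point 1 with a non-const coffee member.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

extern "C" {
typedef struct MyStructT {
    int version;
    int source;
    char* coffee;   // owned by the struct, released by destroyMyStruct()
} MyStruct;
}

// Deleter so that unique_ptr can guard malloc-ed memory on early returns.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// Hypothetical factory: every allocation comes from malloc/calloc, and the
// string is copied, so no pointer into a temporary std::string escapes.
extern "C" MyStruct* createMyStruct(int version, int source,
                                    char const* coffee, unsigned int coffeeLength) {
    if (coffee == nullptr && coffeeLength != 0)
        return nullptr;
    // calloc zero-initializes, so no member is left indeterminate.
    std::unique_ptr<MyStruct, FreeDeleter> s(
        static_cast<MyStruct*>(std::calloc(1, sizeof(MyStruct))));
    if (!s)
        return nullptr;
    s->coffee = static_cast<char*>(std::malloc(static_cast<std::size_t>(coffeeLength) + 1));
    if (s->coffee == nullptr)
        return nullptr;             // unique_ptr frees the struct itself
    if (coffeeLength != 0)
        std::memcpy(s->coffee, coffee, coffeeLength);
    s->coffee[coffeeLength] = '\0';
    s->version = version;
    s->source = source;
    return s.release();             // ownership passes to the caller
}

// The single place that knows how a MyStruct has to be released.
extern "C" void destroyMyStruct(MyStruct* s) {
    if (s == nullptr)
        return;
    std::free(s->coffee);
    std::free(s);
}
```

A caller would pass the data() and size() of a std::string that is still alive at the time of the call, and later call destroyMyStruct() exactly once on the result.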

Multi-threaded stress test

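Here is a multi-threaded stress test on Coliru. It runs roughly 1000 iterations (999 to be exact, since the loop condition is --i) on each of 25 threads.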

Live On Coliru

#ifdef __cplusplus
extern "C"
{
#endif

typedef struct MyStructT {
    int version;
    int source;
    char* coffee;
} MyStruct;

//#include "ParseJson.hpp"

#ifdef __cplusplus
}
#endif

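#include <cstdio>       // ::printf
#include <cstring>      // ::strdup (POSIX)
#include <iostream>
#include <memory>       // std::make_unique
#include <sstream>
#include <string>
#include <string_view>  // std::string_view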

#define BOOST_BIND_GLOBAL_PLACEHOLDERS
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

extern "C" MyStruct* parseJsonMessage(char const* jsonMessage, unsigned int const messageLength) {
    auto myStruct = std::make_unique<MyStruct>(); // make it exception safe
    // Create empty property tree object.
    ptree tree;

    if (myStruct != nullptr) {
        try {
            // Create an istringstream from the JSON message.
            std::istringstream isStreamJson(std::string(jsonMessage, messageLength));

            // Parse the JSON into the property tree.
            //std::cout << "Reading JSON ..." << isStreamJson.str() << "...";
            read_json(isStreamJson, tree);
            //std::cout << " Done!" << std::endl;

            // Get the values from the property tree.
            myStruct->version = tree.get<int>("header.version");
            myStruct->source = tree.get<int>("header.source");
            myStruct->coffee = ::strdup(tree.get<std::string>("coffee").c_str());
            return myStruct.release();
        }
        catch (boost::property_tree::ptree_bad_path const& badPathException) {
            std::cerr << "Exception caught for bad path: " << badPathException.what() << std::endl;
        }
        catch (boost::property_tree::ptree_bad_data const& badDataException) {
            std::cerr << "Exception caught for bad data: " << badDataException.what() << std::endl;
        }
        catch (std::exception const& exception) {
            std::cerr << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
        }
    }
    return nullptr;
}

#include <cstdlib>
#include <string>
#include <thread>
#include <list>

int main() {
    static std::string_view msg = R"({"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"})";

    auto task = [] {
        for (auto i = 1000; --i;) {
            auto s = parseJsonMessage(msg.data(), msg.size());

            ::printf("version: %d\n", s->version);
            ::printf("source: %d\n", s->source);
            ::printf("coffee: %s\n", s->coffee);

            ::free(s->coffee);
            delete s; // not ::free!
        }
    };

    std::list<std::thread> pool;

    for (int i = 0; i < 25; ++i)
        pool.emplace_back(task);

    for (auto& t : pool)
        t.join();
}

Output (sorted and uniq'ed):

  24975 coffee: colombian
  24975 source: 1
  24975 version: 42