Learning Boost.Spirit: parsing INI

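I started learning Boost.Spirit and finished reading the Qi - Writing Parsers section. While reading, everything seemed easy to understand. But when I try to build something myself, I get a lot of errors, because there are so many includes and namespaces and I don't know when to include/use them. As an exercise, I want to write a simple INI parser.

Here is the code (the includes come from one of the examples in the Spirit lib, as does almost everything else):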

#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/phoenix_object.hpp>

#include <iostream>
#include <string>
#include <vector>
#include <map>

namespace client
{
    typedef std::map<std::string, std::string> key_value_map_t;

    struct mini_ini
    {
        std::string name;
        key_value_map_t key_values_map;
    };
} // client

BOOST_FUSION_ADAPT_STRUCT(
    client::mini_ini,
    (std::string, name)
    (client::key_value_map_t, key_values_map)
)

namespace client
{
    namespace qi = boost::spirit::qi;
    namespace ascii = boost::spirit::ascii;
    namespace phoenix = boost::phoenix;

    template <typename Iterator>
    struct ini_grammar : qi::grammar<Iterator, mini_ini(), ascii::space_type>
    {
        ini_grammar() : ini_grammar::base_type(section_, "section")
        {
            using qi::char_;
            using qi::on_error;
            using qi::fail;
            using namespace qi::labels;
            using phoenix::construct;
            using phoenix::val;

            key_ = +char_("a-zA-Z_0-9");
            pair_ = key_ >> '=' >> *char_;
            section_ = '[' >> key_ >> ']' >> '\n' >> *(pair_ >> '\n');

            key_.name("key");
            pair_.name("pair");
            section_.name("section");

            on_error<fail>
            (
                section_
              , std::cout
                    << val("Error! Expecting ")
                    << _4                               // what failed?
                    << val(" here: \"")
                    << construct<std::string>(_3, _2)   // iterators to error-pos, end
                    << val("\"")
                    << std::endl
            );
        }

        qi::rule<Iterator, std::string(), ascii::space_type> key_;
        qi::rule<Iterator, mini_ini(), ascii::space_type> section_;
        qi::rule<Iterator, std::pair<std::string, std::string>(), ascii::space_type> pair_;
    };
} // client

int
main()
{
    std::string storage =
        "[section]\n"
        "key1=val1\n"
        "key2=val2\n";
    client::mini_ini ini;
    typedef client::ini_grammar<std::string::const_iterator> ini_grammar;
    ini_grammar grammar;

    using boost::spirit::ascii::space;
    std::string::const_iterator iter = storage.begin();
    std::string::const_iterator end = storage.end();
    bool r = phrase_parse(iter, end, grammar, space, ini);

    if (r && iter == end)
    {
        std::cout << "-------------------------\n";
        std::cout << "Parsing succeeded\n";
        std::cout << "-------------------------\n";

        return 0;
    }
    else
    {
        std::cout << "-------------------------\n";
        std::cout << "Parsing failed\n";
        std::cout << "-------------------------\n";
        std::cout << std::string(iter, end) << "\n";
        return 1;
    }

    return 0;
}

As you can see, I want to parse the following text into the mini_ini struct:

"[section]"
"key1=val1"
"key2=val2";

The parse fails, and std::string(iter, end) is the complete input string.

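My questions:

Why do I see the failure but never see the on_error handler run?
Do you have any recommendations on how to learn Boost.Spirit? (I understand the documentation well in theory, but in practice I have a lot of WHY???)

Thanks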

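Q. Why do I see the failure but not the on_error handler?

The on_error handler fires only for the rule it is registered on (section_), and only when an expectation point fails.

Your grammar contains no expectation points (it uses only >>, never >).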

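Q. Do you have any recommendations on how to learn Boost.Spirit? (I understand the documentation well in theory, but in practice I have a lot of WHY???)

Just build the parsers you need. Copy good conventions from the documentation and from SO answers; there are plenty of them. As you have seen, quite a few of them contain complete examples of INI parsers with varying levels of error reporting.

Bonus hints:

Do more detailed status reporting: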

bool ok = phrase_parse(iter, end, grammar, space, ini);

if (ok) {
    std::cout << "Parse success\n";
} else {
    std::cout << "Parse failure\n";
}

if (iter != end) {
    std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
}

return ok && (iter==end)? 0 : 1;

Use BOOST_SPIRIT_DEBUG:

#define BOOST_SPIRIT_DEBUG

// and later
BOOST_SPIRIT_DEBUG_NODES((key_)(pair_)(section_))

Prints:

<section_>
  <try>[section]\nkey1=val1\n</try>
  <key_>
    <try>section]\nkey1=val1\nk</try>
    <success>]\nkey1=val1\nkey2=val</success>
    <attributes>[[s, e, c, t, i, o, n]]</attributes>
  </key_>
  <fail/>
</section_>
Parse failure
Remaining unparsed: '[section]
key1=val1
key2=val2
'

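You will notice that the section header is not parsed, because the newline doesn't match. Your skipper (space_type) skips newlines, so the '\n' literal in the grammar can never match. See: Boost spirit skipper issues

Fixing the skipper

When using blank_type as the skipper, you get a successful parse: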

<section_>
  <try>[section]\nkey1=val1\n</try>
  <key_>
    <try>section]\nkey1=val1\nk</try>
    <success>]\nkey1=val1\nkey2=val</success>
    <attributes>[[s, e, c, t, i, o, n]]</attributes>
  </key_>
  <pair_>
    <try>key1=val1\nkey2=val2\n</try>
    <key_>
      <try>key1=val1\nkey2=val2\n</try>
      <success>=val1\nkey2=val2\n</success>
      <attributes>[[k, e, y, 1]]</attributes>
    </key_>
    <success></success>
    <attributes>[[[k, e, y, 1], [v, a, l, 1, 
, k, e, y, 2, =, v, a, l, 2, 
]]]</attributes>
  </pair_>
  <success>key1=val1\nkey2=val2\n</success>
  <attributes>[[[s, e, c, t, i, o, n], []]]</attributes>
</section_>
Parse success
Remaining unparsed: 'key1=val1
key2=val2
'

NOTE: The parse succeeds but doesn't do what you want. This is because *char_ includes newlines. So make that

       pair_ = key_ >> '=' >> *(char_ - qi::eol); // or
       pair_ = key_ >> '=' >> *~char_("\r\n"); // etc

Full code

Live On Coliru

#define BOOST_SPIRIT_DEBUG
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/phoenix_object.hpp>

#include <iostream>
#include <string>
#include <vector>
#include <map>

namespace client
{
    typedef std::map<std::string, std::string> key_value_map_t;

    struct mini_ini
    {
        std::string name;
        key_value_map_t key_values_map;
    };
} // client

BOOST_FUSION_ADAPT_STRUCT(
    client::mini_ini,
    (std::string, name)
    (client::key_value_map_t, key_values_map)
)

namespace client
{
    namespace qi      = boost::spirit::qi;
    namespace ascii   = boost::spirit::ascii;
    namespace phoenix = boost::phoenix;

    template <typename Iterator>
    struct ini_grammar : qi::grammar<Iterator, mini_ini(), ascii::blank_type>
    {
        ini_grammar() : ini_grammar::base_type(section_, "section")
        {
            using qi::char_;
            using qi::on_error;
            using qi::fail;
            using namespace qi::labels;
            using phoenix::construct;
            using phoenix::val;

            key_ = +char_("a-zA-Z_0-9");
            pair_ = key_ >> '=' >> *char_;
            section_ = '[' >> key_ >> ']' >> '\n' >> *(pair_ >> '\n');

            BOOST_SPIRIT_DEBUG_NODES((key_)(pair_)(section_))

            on_error<fail>
            (
                section_
              , std::cout
                    << val("Error! Expecting ")
                    << _4                               // what failed?
                    << val(" here: \"")
                    << construct<std::string>(_3, _2)   // iterators to error-pos, end
                    << val("\"")
                    << std::endl
            );
        }

        qi::rule<Iterator, std::string(), ascii::blank_type> key_;
        qi::rule<Iterator, mini_ini(), ascii::blank_type> section_;
        qi::rule<Iterator, std::pair<std::string, std::string>(), ascii::blank_type> pair_;
    };
} // client

int
main()
{
    std::string storage =
        "[section]\n"
        "key1=val1\n"
        "key2=val2\n";
    client::mini_ini ini;
    typedef client::ini_grammar<std::string::const_iterator> ini_grammar;
    ini_grammar grammar;

    using boost::spirit::ascii::blank;
    std::string::const_iterator iter = storage.begin();
    std::string::const_iterator end = storage.end();
    bool ok = phrase_parse(iter, end, grammar, blank, ini);

    if (ok) {
        std::cout << "Parse success\n";
    } else {
        std::cout << "Parse failure\n";
    }

    if (iter != end) {
        std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
    }

    return ok && (iter==end)? 0 : 1;
}