Parsing null character with boost spirit qi

I am trying to parse a string with Boost.Spirit Qi that has the following form:

"\0help@masonlive.gmu.edu\0test\r\n"

I am using the following grammar. This is the hpp:

class EmailGrammar :
    public boost::spirit::qi::grammar< const boost::uint8_t*,
        boost::tuple< boost::iterator_range< const boost::uint8_t*>,
                      boost::iterator_range< const boost::uint8_t*> >()>
{
public:
    const static EmailGrammar instance;

    EmailGrammar ();    

    /* omitting uninteresting stuff here i.e. constructors and assignment */

private:
    boost::spirit::qi::rule< const boost::uint8_t*,
        boost::tuple<
            boost::iterator_range< const boost::uint8_t*>,
            boost::iterator_range< const boost::uint8_t* > >()> m_start;
};

The cpp for the grammar looks like this:

EmailGrammar::EmailGrammar() :
    EmailGrammar::base_type(m_start),
    m_start()
{
    namespace qi = boost::spirit::qi;
    m_start = 
             (
             qi::lit('\0')
             >> (
                    qi::raw[*(qi::char_ - qi::lit('\0'))]
                )
             >> qi::lit('\0')
             >> (
                    qi::raw[*(qi::char_ - qi::eol)]
                )
             >> qi::eol >> qi::eoi
             );
}

I intend to use this to parse the two strings and split them into two separate iterator ranges.

It is then called like this:

int main()
{
    typedef typename EmailGrammar::start_type::attr_type attr;

    std::string testStr("\0help@masonlive.gmu.edu\0test\r\n");

    // this is not done this way in the real code just as a test
    boost::iterator_range<const boost::uint8_t*> data =
        boost::make_iterator_range(
            reinterpret_cast< const boost::uint8_t* >(testStr.data()),
            reinterpret_cast< const boost::uint8_t* >(testStr.data() + testStr.length()));

    attr exposedAttribute;
    if (boost::spirit::qi::parse(data.begin(),
                                 data.end(),
                                 EmailGrammar::instance,
                                 exposedAttribute))
    {
        std::cout << "success" << std::endl;
    }
}

The problem seems to be in parsing the null terminator. I believe this because when I add debug(m_rule); to the code, I get the following XML output:

<unnamed-rule>
<try></try>
<fail/>
</unnamed-rule>

However, if I explicitly remove the first null terminator, for example, I get this output:

<unnamed-rule>
<try>help@masonlive.gmu.e</try>
<fail/>
</unnamed-rule>

Which leads to the question:

The whole problem very likely originates here:

std::string testStr("\0help@masonlive.gmu.edu\0test\r\n");

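This does not do what you think. It creates an empty string, because the const char* constructor stops at the first NUL character. Instead, specify the length of the raw literal/buffer (30 bytes here, not counting the literal's implicit terminator):

std::string testStr("\0help@masonlive.gmu.edu\0test\r\n", 30);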
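To see the difference between the two constructors concretely, here is a minimal sketch using only the standard library. The function names `from_c_string` and `from_buffer` are made up for illustration and are not part of the original code.

```cpp
#include <cstddef>
#include <string>

// The const char* constructor measures its argument like strlen does,
// so it stops at the first NUL, which here is the very first byte.
inline std::string from_c_string() {
    return std::string("\0help@masonlive.gmu.edu\0test\r\n");
}

// The (pointer, count) constructor copies exactly `count` bytes,
// embedded NULs included. sizeof(raw) counts the implicit terminator
// that the compiler appends to the literal, hence the "- 1".
inline std::string from_buffer() {
    static const char raw[] = "\0help@masonlive.gmu.edu\0test\r\n";
    return std::string(raw, sizeof(raw) - 1);
}
```

Calling `from_c_string().size()` and `from_buffer().size()` side by side shows that the first string is empty while the second keeps both NUL separators.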

Bonus

If you don't want to do the math/counting (and you shouldn't!), make a helper:

template <typename Char, size_t N>
std::string bake(Char const (&p)[N], bool include_terminator = false) {
    return { p, p + N - (include_terminator?0:1) };
}

Which you can use like this:

std::string const testStr = bake("\0help@masonlive.gmu.edu\0test\r\n");
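As a quick sanity check on the helper, the sketch below repeats `bake` and places it next to the standard `s` literal suffix, which is an alternative not mentioned in the original answer and assumes C++14 or later. The name `with_suffix` is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Same helper as above: N is the array size of the literal,
// which includes the implicit terminating NUL.
template <typename Char, std::size_t N>
std::string bake(Char const (&p)[N], bool include_terminator = false) {
    return { p, p + N - (include_terminator ? 0 : 1) };
}

// Assumption: C++14 is available. operator""s receives the literal's
// length from the compiler, so embedded NULs are preserved too.
inline std::string with_suffix() {
    using namespace std::string_literals;
    return "\0help@masonlive.gmu.edu\0test\r\n"s;
}
```

Both approaches should yield the same bytes. `bake` has the advantage of working before C++14 and of optionally keeping the terminator.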

Live On Coliru

#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/boost_tuple.hpp>
namespace qi = boost::spirit::qi;

using It = uint8_t const*;
using Range = boost::iterator_range<It>;
using Attribute = boost::tuple<Range, Range>;

class EmailGrammar : public qi::grammar<It, Attribute()> {
  public:
    const static EmailGrammar instance;

    EmailGrammar() : EmailGrammar::base_type(m_start)
    {
        using namespace qi;

        m_start = 
            '\0' >> raw[*(char_ - '\0')] >> 
            '\0' >> raw[*(char_ - eol)] >> 
            eol >> eoi
            ;

        BOOST_SPIRIT_DEBUG_NODES((m_start))
    }

  private:
    qi::rule<It, Attribute()> m_start;
};

const EmailGrammar EmailGrammar::instance {};

template <typename Char, size_t N>
std::string bake(Char const (&p)[N], bool include_terminator = false) {
    return { p, p + N - (include_terminator?0:1) };
}

int main() {
    std::string const testStr = bake("\0help@masonlive.gmu.edu\0test\r\n");

    It f = reinterpret_cast<It>(testStr.data()),
       l = f + testStr.length();

    Attribute exposedAttribute;
    if (boost::spirit::qi::parse(f, l, EmailGrammar::instance, exposedAttribute)) {
        std::cout << "success" << std::endl;
    }
}

Prints

<m_start>
  <try></try>
  <success></success>
  <attributes>[[[h, e, l, p, @, m, a, s, o, n, l, i, v, e, ., g, m, u, ., e, d, u], [t, e, s, t]]]</attributes>
</m_start>
success