Boost Spirit alternative operator doesn't fill all attribute values

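I'm using Boost Spirit Qi to read real numbers from a file. I'm trying to implement a conditional parser, where the input depends on the first character of the line.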

#include <iostream>
#include <boost/fusion/adapted/struct/adapt_struct.hpp>
#include <boost/spirit/include/qi.hpp>
using namespace std;
namespace qi = boost::spirit::qi;


struct  MyStruct {

   double r1, r2, r3, r4;
   double r5, r6, r7, r8;
};

BOOST_FUSION_ADAPT_STRUCT(
   MyStruct,

   (double, r1), (double, r2), (double, r3), (double, r4),
   (double, r5), (double, r6), (double, r7), (double, r8)
);

int main(int argc, wchar_t* argv[])
{
   string test =
      "A+1.000000000000e+00+2.000000000000e+00+3.000000000000e+00+4.000000000000e+00\r\n"
      "B+5.000000000000e+00+6.000000000000e+00+7.000000000000e+00+8.000000000000e+00\r\n";
   qi::rule<string::const_iterator> CRLF = qi::copy(qi::lit("\r\n"));
   qi::real_parser d19_12;

   MyStruct ms;
   qi::rule<string::const_iterator, MyStruct()> gr =

      qi::lit("A") >> d19_12 >> d19_12 >> d19_12 >> d19_12 >> CRLF
      >> (
         (qi::lit('B') >> d19_12  >> d19_12 >> d19_12 >> d19_12 >> CRLF)
         |
         (qi::lit('C') >> d19_12  >> d19_12 >> d19_12 >> +qi::lit('_') >> qi::attr(0.0) >> CRLF)
         )
      ;
   string::const_iterator f = test.cbegin();
   string::const_iterator e = test.cend();
   bool ret = qi::parse(f, e, gr, ms);

   return ret;
}

Without the 'C' alternative everything works as expected, but adding it makes the parser skip values, and the result is

The expected result is:

Thanks

You can debug rules. So, simplifying the input to "A+1+2+3+4\r\nB+5+6+7+8\r\n" and wrapping the real parser in a rule, this is the debug output:

Live On Coliru

<gr>
  <try>A+1+2+3+4\r\nB+5+6+7+8</try>
  <d19_12>
    <try>+1+2+3+4\r\nB+5+6+7+8\r</try>
    <success>+2+3+4\r\nB+5+6+7+8\r\n</success>
    <attributes>[1]</attributes>
  </d19_12>
  <d19_12>
    <try>+2+3+4\r\nB+5+6+7+8\r\n</try>
    <success>+3+4\r\nB+5+6+7+8\r\n</success>
    <attributes>[2]</attributes>
  </d19_12>
  <d19_12>
    <try>+3+4\r\nB+5+6+7+8\r\n</try>
    <success>+4\r\nB+5+6+7+8\r\n</success>
    <attributes>[3]</attributes>
  </d19_12>
  <d19_12>
    <try>+4\r\nB+5+6+7+8\r\n</try>
    <success>\r\nB+5+6+7+8\r\n</success>
    <attributes>[4]</attributes>
  </d19_12>
  <CRLF>
    <try>\r\nB+5+6+7+8\r\n</try>
    <success>B+5+6+7+8\r\n</success>
    <attributes>[]</attributes>
  </CRLF>
  <d19_12>
    <try>+5+6+7+8\r\n</try>
    <success>+6+7+8\r\n</success>
    <attributes>[5]</attributes>
  </d19_12>
  <d19_12>
    <try>+6+7+8\r\n</try>
    <success>+7+8\r\n</success>
    <attributes>[6]</attributes>
  </d19_12>
  <d19_12>
    <try>+7+8\r\n</try>
    <success>+8\r\n</success>
    <attributes>[7]</attributes>
  </d19_12>
  <d19_12>
    <try>+8\r\n</try>
    <success>\r\n</success>
    <attributes>[8]</attributes>
  </d19_12>
  <CRLF>
    <try>\r\n</try>
    <success></success>
    <attributes>[]</attributes>
  </CRLF>
  <success></success>
  <attributes>[[1, 2, 3, 4, 5, 4.27256e+180, 0, 0]]</attributes>
</gr>
Parsed: (1 2 3 4 5 4.27256e+180 0 0)

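Indeed, it confirms that all the numbers are parsed. So why doesn't the attribute propagation work the way you expect?

My guess is that Qi tries to accommodate a little more in attribute propagation than you expect. The problem is that your AST doesn't match the rule directly: the rule synthesizes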

tup4 := tuple<double, double, double, double>
attribute := tuple<tup4, variant<tup4, tup4> >

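In Qi this does get simplified to tuple<tup4, tup4>, but your AST actually looks like a tup8, which is not the same thing. So when propagating, the rule just does what it thinks is best, which is to assign the first tup4. And then: *shrug*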
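To make the shape mismatch concrete, here is a small sketch that uses std::tuple purely as an analogy for the Fusion sequences Qi works with. The names Tup4, Nested, Tup8 and flatten are illustrative stand-ins, not Spirit or Fusion types, and the sketch says nothing about how Qi decides what to assign; it only shows that two groups of four and one flat group of eight are different types, and that flattening is an explicit step.

```cpp
#include <tuple>
#include <type_traits>

// Illustrative stand-ins for the attribute shapes discussed above.
using Tup4   = std::tuple<double, double, double, double>;
using Nested = std::tuple<Tup4, Tup4>;             // two groups of four
using Tup8   = std::tuple<double, double, double, double,
                          double, double, double, double>; // one flat group

// Two elements versus eight: the shapes do not line up element by element.
static_assert(std::tuple_size<Nested>::value == 2, "Nested has two elements");
static_assert(std::tuple_size<Tup8>::value == 8, "Tup8 has eight elements");
static_assert(!std::is_same<Nested, Tup8>::value, "the types are distinct");

// Going from the nested shape to the flat one has to be spelled out;
// here std::tuple_cat concatenates the two groups.
inline Tup8 flatten(Nested const& n) {
    return std::tuple_cat(std::get<0>(n), std::get<1>(n));
}
```

With std::tuple the conversion is a single tuple_cat call, but nothing performs it implicitly, which is the same kind of gap that exists between the synthesized attribute and a flat struct of eight doubles.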

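Fixing It

The simplest fix is to make your AST match the rules. That probably makes the most sense anyway, because "A", "B" and "C" are likely to carry semantic meaning.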

namespace Ast {
    struct A {
        double r1, r2, r3, r4;
    };
    struct BC {
        double r5, r6, r7, r8;
    };
    struct MyStruct {
        A  a;
        BC bc;
    };

    using boost::fusion::operator<<;
} // namespace Ast

Adapting them:

BOOST_FUSION_ADAPT_STRUCT(Ast::A, r1, r2, r3, r4)
BOOST_FUSION_ADAPT_STRUCT(Ast::BC, r5, r6, r7, r8)
BOOST_FUSION_ADAPT_STRUCT(Ast::MyStruct, a, bc)

Note that, without further changes, this just confirms that automatic attribute propagation is heuristics-based: Coliru: Parsed: ((1 0 0 0) (2 0 0 0)) (oops)

Making the rules match that structure:

qi::rule<It>           CRLF   = "\r\n";
qi::rule<It, double()> d19_12 = qi::double_; // attribute is given as a signature: double()

qi::rule<It, Ast::A()>  A  = "A" >> d19_12 >> d19_12 >> d19_12 >> d19_12; //
qi::rule<It, Ast::BC()> BC =                                              //
    'B' >> d19_12 >> d19_12 >> d19_12 >> d19_12 |                         //
    'C' >> d19_12 >> d19_12 >> d19_12 >> +qi::lit('_') >> qi::attr(0.0);

qi::rule<It, Ast::MyStruct()> gr = A >> CRLF >> BC >> CRLF;

Now everything works: Coliru

Prints

Parsed: ((1 2 3 4) (5 6 7 8))

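Thinking Out of the Box

A lot of this looks like an XY problem to me. A struct of 8 non-descriptive numbers that can have different meanings seems... not what you actually need.

Also, the B/C distinction seems to indicate that you really want an "optional number" rule: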

rule<It>           CRLF   = "\r\n";
rule<It, double()> d19_12 = raw[ //
    double_[_val = _1] |         //
    omit[+char_("_")]            //
][_pass = px::size(_1) == 19];

rule<It, Ast::Tup4()> Tup4 =
    omit[char_("ABC")] >> d19_12 >> d19_12 >> d19_12 >> d19_12;

Note how omit[char_("ABC")] directly reflects my hunch that you are throwing away semantic information in your model.
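For readers who want to see what the d19_12 rule is checking without the Spirit machinery, the following plain C++ sketch approximates it: a field is exactly 19 characters wide and is either a real number or a run of underscores, the latter leaving the value at 0.0. The helper name parse_field and the constant kFieldWidth are hypothetical, and the sketch slices the field first and validates it afterwards, so edge cases will not match the Qi rule exactly.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Field width taken from the sample input ("+1.000000000000e+00" is 19 characters).
constexpr std::size_t kFieldWidth = 19;

// Hypothetical helper, not part of Spirit: reads one fixed-width field at `pos`.
// On success it stores the number (0.0 for an all-underscore field) in `out`,
// advances `pos` past the field and returns true. On failure nothing changes.
inline bool parse_field(std::string const& in, std::size_t& pos, double& out) {
    if (pos > in.size() || in.size() - pos < kFieldWidth)
        return false; // not enough input left for a full field

    std::string const field = in.substr(pos, kFieldWidth);
    double value = 0.0; // stays 0.0 when the field is blank

    if (field.find_first_not_of('_') != std::string::npos) {
        // Not blank, so the entire field has to be consumed as one number.
        // std::strtod is more permissive than qi::double_ (for example it
        // accepts leading whitespace), which is acceptable for a sketch.
        char* end = nullptr;
        value = std::strtod(field.c_str(), &end);
        if (end != field.c_str() + field.size())
            return false;
    }

    out = value;
    pos += kFieldWidth;
    return true;
}
```

Calling parse_field four times after consuming the leading letter corresponds roughly to one line of the input.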

Now the grammar becomes

rule<It, Ast::MyStruct()> gr = Tup4 >> CRLF >> Tup4 >> CRLF;

And indeed, it parses the full input: Coliru

Parsed: ((1.0001 2.0002 3.0003 4.0004) (5.0005 6.0006 7.0007 8.0008))

Simplify! Containers

In fact, I suspect something like this might serve you better:

namespace Ast {
    using Reals = boost::container::static_vector<double, 8>;
} // namespace Ast

Interestingly, containers do enjoy more flexible attribute propagation (with new caveats). You can have something straightforward:

qi::rule<It, Ast::Reals(char const*)> Line =
    qi::omit[qi::char_(_r1)] >> d19_12 >> d19_12 >> d19_12 >> d19_12;

qi::rule<It, Ast::Reals()> gr = //
    Line(+"A") >> CRLF >> Line(+"BC") >> CRLF;
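The unary plus in Line(+"A") may look odd. It is a plain C++ idiom: applied to a string literal it forces array-to-pointer decay, so the argument has exactly the char const* type named in the rule's signature rather than an array type. The sketch below demonstrates only that language behaviour; the helper is_pointer_arg is made up for illustration and is unrelated to Spirit.

```cpp
#include <type_traits>

// A string literal is an lvalue array of const char, including the '\0'.
static_assert(std::is_same<decltype("BC"), char const (&)[3]>::value,
              "the literal is an array of three chars");

// Unary plus triggers array-to-pointer decay and yields a plain pointer.
static_assert(std::is_same<decltype(+"BC"), char const*>::value,
              "+literal has type char const*");

// Illustrative helper: generic code that takes its argument by reference
// sees the array type unless the caller decays it first.
template <typename T>
constexpr bool is_pointer_arg(T const&) {
    return std::is_pointer<T>::value;
}
```

Here is_pointer_arg("BC") is false while is_pointer_arg(+"BC") is true, which is the distinction that matters when a literal passes through template code before it reaches a char const* parameter.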

Let me close with a live example of that: Live On Compiler Explorer¹

//#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/container/static_vector.hpp>
#include <fmt/ranges.h>
#include <iomanip>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;
namespace px = boost::phoenix;

namespace Ast {
    using Reals = boost::container::static_vector<double, 8>;
} // namespace Ast

int main()
{
    using It = std::string::const_iterator;
    using namespace qi::labels;

    qi::rule<It>           CRLF   = "\r\n";
    qi::rule<It, double()> d19_12 = qi::raw[ //
        qi::double_[_val = _1] |             //
        qi::omit[+qi::char_("_")]            //
    ][_pass = px::size(_1) == 19];

    qi::rule<It, Ast::Reals(char const*)> Line =
        qi::omit[qi::char_(_r1)] >> d19_12 >> d19_12 >> d19_12 >> d19_12;

    qi::rule<It, Ast::Reals()> gr = //
        Line(+"A") >> CRLF >> Line(+"BC") >> CRLF;

    BOOST_SPIRIT_DEBUG_NODES((gr)(Line)(d19_12)(CRLF))

    for (std::string const test : {
             "A+1.000100000000e+00+2.000200000000e+00+3.000300000000e+00+4.000400000000e+00\r\n"
             "B+5.000500000000e+00+6.000600000000e+00+7.000700000000e+00+8.000800000000e+00\r\n",
             "A+1.000100000000e+00+2.000200000000e+00+3.000300000000e+00+4.000400000000e+00\r\n"
             "C+5.000500000000e+00+6.000600000000e+00+7.000700000000e+00___________________\r\n",
         }) {
        It f = test.cbegin(), e = test.cend();

        Ast::Reals data;
        if (parse(f, e, gr, data)) {
            fmt::print("Parsed: {}\n", data);
        } else {
            fmt::print("Failed\n");
        }

        if (f != e) {
            std::cout << "Remaining: " << std::quoted(std::string(f, e))
                      << "\n";
        }
    }
}

Prints

Parsed: {1.0001, 2.0002, 3.0003, 4.0004, 5.0005, 6.0006, 7.0007, 8.0008}
Parsed: {1.0001, 2.0002, 3.0003, 4.0004, 5.0005, 6.0006, 7.0007, 0}

¹ I was lazy about output formatting and used libfmt instead of writing my vector-printing cruft again; Coliru doesn't have libfmt (or C++23) yet