Spirit.X3: passing local data to a parser

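The examples in the Boost.Spirit documentation seem to fall into two cases:

1/ Define a parser in a function: semantic actions can access local variables and data, since they are local lambdas. Like push_back here: https://www.boost.org/doc/libs/master/libs/spirit/doc/x3/html/spirit_x3/tutorials/number_list___stuffing_numbers_into_a_std__vector.html

2/ Define a parser in a namespace, like here: https://www.boost.org/doc/libs/1_69_0/libs/spirit/doc/x3/html/spirit_x3/tutorials/minimal.html

which seems to be necessary in order to invoke BOOST_SPIRIT_DEFINE.

My question is: how to combine both (properly, without globals)? My dream API would be to pass some argument to phrase_parse and then do some x3::_arg(ctx), but I couldn't find anything like this.

Here is for instance my parser: for now the actions are writing to std::cerr. What if I wanted to write to a custom std::ostream& instead, that would be passed to the parse function?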

using namespace boost::spirit;
using namespace boost::spirit::x3;

rule<struct id_action> action = "action";
rule<struct id_array> array = "array";
rule<struct id_empty_array> empty_array = "empty_array";
rule<struct id_atom> atom = "atom";
rule<struct id_sequence> sequence = "sequence";
rule<struct id_root> root = "root";

auto access_index_array = [] (const auto& ctx) { std::cerr << "access_array: " << x3::_attr(ctx) << "\n" ;};
auto access_empty_array = [] (const auto& ctx) { std::cerr << "access_empty_array\n" ;};
auto access_named_member = [] (const auto& ctx) { std::cerr << "access_named_member: " << x3::_attr(ctx) << "\n" ;};
auto start_action = [] (const auto& ctx) { std::cerr << "start action\n" ;};
auto finish_action = [] (const auto& ctx) { std::cerr << "finish action\n" ;};
auto create_array = [] (const auto& ctx) { std::cerr << "create_array\n" ;};

const auto action_def = +(lit('.')[start_action]
                      >> -((+alnum)[access_named_member])
                      >> *(('[' >> x3::int_ >> ']')[access_index_array] | lit("[]")[access_empty_array]));
const auto sequence_def = (action[finish_action] % '|');
const auto array_def = ('[' >> sequence >> ']')[create_array];
const auto root_def = array | action;

BOOST_SPIRIT_DEFINE(action)
BOOST_SPIRIT_DEFINE(array)
BOOST_SPIRIT_DEFINE(sequence)
BOOST_SPIRIT_DEFINE(root)

bool parse(std::string_view str)
{
  using ascii::space;
  auto first = str.begin();
  auto last = str.end();
  bool r = phrase_parse(
             first, last,
             parser::array_def | parser::sequence_def,
             ascii::space
  );

  if (first != last)
    return false;
  return r;
}

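Regarding the approaches:

1/ Yes, this is viable for small, contained parsers. Typically they are used in a single TU only and exposed via a non-generic interface.

2/ This is the approach for (much) larger grammars that you might want to spread across TUs and/or that are instantiated generically in several TUs.

Note that you do not need BOOST_SPIRIT_DEFINE unless you

  • have recursive rules
  • want to separate declaration from definition. [This gets pretty complicated, and I recommend against doing it with X3.]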

The question

My question is: how to combine both (properly, without globals) ?

You cannot combine anything with namespace-level declarations if one of the requirements is "without globals".

My dream API would be to pass some argument to phrase_parse and then do some x3::_arg(ctx) but I couldn't find anything like this.

I don't know what you expect x3::_arg(ctx) to do in that particular dream :)

Here is for instance my parser: for now the actions are writing to std::cerr. What if I wanted to write to a custom std::ostream& instead, that would be passed to the parse function?

Now that is a concrete question. I'd say: use the context.

You can arrange things so that x3::get<ostream>(ctx) returns the stream:

struct ostream{};

auto access_index_array  = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_array: " << x3::_attr(ctx) << "\n" ;};
auto access_empty_array  = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_empty_array\n" ;};
auto access_named_member = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_named_member: " <<  x3::_attr(ctx) << "\n" ;};
auto start_action        = [] (const auto& ctx) { x3::get<ostream>(ctx) << "start action\n" ;};
auto finish_action       = [] (const auto& ctx) { x3::get<ostream>(ctx) << "finish action\n" ;};
auto create_array        = [] (const auto& ctx) { x3::get<ostream>(ctx) << "create_array\n";};

Now you need to put the tagged parameter into the context when parsing:

bool r = phrase_parse(
    f, l,
    x3::with<parser::ostream>(std::cerr)[parser::array_def | parser::sequence_def],
    x3::space);

Live demo: http://coliru.stacked-crooked.com/a/a26c8eb0af6370b9

Prints

start action
access_named_member: a
finish action
start action
access_named_member: b
start action
start action
access_array: 2
start action
access_named_member: foo
start action
access_empty_array
finish action
start action
access_named_member: c
finish action
create_array
true

Intermixed with the standard X3 debug output:

<sequence>
  <try>.a|.b..[2].foo.[]|.c</try>
  <action>
    <try>.a|.b..[2].foo.[]|.c</try>
    <success>|.b..[2].foo.[]|.c]</success>
  </action>
  <action>
    <try>.b..[2].foo.[]|.c]</try>
    <success>|.c]</success>
  </action>
  <action>
    <try>.c]</try>
    <success>]</success>
  </action>
  <success>]</success>
</sequence>

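But wait #1 - event handlers

It looks like you are parsing something similar to JSON Pointer or jq syntax. If you want to provide a callback interface (SAX-style events), why not bind the callback interface instead of the individual actions: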

struct handlers {
    using N = x3::unused_type;
    virtual void index(int) {}
    virtual void index(N) {}
    virtual void property(std::string) {}
    virtual void start(N) {}
    virtual void finish(N) {}
    virtual void create_array(N) {}
};

#define EVENT(e) ([](auto& ctx) { x3::get<handlers>(ctx).e(x3::_attr(ctx)); })

const auto action_def =
    +(x3::lit('.')[EVENT(start)] >> -((+x3::alnum)[EVENT(property)]) >>
      *(('[' >> x3::int_ >> ']')[EVENT(index)] | x3::lit("[]")[EVENT(index)]));

const auto sequence_def = action[EVENT(finish)] % '|';
const auto array_def    = ('[' >> sequence >> ']')[EVENT(create_array)];
const auto root_def     = array | action;

Now you can implement all handlers neatly in one interface:

struct default_handlers : parser::handlers {
    std::ostream& os;
    default_handlers(std::ostream& os) : os(os) {}

    void index(int i) override            { os << "access_array: " << i << "\n"; }
    void index(N) override                { os << "access_empty_array\n"; }
    void property(std::string n) override { os << "access_named_member: " << n << "\n"; }
    void start(N) override                { os << "start action\n"; }
    void finish(N) override               { os << "finish action\n"; }
    void create_array(N) override         { os << "create_array\n"; }
};

auto f = str.begin(), l = str.end();
bool r = phrase_parse(f, l,
                      x3::with<parser::handlers>(default_handlers{std::cout}) //
                          [parser::array_def | parser::sequence_def],
                      x3::space);

See it Live On Coliru once again:

start action
access_named_member: a
finish action
start action
access_named_member: b
start action
start action
access_array: 2
start action
access_named_member: foo
start action
access_empty_array
finish action
start action
access_named_member: c
finish action
create_array
true

But wait #2 - no actions

The natural way to expose attributes is to build an AST. See also Boost Spirit: "Semantic actions are evil"?

Without further ado:

namespace AST {
    using Id = std::string;
    using Index = int;
    struct Member {
        std::optional<Id> name;
    };
    struct Indexer {
        std::optional<int> index;
    };
    struct Action {
        Member member;
        std::vector<Indexer> indexers;
    };

    using Actions = std::vector<Action>;
    using Sequence = std::vector<Actions>;

    struct ArrayCtor {
        Sequence actions;
    };

    using Root = boost::variant<ArrayCtor, Actions>;
}

Of course, I am making some assumptions here. The rules can be simplified considerably:

namespace parser {
    template <typename> struct Tag {};
    #define AS(T, p) (x3::rule<Tag<AST::T>, AST::T>{#T} = p)

    auto id       = AS(Id, +x3::alnum);
    auto member   = AS(Member, x3::lit('.') >> -id);
    auto indexer  = AS(Indexer,'[' >> -x3::int_ >> ']');

    auto action   = AS(Action, member >> *indexer);
    auto actions  = AS(Actions, +action);

    auto sequence = AS(Sequence, actions % '|');
    auto array    = AS(ArrayCtor, '[' >> -sequence >> ']'); // covers empty array
    auto root     = AS(Root, array | actions);
} // namespace parser

And the parse function returns the AST:

AST::Root parse(std::string_view str) {
    auto f = str.begin(), l = str.end();

    AST::Root parsed;
    phrase_parse(f, l, x3::expect[parser::root >> x3::eoi], x3::space, parsed);

    return parsed;
}

(Note that it now throws x3::expectation_failure if the input is invalid or not completely parsed.)

int main() {
    std::cout << parse("[.a|.b..[2].foo.[]|.c]");
}

Which now prints:

[.a|.b./*none*/./*none*/[2].foo./*none*/[/*none*/]|.c]

See it Live On Coliru

//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <optional>
#include <ostream>
#include <string>
#include <string_view>
#include <utility> // std::exchange
#include <vector>

namespace x3 = boost::spirit::x3;

namespace AST {
    using Id = std::string;
    using Index = int;
    struct Member {
        std::optional<Id> name;
    };
    struct Indexer {
        std::optional<int> index;
    };
    struct Action {
        Member member;
        std::vector<Indexer> indexers;
    };

    using Actions = std::vector<Action>;
    using Sequence = std::vector<Actions>;

    struct ArrayCtor {
        Sequence actions;
    };

    using Root = boost::variant<ArrayCtor, Actions>;
}

BOOST_FUSION_ADAPT_STRUCT(AST::Member, name)
BOOST_FUSION_ADAPT_STRUCT(AST::Indexer, index)
BOOST_FUSION_ADAPT_STRUCT(AST::Action, member, indexers)
BOOST_FUSION_ADAPT_STRUCT(AST::ArrayCtor, actions)

namespace parser {
    template <typename> struct Tag {};
    #define AS(T, p) (x3::rule<Tag<AST::T>, AST::T>{#T} = p)

    auto id       = AS(Id, +x3::alnum);
    auto member   = AS(Member, x3::lit('.') >> -id);
    auto indexer  = AS(Indexer,'[' >> -x3::int_ >> ']');

    auto action   = AS(Action, member >> *indexer);
    auto actions  = AS(Actions, +action);

    auto sequence = AS(Sequence, actions % '|');
    auto array    = AS(ArrayCtor, '[' >> -sequence >> ']'); // covers empty array
    auto root     = AS(Root, array | actions);
} // namespace parser

AST::Root parse(std::string_view str) {
    auto f = str.begin(), l = str.end();

    AST::Root parsed;
    phrase_parse(f, l, x3::expect[parser::root >> x3::eoi], x3::space, parsed);

    return parsed;
}

// for debug output
#include <iostream>
#include <iomanip>
namespace AST {
    static std::ostream& operator<<(std::ostream& os, Member const& m) {
        return os << "." << m.name.value_or("/*none*/");
    }

    static std::ostream& operator<<(std::ostream& os, Indexer const& i) {
        if (i.index)
            return os << "[" << *i.index << "]";
        else
            return os << "[/*none*/]";
    }

    static std::ostream& operator<<(std::ostream& os, Action const& a) {
        os << a.member;
        for (auto& i : a.indexers)
            os << i;
        return os;
    }

    static std::ostream& operator<<(std::ostream& os, Actions const& aa) {
        for (auto& a : aa)
            os << a;
        return os;
    }

    static std::ostream& operator<<(std::ostream& os, Sequence const& s) {
        bool first = true;
        for (auto& a : s)
            os << (std::exchange(first, false) ? "" : "|") << a;
        return os;
    }

    static std::ostream& operator<<(std::ostream& os, ArrayCtor const& ac) {
        return os << "[" << ac.actions << "]";
    }
}

int main() {
    std::cout << parse("[.a|.b..[2].foo.[]|.c]");
}