具有boost精神的抽象语法树的迭代更新

Iterative update of abstract syntax tree with boost spirit

我有一个可用的 boost spirit 解析器,我在想是否可以使用 boost spirit 对抽象语法树进行迭代更新?

我的结构类似于:

  struct ast;

  typedef boost::variant< boost::recursive_wrapper<ast> > node;

  struct ast
  {
    std::vector<int> value;
    std::vector<node> children;
  };

正在使用以下方法解析:

bool r = phrase_parse(begin, end, grammar, space, ast);

能不能用boost精神做抽象语法树的迭代更新?我没有找到关于此的任何文档,但我在想解析器语义操作是否可以 push_back 在已经存在的 AST 上。有人试过吗?

这将允许像这样进行解析:

bool r = phrase_parse(begin, end, grammar, space, ast); //initial parsing

//the second parse will be called at a later state given some event/timer/io/something

bool r = phrase_parse(begin, end, grammar, space, ast); //additional parsing which will update the already existing AST

您如何知道要合并哪些节点?或者您总是在根级别添加 ("graft") 吗?在那种情况下,你为什么不解析另一个并将元素合并到现有的 ast 中?

ast& operator+=(ast&& other) {
    std::move(other.value.begin(),    other.value.end(),    back_inserter(value));
    std::move(other.children.begin(), other.children.end(), back_inserter(children));
    return *this;
}

演示时间

让我们为这个 AST 设计我能想到的最简单的语法:

start = '{' >> -(int_ % ',') >> ';' >> -(start % ',') >> '}';

请注意,我什至没有将 ; 设为可选。那好吧。样品。供读者练习。 ☡ 你懂的.

我们实现了简单的功能ast parse(It f, It l),然后我们可以简单地合并asts:

int main() {
    ast merged;
    for(std::string const& input : {
                "{1 ,2 ,3 ;{4 ;{9 , 8 ;}},{5 ,6 ;}}",
                "{10,20,30;{40;{90, 80;}},{50,60;}}",
            })
    {
        merged += parse(input.begin(), input.end());
        std::cout << "merged + " << input << " --> " << merged << "\n";
    }
}

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>

namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;

struct ast;

//typedef boost::make_recursive_variant<boost::recursive_wrapper<ast> >::type node;
typedef boost::variant<boost::recursive_wrapper<ast> > node;

struct ast {
    std::vector<int> value;
    std::vector<node> children;

    ast& operator+=(ast&& other) {
        std::move(other.value.begin(),    other.value.end(),    back_inserter(value));
        std::move(other.children.begin(), other.children.end(), back_inserter(children));
        return *this;
    }
};

BOOST_FUSION_ADAPT_STRUCT(ast,
    (std::vector<int>,value)
    (std::vector<node>,children)
)

template <typename It, typename Skipper = qi::space_type>
struct grammar : qi::grammar<It, ast(), Skipper>
{
    grammar() : grammar::base_type(start) {
        using namespace qi;
        start = '{' >> -(int_ % ',') >> ';' >> -(start % ',') >> '}';
        BOOST_SPIRIT_DEBUG_NODES((start));
    }
  private:
    qi::rule<It, ast(), Skipper> start;
};

// for output:
static inline std::ostream& operator<<(std::ostream& os, ast const& v) {
    using namespace karma;
    rule<boost::spirit::ostream_iterator, ast()> r;
    r = '{' << -(int_ % ',') << ';' << -((r|eps) % ',')  << '}';
    return os << format(r, v);
}

template <typename It> ast parse(It f, It l) 
{
    ast parsed;

    static grammar<It> g;
    bool ok = qi::phrase_parse(f,l,g,qi::space,parsed);

    if (!ok || (f!=l)) {
        std::cout << "Parse failure\n";
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
        exit(255);
    }

    return parsed;
}

int main() {
    ast merged;
    for(std::string const& input : {
                "{1 ,2 ,3 ;{4 ;{9 , 8 ;}},{5 ,6 ;}}",
                "{10,20,30;{40;{90, 80;}},{50,60;}}",
            })
    {
        merged += parse(input.begin(), input.end());
        std::cout << "merged + " << input << " --> " << merged << "\n";
    }
}

当然,它会打印:

merged + {1 ,2 ,3 ;{4 ;{9 , 8 ;}},{5 ,6 ;}} --> {1,2,3;{4;{9,8;}},{5,6;}}
merged + {10,20,30;{40;{90, 80;}},{50,60;}} --> {1,2,3,10,20,30;{4;{9,8;}},{5,6;},{40;{90,80;}},{50,60;}}

更新

在这个简单的示例中,您可以将集合绑定到解析调用中的属性。如果没有移动元素所需的 operator+= 调用,也会发生同样的事情,因为规则被写入自动 append 到绑定的容器属性。

CAVEAT: A distinct disadvantage of modifying the target value in-place is what happens if parsing fails. In the version the merged value will then be "undefined" (has received partial information from the failed parse).

So if you want to parse inputs "atomically", the first, more explicit approach is a better fit.

所以下面是一个稍微简短的写法:

Live On Coliru

// #define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>

namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;

struct ast;

//typedef boost::make_recursive_variant<boost::recursive_wrapper<ast> >::type node;
typedef boost::variant<boost::recursive_wrapper<ast> > node;

struct ast {
    std::vector<int> value;
    std::vector<node> children;
};

BOOST_FUSION_ADAPT_STRUCT(ast,
    (std::vector<int>,value)
    (std::vector<node>,children)
)

template <typename It, typename Skipper = qi::space_type>
struct grammar : qi::grammar<It, ast(), Skipper>
{
    grammar() : grammar::base_type(start) {
        using namespace qi;
        start = '{' >> -(int_ % ',') >> ';' >> -(start % ',') >> '}';
        BOOST_SPIRIT_DEBUG_NODES((start));
    }
  private:
    qi::rule<It, ast(), Skipper> start;
};

// for output:
static inline std::ostream& operator<<(std::ostream& os, ast const& v) {
    using namespace karma;
    rule<boost::spirit::ostream_iterator, ast()> r;
    r = '{' << -(int_ % ',') << ';' << -((r|eps) % ',')  << '}';
    return os << format(r, v);
}

template <typename It> void parse(It f, It l, ast& into) 
{
    static grammar<It> g;
    bool ok = qi::phrase_parse(f,l,g,qi::space,into);

    if (!ok || (f!=l)) {
        std::cout << "Parse failure\n";
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
        exit(255);
    }
}

int main() {
    ast merged;
    for(std::string const& input : {
            "{1 ,2 ,3 ;{4 ;{9 , 8 ;}},{5 ,6 ;}}",
            "{10,20,30;{40;{90, 80;}},{50,60;}}",
            })
    {
        parse(input.begin(), input.end(), merged);
        std::cout << "merged + " << input << " --> " << merged << "\n";
    }
}

仍然打印