Boost Spirit：解析布尔表达式并简化为规范范式

Question

我想用 or、and 和 not 运算符解析一个常见的布尔值，我想我已经使用下面的 Boost Spirit 完成了。在第 2 阶段（或者可能是解析本身的一部分），我希望将布尔值的 AST 转换为 disjunctive canonical normal form，这实际上是 "flattens" 表达式并删除所有分组运算符。

在我的一次尝试中，我创建了下面的 Boost static_visitor，命名为 Transformer。如果子节点和父节点都不是运算符，我首先尝试通过将子节点分配给它的祖父节点来消除双非运算符。我的问题是指当前节点的父节点。似乎没有办法引用当前节点的父节点，因为一旦访问了一个节点，访问函数就会重载 'variant' 的内部类型，从而丢弃对象的 variant 性质。任何帮助表示赞赏。

struct op_or  {};
struct op_and {};
struct op_not {};

typedef std::string var;
template <typename tag> struct binop;
template <typename tag> struct uniop;

typedef boost::variant
    <
        var,
        boost::recursive_wrapper<uniop<op_not>>,
        boost::recursive_wrapper<binop<op_and>>,
        boost::recursive_wrapper<binop<op_or>>
    >
    expr;

template <typename tag> struct uniop
{
    explicit uniop(expr const& o) : exp_u(o) { }
    expr exp_u;
};

template <typename tag> struct binop
{
    explicit binop(expr const& l, expr const& r) : exp_l(l), exp_r(r) { }
    expr exp_l, exp_r;
};

struct transformer : boost::static_visitor<void>
{
    std::deque<std::reference_wrapper<expr>> stk;

    transformer(expr & e)
    {
        stk.push_back(e);
    }

    void operator()(var const& v) const { }

    void operator()(uniop<op_not> & u)
    {
        if (boost::get<uniop<op_not>>(&stk.back().get()) != nullptr)
        {
            stk.back() = u.exp_u;
        }
        else
        {
            stk.push_back(std::ref(u));  // <<=== Fails with "no matching function for call"
            boost::apply_visitor(*this, u.exp_u);
            stk.pop_back();
        }
    }
    void operator()(binop<op_and> & b)
    {
        stk.push_back(std::ref(u));
        boost::apply_visitor(*this, b.exp_l);
        boost::apply_visitor(*this, b.exp_r);
        stk.pop_back();
    }
    void operator()(binop<op_or> & b)
    {
        stk.push_back(std::ref(u));
        boost::apply_visitor(*this, b.exp_l);
        boost::apply_visitor(*this, b.exp_r);
        stk.pop_back();
    }
};

template <typename It, typename Skipper = boost::spirit::qi::space_type>
struct parser : boost::spirit::qi::grammar<It, expr(), Skipper>
{
    parser() : parser::base_type(expr_)
    {
        using namespace boost::phoenix;
        using namespace boost::spirit::qi;

        using boost::spirit::qi::_1;

        expr_  = or_.alias();

        or_  = and_ [ _val = _1 ] >> *("or" >> and_ [ _val = construct<binop<op_or>>(_val, _1) ]);
        and_ = not_ [ _val = _1 ] >> *("and" >> not_ [ _val = construct<binop<op_and>>(_val, _1) ]);
        not_ = "not" > simple [ _val = construct<uniop<op_not>>(_1) ] | simple [ _val = _1 ];

        simple =  '(' > expr_ > ')' | var_;
        var_ = lexeme[ +alpha ];
    }

private:
    boost::spirit::qi::rule<It, var() , Skipper> var_;
    boost::spirit::qi::rule<It, expr(), Skipper> not_, and_, or_, simple, expr_;
};

Answer 1

看来向 DCNF 的转换是 NP 完全的。所以你可以期待做出让步。

你的高度简化的子任务只是消除了双重否定。看起来您试图保留一堆父表达式引用 (stk) 但是：

您没有明确显示提取或return简化表达式的方法（原始表达式将保持不变）
您尝试将 uniop<> 节点作为对类型不匹配的 expr 节点的引用：
```
stk.push_back(std::ref(u));  // <<=== Fails with "no matching function for call"
```
对我来说，这只是
这一事实的另一个症状
```
transformer(expr & e)        {
    stk.push_back(e);
}
```
无法递归到子表达式中。如果是这样，您可以相信周围的 expr& 已经在堆栈上了。 binop/unop 处理程序也是如此，它们都试图将引用推送到当时甚至不存在于范围内的 u ，并且可能意味着推送当前节点，该节点运行到同类类型不匹配。

第一个：`simplify`

我认为用函数式风格编写这些代码要容易得多：而不是 "manipulating" 对象图，让转换 return 转换后的结果 .

这立即意味着您可以保持所有节点类型不变，除非您的节点类型是嵌套否定。外观如下：

struct simplify {
    typedef expr result_type;

    // in general, just identity transform
    template <typename E> auto operator()(E const& e) const { return e; }

    // only handle these:
    auto operator()(expr const& e) const { return apply_visitor(*this, e); }
    expr operator()(unop<op_not> const& e) const {
        if (auto nested_negation = boost::strict_get<unop<op_not>>(&e.exp_u)) {
            return nested_negation->exp_u;
        }
        return e;
    }
};

一个简单的测试程序将是：

Live On Coliru

std::vector<expr> tests {
    "a",
    NOT{"a"},
    AND{"a", "b"},
    OR{"a","b"},
    AND{NOT{"a"},NOT{"b"}},
    NOT{{NOT{"a"}}},
};

const simplifier simplify{};

for (expr const& expr : tests) {
    std::cout << std::setw(30) << str(expr) << " -> " << simplify(expr) << "\n";
}

正在打印：

                       "a" -> "a"
                  NOT{"a"} -> NOT{"a"}
              AND{"a","b"} -> AND{"a","b"}
               OR{"a","b"} -> OR{"a","b"}
    AND{NOT{"a"},NOT{"b"}} -> AND{NOT{"a"},NOT{"b"}}
             NOT{NOT{"a"}} -> "a"

使用堆栈/变异

使用堆栈的类似方法**看起来*同样容易：

HERE BE DRAGONS

struct stack_simplifier {
    typedef void result_type;
    std::deque<std::reference_wrapper<expr>> stk;

    void operator()(expr& e) {
        stk.push_back(e);
        apply_visitor(*this, e);
        stk.pop_back();
    }

    template <typename Other>
    void operator()(Other&) {}

    void operator()(unop<op_not>& e) {
        if (auto nested_negation = boost::strict_get<unop<op_not>>(&e.exp_u)) {
            stk.back().get() = nested_negation->exp_u;
        }
    }
};

用法将不再是 const（因为函数不纯），expr 参数也是如此（将被改变）：

for (expr expr : tests) {
    std::cout << std::setw(30) << str(expr);

    stack_simplifier{}(expr);
    std::cout << " -> " << expr << "\n";
}

它/确实/似乎有效 (Live On Coliru)，但也有明显的缺点：

堆栈没有实际用途，只有顶部元素会被检查（您可以 replace it with a pointer to the current expression node）
仿函数对象是non-pure/non-const
表达式树在遍历时正在变异。这只是一个定时炸弹，等待您调用 Undefined Behaviour: in
```
void operator()(unop<op_not>& e) {
    if (auto nested_negation = boost::strict_get<unop<op_not>>(&e.exp_u)) {
        stk.back().get() = nested_negation->exp_u;
    }
}
```
在对堆栈顶部的表达式赋值后，对 e 的引用是悬空的。 nested_negation 也是如此。超出该点的取消引用是 UB.

现在在这个简单的场景中（折叠双重否定）它似乎不太难在心理上检查这是否真的没问题。错误

事实证明 operator= 在变体上调用 variant_assign，它看起来像这样：

void variant_assign(const variant& rhs)
{
    // If the contained types are EXACTLY the same...
    if (which_ == rhs.which_)
    {
        // ...then assign rhs's storage to lhs's content:
        detail::variant::assign_storage visitor(rhs.storage_.address());
        this->internal_apply_visitor(visitor);
    }
    else
    {
        // Otherwise, perform general (copy-based) variant assignment:
        assigner visitor(*this, rhs.which());
        rhs.internal_apply_visitor(visitor); 
    }
}

assigner 访问者有一个致命的细节（选择了一个 nothrow-aware 重载）：

template <typename RhsT, typename B1, typename B2>
void assign_impl(
      const RhsT& rhs_content
    , mpl::true_ // has_nothrow_copy
    , B1 // is_nothrow_move_constructible
    , B2 // has_fallback_type
    ) const BOOST_NOEXCEPT
{
    // Destroy lhs's content...
    lhs_.destroy_content(); // nothrow

    // ...copy rhs content into lhs's storage...
    new(lhs_.storage_.address())
        RhsT( rhs_content ); // nothrow

    // ...and indicate new content type:
    lhs_.indicate_which(rhs_which_); // nothrow
}

OOPS 原来是left-handside先被破坏了。然而在

    stk.back().get() = nested_negation->exp_u;

右侧是左侧的子对象 (!!!)。在这里避免 UB 的不直观方法是临时复制¹：

    expr tmp = nested_negation->exp_u;
    stk.back().get() = tmp;

假设您正在应用类似德摩根定律的转换。如果子表达式中（也）有嵌套否定怎么办？

It seems to me that the mutating approach is simply unnecessarily error-prone.

递归、不可变转换a.k.a。喜悦

到目前为止，这些方法还有另一个问题。嵌套的子表达式在这里不被转换。例如

  NOT{NOT{AND{"a",NOT{NOT{"b"}}}}} -> AND{"a",NOT{NOT{"b"}}}

而不是所需的 AND{"a","b"}。这在纯功能方法中很容易解决：

struct simplifier {
    typedef expr result_type;

    template <typename T> auto operator()(T const& v) const { return call(v); }

  private:
    auto call(var const& e) const { return e; }
    auto call(expr const& e) const {
        auto s = apply_visitor(*this, e);
        return s;
    }
    expr call(unop<op_not> const& e) const {
        if (auto nested_negation = boost::strict_get<unop<op_not>>(&e.exp_u)) {
            return call(nested_negation->exp_u);
        }

        return unop<op_not> {call(e.exp_u)};
    }
    template <typename Op> auto call(binop<Op> const& e) const {
        return binop<Op> {call(e.exp_l), call(e.exp_r)};
    }
};

一切仍然是不可变的，但我们处理所有类型的表达式以递归它们的子表达式。现在它打印：

Live On Coliru

                               "a" -> "a"
                          NOT{"a"} -> NOT{"a"}
                      AND{"a","b"} -> AND{"a","b"}
                       OR{"a","b"} -> OR{"a","b"}
            AND{NOT{"a"},NOT{"b"}} -> AND{NOT{"a"},NOT{"b"}}
                     NOT{NOT{"a"}} -> "a"
  NOT{NOT{AND{"a",NOT{NOT{"b"}}}}} -> AND{"a","b"}

For completeness, a similar transoformation to the "stack_simplifier": http://coliru.stacked-crooked.com/a/cc5627aa37f0c969

¹ 实际上可能会使用移动语义，但为清楚起见我忽略了

Boost Spirit：解析布尔表达式并简化为规范范式

Boost Spirit: parse boolean expression and reduce to canonical normal form

c++

grammar

boost-spirit

第一个：`simplify`

使用堆栈/变异

递归、不可变转换a.k.a。喜悦

Boost Spirit：解析布尔表达式并简化为规范范式

Boost Spirit: parse boolean expression and reduce to canonical normal form

c++

grammar

boost-spirit

第一个：simplify

使用堆栈/变异

递归、不可变转换a.k.a。喜悦

第一个：`simplify`