解析字符串以创建几何

Parsing a string to create a geometry

开发字符串解析器以创建几何图形的算法是什么?几何图形分两步生成:第一步,我们创建图元;第二,我们将基元组合成对象。

语法在下面的字符串中显示。

string str="[GEOMETRY]    
    PRIMITIVE1=SPHERE(RADIUS=5.5);  
    PRIMITIVE2=BOX(A=-5.2, B=7.3);  
    //...  
    OBJECT1=PRIMITIVE2*(-PRIMITIVE1);  
    //..."

class PRIMITIVE{
    int number;
public:
    Primitive& operator+ (Primitive& primitive) {}; //overloading arithmetic operations
    Primitive& operator* (Primitive& primitive) {};
    Primitive& operator- (Primitive& primitive) {};
    virtual bool check_in_point_inside_primitive = 0;
};

class SPHERE:public PRIMITIVE{
    double m_radius;
public:
    SPHERE(double radius): m_radius(radius) {};  //In which part of the parser to create objects?
    bool check_in_point_inside_sphere(Point& point){};
};

class BOX:public PRIMITIVE{
    double m_A;
    double m_B;
public:
    BOX(double A, double B): m_A(A), m_B(B) {};
    bool check_in_point_inside_box(Point& point){};
};

class OBJECT{
    int number;
    PRIMITIVE& primitive;
public:
    OBJECT(){};
    bool check_in_point_inside_object(Primitive& PRIMITIVE1, Primitive& PRIMITIVE2, Point& point){
        //>How to construct a function from an expression 'PRIMITIVE2*(-PRIMITIVE1)' when parsing?
    }
};
  1. 如何解析字符串PRIMITIVE1=SPHERE(RADIUS=5.5)并传递一个参数给SPHERE()的构造函数?如何识别名称为 PRIMITIVE 1 的对象以在 OBJECT 中调用它?是否可以创建一个 pair<PRIMITIVE1,SPHERE(5.5)> 并将所有图元存储在地图中?

  2. 如何解析 OBJECT1 的字符串并从 OBJECT1 中的表达式 PRIMITIVE2*(-PRIMITIVE1) 构造函数?在确定每个点相对于对象的位置时,将多次需要此表达式。

  3. 如何使用boost::spirit完成这个任务?使用 boost::spirit::lex 标记字符串,然后使用 boost::spirit::qi?

    制定规则

作为手指练习,尽管我发现所选的虚拟类型层次结构存在严重问题,但让我们尝试制作一个 value-oriented 基元容器,可以通过它们的 id 索引(ById ):

Live On Coliru

#include <boost/intrusive/set.hpp>
#include <boost/poly_collection/base_collection.hpp>
#include <iostream>
namespace bi = boost::intrusive;

struct Point {
};

using IndexHook = bi::set_member_hook<bi::link_mode<bi::auto_unlink>>;

class Primitive {
    int _id;

  public:
    struct ById {
        bool operator()(auto const&... oper) const { return std::less<>{}(access(oper)...); }

      private:
        static int access(int id) { return id; }
        static int access(Primitive const& p) { return p._id; }
    };

    IndexHook _index;

    Primitive(int id) : _id(id) {}
    virtual ~Primitive() = default;
    int id() const { return _id; }

    Primitive& operator+= (Primitive const& primitive) { return *this; } //overloading arithmetic operations
    Primitive& operator*= (Primitive const& primitive) { return *this; }
    Primitive& operator-= (Primitive const& primitive) { return *this; }
    virtual bool check_in_point_inside(Point const&) const = 0;
};

using Index =
    bi::set<Primitive, bi::constant_time_size<false>,
            bi::compare<Primitive::ById>,
            bi::member_hook<Primitive, IndexHook, &Primitive::_index>>;

class Sphere : public Primitive {
    double _radius;

  public:
    Sphere(int id, double radius)
        : Primitive(id)
        , _radius(radius) {} // In which part of the parser to create objects?
    bool check_in_point_inside(Point const& point) const override { return false; }
};

class Box : public Primitive {
    double _A;
    double _B;

  public:
    Box(int id, double A, double B) : Primitive(id), _A(A), _B(B) {}
    bool check_in_point_inside(Point const& point) const override { return false; }
};

class Object{
    int _id;
    Primitive& _primitive;

  public:
    Object(int id, Primitive& p) : _id(id), _primitive(p) {}

    bool check_in_point_inside_object(Primitive const& p1, Primitive const& p2,
                                      Point const& point) const
    {
        //>How to construct a function from an expression
        //'PRIMITIVE2*(-PRIMITIVE1)' when parsing?
        return false;
    }
};

using Primitives = boost::poly_collection::base_collection<Primitive>;

int main() {
    Primitives test;
    test.insert(Sphere{2, 4.0});
    test.insert(Sphere{4, 4.0});
    test.insert(Box{2, 5, 6});
    test.insert(Sphere{1, 4.0});
    test.insert(Box{3, 5, 6});

    Index idx;
    for (auto& p : test)
        if (not idx.insert(p).second)
            std::cout << "Duplicate id " << p.id() << " not indexed\n";

    for (auto& p : idx)
        std::cout << typeid(p).name() << " " << p.id() << "\n";

    std::cout << "---\n";

    for (auto& p : test)
        std::cout << typeid(p).name() << " " << p.id() << "\n";
}

版画

Duplicate id 2 not indexed
6Sphere 1
3Box 2
3Box 3
6Sphere 4
---
3Box 2
3Box 3
6Sphere 2
6Sphere 4
6Sphere 1

到目前为止一切顺利。这是在处理 Spirit 语法中的虚拟类型时防止各种痛苦的重要组成部分¹

PS: I've since dropped the idea of intrusive_set. It doesn't work because the base_container moves items around on reallocation, and that unlinks the items from their intrusive set.

Instead, see below for an approach that doesn't try to resolve ids during the parse.


解析原语

我们从 PRIMITIVE1 中获取 ID。我们可以在自然地解析基元本身之前将它存储在某个地方,然后在提交时设置它的 id。

让我们从为解析器定义状态 object 开始:

struct State {
    Ast::Id         next_id;
    Ast::Primitives primitives;
    Ast::Objects    objects;

    template <typename... T> void commit(boost::variant<T...>& val) {
        boost::apply_visitor([this](auto& obj) { commit(obj); }, val);
    }

    template <typename T> void commit(T& primitiveOrExpr) {
        auto id = std::exchange(next_id, 0);
        if constexpr (std::is_base_of_v<Ast::Primitive, T>) {
            primitiveOrExpr.id = id;
            primitives.insert(std::move(primitiveOrExpr));
        } else {
            objects.push_back(Ast::Object{id, std::move(primitiveOrExpr)});
        }
    }
};

如您所见,我们只有一个地方可以存储基元,objects。然后是 next_id 的临时存储,同时我们仍在解析下一个实体。

commit 函数有助于对解析器规则的产品进行排序。碰巧的是,它们可以是变体,这就是为什么我们在变体上为 commit 分配 apply_visitor

Again, as the footnote¹ explains, Spirit's natural attribute synthesis favors static polymorphism.

我们现在需要的语义动作是:

static inline auto& state(auto& ctx) { return get<State>(ctx); }
auto draft = [](auto& ctx) { state(ctx).next_id = _attr(ctx); };
auto commit = [](auto& ctx) { state(ctx).commit(_attr(ctx)); };

现在让我们跳到原语:

auto sphere = as<Ast::Sphere>(eps >> "sphere" >>'(' >> param("radius") >> ')');
auto box    = as<Ast::Box>(eps >> "box" >> '(' >> param('a') >> ',' >> param('b') >> ')');
auto primitive =
    ("primitive" >> uint_[draft] >> '=' >> (sphere | box)[commit]) > ';';

这仍然有点作弊,因为我使用了 param 帮助程序来减少打字:

auto number = as<Ast::Number>(double_, "number");
auto param(auto name, auto p) { return eps >> omit[name] >> '=' >> p; }
auto param(auto name) { return param(name, number); }

如您所见,我已经假设大多数参数都具有数值性质。

Object 到底是什么?

看了一会儿,我得出结论,实际上 Object 被定义为一个与表达式相关联的 ID 号(OBJECT1、OBJECT2...)。表达式可以引用基元并具有一些一元和二元运算符。

让我们为此绘制一个 AST:

using Number = double;
struct RefPrimitive { Id id; };
struct Binary;
struct Unary;

using Expr = boost::variant<         //
    Number,                          //
    RefPrimitive,                    //
    boost::recursive_wrapper<Unary>, //
    boost::recursive_wrapper<Binary> //
    >;

struct Unary { char op; Expr oper; };
struct Binary { Expr lhs; char op; Expr rhs; };
struct Object { Id   id; Expr expr; };

现在解析为该表达式 AST

真的是1:1每个Ast节点类型的规则。例如:

auto ref_prim = as<Ast::RefPrimitive>(lexeme["primitive" >> uint_]);

现在许多表达式规则可以递归,所以我们需要通过 BOOST_SPIRIT_DEFINE:

定义的声明规则
// object expression grammar
rule<struct simple_tag, Ast::Expr>  simple{"simple"};
rule<struct unary_tag,  Ast::Unary> unary{"unary"};
rule<struct expr_tag,   Ast::Expr>  expr{"expr"};
rule<struct term_tag,   Ast::Expr>  term{"term"};
rule<struct factor_tag, Ast::Expr>  factor{"factor"};

如您所知,其中一些不是 1:1 Ast 节点,主要是因为递归和运算符优先级的差异(term vs factor vs. simple)。用规则定义更容易看:

auto unary_def  = char_("-+") >> simple;
auto simple_def = ref_prim | unary | '(' >> expr >> ")";
auto factor_def = simple;
auto term_def   = factor[assign] >> *(char_("*/") >> term)[make_binary];
auto expr_def   = term[assign] >> *(char_("-+") >> expr)[make_binary];

因为 none 的规则实际上公开了一个 Binary,自动属性传播在那里并不方便²。相反,我们使用 assignmake_binary 语义操作:

auto assign = [](auto& ctx) { _val(ctx) = _attr(ctx); };
auto make_binary = [](auto& ctx) {
    using boost::fusion::at_c;
    auto& attr = _attr(ctx);
    auto  op   = at_c<0>(attr);
    auto& rhs  = at_c<1>(attr);
    _val(ctx)  = Ast::Binary { _val(ctx), op, rhs };
};

最后,让我们将定义与声明的规则联系起来(使用它们的标记类型):

BOOST_SPIRIT_DEFINE(simple, unary, expr, term, factor)

我们只需要类似于 primitive 的一行:

auto object =
    ("object" >> uint_[draft] >> '=' >> (expr)[commit]) > ';';

我们可以通过将每一行定义为原语来完成|object:

auto line = primitive | object;
auto file = no_case[skip(ws_comment)[*eol >> "[geometry]" >> (-line % eol) >> eoi]];

在顶层,我们期望 [GEOMETRY] header,指定我们希望不区分大小写并且... ws_comment 将被跳过³:

auto ws_comment = +(blank | lexeme["//" >> *(char_ - eol) >> eol]);

这样我们也可以忽略 // comments

现场演示时间

Live On Compiler Explorer

//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/poly_collection/base_collection.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <list>
#include <map>
namespace x3 = boost::spirit::x3;

namespace Ast {
    using Id     = uint32_t;
    struct Point { }; // ?? where does this belong?
    struct Primitive {
        Id id;
        virtual ~Primitive() = default;
    };
    struct Sphere : Primitive { double radius; };
    struct Box : Primitive { double a, b; };

    using Number = double;
    struct RefPrimitive { Id id; };
    struct Binary;
    struct Unary;

    using Expr = boost::variant<         //
        Number,                          //
        RefPrimitive,                    //
        boost::recursive_wrapper<Unary>, //
        boost::recursive_wrapper<Binary> //
        >;

    struct Unary { char op; Expr oper; };
    struct Binary { Expr lhs; char op; Expr rhs; };
    struct Object { Id   id; Expr expr; };
    using Primitives = boost::poly_collection::base_collection<Primitive>;
    using Objects    = std::list<Object>;
    using Index      = std::map<Id, std::reference_wrapper<Primitive const>>;

    std::ostream& operator<<(std::ostream& os, Primitive const& p) {
        return os << boost::core::demangle(typeid(p).name()) << " "
                  << "(id: " << p.id << ")";
    }
    std::ostream& operator<<(std::ostream& os, Object const& o) {
        return os << "object(id:" << o.id << ", expr:" << o.expr << ")";
    }
    std::ostream& operator<<(std::ostream& os, RefPrimitive ref) {
        return os << "reference(prim:" << ref.id << ")";
    }
    std::ostream& operator<<(std::ostream& os, Binary const& b) {
        return os << '(' << b.lhs << b.op << b.rhs << ')';
    }
    std::ostream& operator<<(std::ostream& os, Unary const& u) {
        return os << '(' << u.op << u.oper << ')';
    }
} // namespace Ast

BOOST_FUSION_ADAPT_STRUCT(Ast::Primitive, id)
BOOST_FUSION_ADAPT_STRUCT(Ast::Sphere, radius)
BOOST_FUSION_ADAPT_STRUCT(Ast::Box, a, b)
BOOST_FUSION_ADAPT_STRUCT(Ast::Object, id)
BOOST_FUSION_ADAPT_STRUCT(Ast::RefPrimitive, id)
BOOST_FUSION_ADAPT_STRUCT(Ast::Unary, op, oper)

namespace Parser {
    using namespace x3;

    struct State {
        Ast::Id         next_id;
        Ast::Primitives primitives;
        Ast::Objects    objects;

        template <typename... T> void commit(boost::variant<T...>& val) {
            boost::apply_visitor([this](auto& obj) { commit(obj); }, val);
        }

        template <typename T> void commit(T& val) {
            auto id = std::exchange(next_id, 0);
            if constexpr (std::is_base_of_v<Ast::Primitive, T>) {
                val.id = id;
                primitives.insert(std::move(val));
            } else {
                objects.push_back(Ast::Object{id, std::move(val)});
            }
        }
    };

    static inline auto& state(auto& ctx) { return get<State>(ctx); }
    auto draft = [](auto& ctx) { state(ctx).next_id = _attr(ctx); };
    auto commit = [](auto& ctx) { state(ctx).commit(_attr(ctx)); };

    template <typename T>
    auto as = [](auto p, char const* name = "as") {
        return rule<struct _, T>{name} = p;
    };

    auto ws_comment = +(blank | lexeme["//" >> *(char_ - eol) >> (eol | eoi)]);

    auto number = as<Ast::Number>(double_, "number");
    auto param(auto name, auto p) { return eps >> omit[name] >> '=' >> p; }
    auto param(auto name) { return param(name, number); }

    auto sphere = as<Ast::Sphere>(eps >> "sphere" >>'(' >> param("radius") >> ')');
    auto box    = as<Ast::Box>(eps >> "box" >> '(' >> param('a') >> ',' >> param('b') >> ')');
    auto primitive =
        ("primitive" >> uint_[draft] >> '=' >> (sphere | box)[commit]) > ';';
    
    auto ref_prim = as<Ast::RefPrimitive>(lexeme["primitive" >> uint_], "ref_prim");

    // object expression grammar
    rule<struct simple_tag, Ast::Expr>  simple{"simple"};
    rule<struct unary_tag,  Ast::Unary> unary{"unary"};
    rule<struct expr_tag,   Ast::Expr>  expr{"expr"};
    rule<struct term_tag,   Ast::Expr>  term{"term"};
    rule<struct factor_tag, Ast::Expr>  factor{"factor"};

    auto assign = [](auto& ctx) { _val(ctx) = _attr(ctx); };
    auto make_binary = [](auto& ctx) {
        using boost::fusion::at_c;
        auto& attr = _attr(ctx);
        auto  op   = at_c<0>(attr);
        auto& rhs  = at_c<1>(attr);
        _val(ctx)  = Ast::Binary { _val(ctx), op, rhs };
    };

    auto unary_def  = char_("-+") >> simple;
    auto simple_def = ref_prim | unary | '(' >> expr >> ")";
    auto factor_def = simple;
    auto term_def   = factor[assign] >> *(char_("*/") >> term)[make_binary];
    auto expr_def   = term[assign] >> *(char_("-+") >> expr)[make_binary];

    BOOST_SPIRIT_DEFINE(simple, unary, expr, term, factor)

    auto object =
        ("object" >> uint_[draft] >> '=' >> (expr)[commit]) > ';';
    auto line = primitive | object;
    auto file = no_case[skip(ws_comment)[*eol >> "[geometry]" >> (-line % eol) >> eoi]];
} // namespace Parser

int main() {
    for (std::string const input :
         {
             R"(
[geometry]    
    primitive1=sphere(radius=5.5);  
    primitive2=box(a=-5.2, b=7.3);  
    //...  
    object1=primitive2*(-primitive1);  
    //...)",
             R"(
[GEOMETRY]    
    PRIMITIVE1=SPHERE(RADIUS=5.5);  
    PRIMITIVE2=BOX(A=-5.2, B=7.3);  
    //...  
    OBJECT1=PRIMITIVE2*(-PRIMITIVE1);  
    //...)",
         }) //
    {
        Parser::State state;

        bool ok = parse(begin(input), end(input),
                        x3::with<Parser::State>(state)[Parser::file]);
        std::cout << "Parse success? " << std::boolalpha << ok << "\n";

        Ast::Index index;

        for (auto& p : state.primitives)
            if (auto[it,ok] = index.emplace(p.id, p); not ok) {
                std::cout << "Duplicate id " << p
                          << " (conflicts with existing " << it->second.get()
                          << ")\n";
            }

        std::cout << "Primitives by ID:\n";
        for (auto& [id, prim] : index)
            std::cout << " - " << prim << "\n";

        std::cout << "Objects in definition order:\n";
        for (auto& obj: state.objects)
            std::cout << " - " << obj << "\n";
    }
}

版画

Parse success? true
Primitives by ID:
 - Ast::Sphere (id: 1)
 - Ast::Box (id: 2)
Objects in definition order:
 - object(id:1, expr:(reference(prim:2)*(-reference(prim:1))))
Parse success? true
Primitives by ID:
 - Ast::Sphere (id: 1)
 - Ast::Box (id: 2)
Objects in definition order:
 - object(id:1, expr:(reference(prim:2)*(-reference(prim:1))))

¹ How can I use polymorphic attributes with boost::spirit::qi parsers?

² 并坚持这一点导致经典 in-efficiency 规则导致大量回溯

³ 在词素之外