Boost Spirit x3 -- Parameterizing Parsers with other Parsers
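I don't have much code to show for this because I haven't managed to get anything working, but the high-level problem is that I'm trying to create a series of parsers for a family of related languages. By that I mean the languages will share many of the same constructs, but they won't overlap completely. As a simple example, say I have an AST that is parameterized by some (completely contrived, in this example) 'leaf' type:
template <typename t>
struct fooT {
    std::string name;
    t leaf;
};
One language might instantiate t as int, while another might instantiate it as double. What I'd like to do is create a templated class, or something similar, that I can instantiate with different t and the corresponding parser rules, so that I can generate a family of composed parsers.
In my real example, I have a bunch of nested structures that are the same across the languages, with only a couple of small variations at the very edges of the AST. If I can't compose the parsers in a good way, I'll end up duplicating a bunch of parse rules, AST nodes, and so on. I have actually gotten it to work by not putting it in a class and very carefully arranging my header files and includes so that I can have 'dangling' parser rules with special names that can be assembled. A big downside is that I can't include parsers for multiple different languages in the same program, precisely because of the name conflicts that arise.
Does anyone have an idea how I could approach this?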
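The nice thing about X3 is that you can generate parsers just as easily as you define them in the first place.
For example:
template <typename T> struct AstNode {
    std::string name;
    T leaf;
};
Now let's define a generic parser generator: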
namespace Generic {
    // Leaf parsers, selected by AST leaf type; the primary template never matches.
    template <typename T> auto leaf = x3::eps(false);
    template <> auto leaf<int>
        = "0x" >> x3::uint_parser<uintmax_t, 16>{};
    template <> auto leaf<std::string>
        = x3::lexeme['"' >> *~x3::char_('"') >> '"'];

    // Skipper styles: each one also skips plain whitespace.
    auto no_comment = x3::space;
    auto hash_comments = x3::space |
        x3::lexeme['#' >> *(x3::char_ - x3::eol)] >> (x3::eol | x3::eoi);
    auto c_style_comments = x3::space |
        "/*" >> x3::lexeme[*(x3::char_ - "*/")] >> "*/";
    auto cxx_style_comments = c_style_comments |
        x3::lexeme["//" >> *(x3::char_ - x3::eol)] >> (x3::eol | x3::eoi);

    auto name = leaf<std::string>;

    // Builds: heading "name" : leaf, parsed under the given skipper.
    template <typename T> auto parseNode(auto heading, auto skipper) {
        return x3::skip(skipper)[
            x3::as_parser(heading) >> name >> ":" >> leaf<T>
        ];
    }
}
This allows us to compose various grammars with different leaf types and skipper styles:
namespace Language1 {
    static auto const grammar =
        Generic::parseNode<int>("value", Generic::no_comment);
}
namespace Language2 {
    static auto const grammar =
        Generic::parseNode<std::string>("line", Generic::cxx_style_comments);
}
Let's demo:
#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/adapted.hpp>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <string>
#include <string_view>
namespace x3 = boost::spirit::x3;

template <typename T> struct AstNode {
    std::string name;
    T leaf;
};
// Adapt the template so X3 can fill name and leaf from a sequence.
BOOST_FUSION_ADAPT_TPL_STRUCT((T), (AstNode)(T), name, leaf)

namespace Generic {
    // Leaf parsers, selected by AST leaf type; the primary template never matches.
    template <typename T> auto leaf = x3::eps(false);
    template <> auto leaf<int>
        = "0x" >> x3::uint_parser<std::uintmax_t, 16>{};
    template <> auto leaf<std::string>
        = x3::lexeme['"' >> *~x3::char_('"') >> '"'];

    // Skipper styles: each one also skips plain whitespace.
    auto no_comment = x3::space;
    auto hash_comments = x3::space |
        x3::lexeme['#' >> *(x3::char_ - x3::eol)] >> (x3::eol | x3::eoi);
    auto c_style_comments = x3::space |
        "/*" >> x3::lexeme[*(x3::char_ - "*/")] >> "*/";
    auto cxx_style_comments = c_style_comments |
        x3::lexeme["//" >> *(x3::char_ - x3::eol)] >> (x3::eol | x3::eoi);

    auto name = leaf<std::string>;

    // Builds: heading "name" : leaf, parsed under the given skipper.
    template <typename T> auto parseNode(auto heading, auto skipper) {
        return x3::skip(skipper)[
            x3::as_parser(heading) >> name >> ":" >> leaf<T>
        ];
    }
}

namespace Language1 {
    static auto const grammar =
        Generic::parseNode<int>("value", Generic::no_comment);
}
namespace Language2 {
    static auto const grammar =
        Generic::parseNode<std::string>("line", Generic::cxx_style_comments);
}

// Parses text with the given grammar into ast and prints the result.
void test(auto const& grammar, std::string_view text, auto ast) {
    auto f = text.begin(), l = text.end();
    std::cout << "\nParsing: " << std::quoted(text, '\'') << "\n";
    if (parse(f, l, grammar, ast)) {
        std::cout << " -> {name:" << ast.name << ",value:" << ast.leaf << "}\n";
    } else {
        std::cout << " -- Failed " << std::quoted(text, '\'') << "\n";
    }
}

int main() {
    test(Language1::grammar, R"(value "one": 0x01)", AstNode<int>{});
    test(
        Language2::grammar,
        R"(line "Hamlet": "There is nothing either good or bad, but thinking makes it so.")",
        AstNode<std::string>{});
    test(
        Language2::grammar,
        R"(line // rejected: "Hamlet": "To be ..."
"King Lear": /*hopefully less trite:*/"As flies to wanton boys are we to the gods")",
        AstNode<std::string>{});
}
Prints
Parsing: 'value "one": 0x01'
-> {name:one,value:1}
Parsing: 'line "Hamlet": "There is nothing either good or bad, but thinking makes it so."'
-> {name:Hamlet,value:There is nothing either good or bad, but thinking makes it so.}
Parsing: 'line // rejected: "Hamlet": "To be ..."
"King Lear": /*hopefully less trite:*/"As flies to wanton boys are we to the gods"'
-> {name:King Lear,value:As flies to wanton boys are we to the gods}
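Advanced
For advanced scenarios (where you need to separate rule declarations from their definitions across translation units, and/or you need dynamic switching), you can use the x3::any_parser<> holder.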