Spirit X3,这种错误处理方式有用吗?
Spirit X3, Is this error handling approach useful?
在阅读 error handling 上的 Spirit X3 教程和一些实验之后。我得出了一个结论。
我认为 X3 中的错误处理主题还有一些改进空间。从我的角度来看,一个重要的目标是提供有意义的错误消息。首先也是最重要的是,添加将 _pass(ctx)
成员设置为 false 的语义操作不会这样做,因为 X3 会尝试匹配其他内容。仅抛出 x3::expectation_failure
会过早退出解析函数,即不尝试匹配任何其他内容。所以剩下的就是解析器指令 expect[a]
和解析器 operator>
以及从语义操作中手动抛出 x3::expectation_failure
。我相信关于这个错误处理的词汇太有限了。请考虑以下 X3 PEG 语法行:
const auto a = a1 >> a2 >> a3;
const auto b = b1 >> b2 >> b3;
const auto c = c1 >> c2 >> c3;
const auto main_rule__def =
(
a |
b |
c );
现在对于表达式 a
我不能使用 expect[]
或 operator>
,因为其他替代方法可能有效。我可能是错的,但我认为 X3 要求我拼出可以匹配的替代错误表达式,如果它们匹配,他们可以抛出 x3::expectation_failure
,这很麻烦。
问题是,是否有一种好的方法可以使用当前的 X3 工具检查我的 PEG 构造中的错误条件以及 a、b 和 c 的有序替代项?
如果答案是否定的,我想提出我的想法,为此提供一个合理的解决方案。我相信为此我需要一个新的解析器指令。这个指令应该做什么?它应该在解析 失败 时调用附加的语义操作。该属性显然未使用,但我需要在第一次出现解析不匹配时将 _where
成员设置在迭代器位置上。所以如果a2
失败,_where
应该在a1
结束后置1。让我们调用解析指令 neg_sa
。这意味着否定语义动作。
pseudocode
// semantic actions
auto a_sa = [&](auto& ctx)
{
// add _where to vector v
};
auto b_sa = [&](auto& ctx)
{
// add _where to vector v
};
auto c_sa = [&](auto& ctx)
{
// add _where to vector v
// now we know we have a *real* error.
// find the peak iterator value in the vector v
// the position tells whether it belongs to a, b or c.
// now we can formulate an error message like: “cannot make sense of b upto this position.”
// lastly throw x3::expectation_failure
};
// PEG
const auto a = a1 >> a2 >> a3;
const auto b = b1 >> b2 >> b3;
const auto c = c1 >> c2 >> c3;
const auto main_rule__def =
(
neg_sa[a][a_sa] |
neg_sa[b][b_sa] |
neg_sa[c][c_sa] );
我希望我清楚地表达了这个想法。如果我需要进一步解释,请在评论部分告诉我。
Now for expression a I cannot use expect[] or operator>, as other alternatives might be valid. I could be wrong but I think X3 requires me to spell out alternate wrong expressions that can match and if they match they can throw x3::expectation_failure which is cumbersome.
很简单:
const auto main_rule__def = x3::expect [
a |
b |
c ];
或者,甚至:
const auto main_rule__def = x3::eps > (
a |
b |
c );
If the answer is no, I would like to present my idea to provide a reasonable solution for this. I believe I would need a new parser directive for that. What should this directive do? It should call the attached semantic action when the parse fails instead.
现有的 x3::on_error 功能已经知道如何执行此操作。请注意:它有点复杂,但同样也非常灵活。
基本上,它需要您在 ID 类型上实现一个静态接口(x3::rule<ID, Attr>
,可能 main_rule_class
在您选择的约定中)。存储库中有编译器示例,展示了如何使用它。
Side note: there's both on_success
and on_error
using this paradigm
on_error
成员将在 ID 类型的 default-constructed 副本上调用,参数为 ID().on_error(first, last, expectation_failure_object, context)
。
const auto main_rule__def =
(
neg_sa[a][a_sa] |
neg_sa[b][b_sa] |
neg_sa[c][c_sa] );
老实说,我认为您是在为自己的困惑铺平道路。有 3 个独立的错误操作有什么好处?您如何确定发生了哪个错误?
真的只有两种可能:
- 要么你确实知道需要一个特定的分支并且它失败了(这是一个期望失败,你可以根据定义编码作为[=19中的一个期望点=]、
b
或 c
).
或者您不知道隐含了哪个分支(例如,分支可以从类似的作品开始,但在其中失败了)。在那种情况下,没有人能知道应该调用哪个错误处理程序,所以有多个错误处理程序是无关紧要的。
实际上正确的做法是在更高级别上使 main_rule
失败,这意味着 "none of the possible branches succeeded"。
这是expect[ a | b | c ]
的处理方式。
好吧,冒着把太多东西混为一谈的风险,这里是:
namespace square::peg {
using namespace x3;
const auto quoted_string = lexeme['"' > *(print - '"') > '"'];
const auto bare_string = lexeme[alpha > *alnum] > ';';
const auto two_ints = int_ > int_;
const auto main = quoted_string | bare_string | two_ints;
const auto entry_point = skip(space)[ expect[main] > eoi ];
} // namespace square::peg
应该可以。关键是唯一该期待的东西
points 是使相应分支失败的原因
无疑是正确的分支。 (否则,字面上不会有
很难期望).
有两个次要的 get_info
更漂亮消息的专业化¹,这可能会导致
即使在手动捕获异常时也能显示正确的错误消息:
int main() {
using It = std::string::const_iterator;
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expecations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
try {
if (parse(iter, end, square::peg::entry_point)) {
std::cout << "Parsed successfully\n";
} else {
std::cout << "Parsing failed\n";
}
} catch (x3::expectation_failure<It> const& ef) {
auto pos = std::distance(input.begin(), ef.where());
std::cout << "Expect " << ef.which() << " at "
<< "\n\t" << input
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
}
}
}
版画
====== " -89 0038 "
Parsed successfully
====== " \"-89 0038\" "
Parsed successfully
====== " something123123 ;"
Parsed successfully
====== ""
Expect quoted string, bare string or integer number pair at
^
====== " -89 oops "
Expect integral number at
-89 oops
-------^
====== " \"-89 0038 "
Expect '"' at
"-89 0038
--------------^
====== " bareword "
Expect ';' at
bareword
------------^
====== " -89 3.14 "
Expect eoi at
-89 3.14
--------^
这已经超出了大多数人对解析器的期望。
但是:自动化,而且更灵活
我们可能不满足于只报告一个期望和救助。实际上,您可以报告并继续解析,因为只是存在常规不匹配:这就是 on_error
的用武之地。
让我们创建一个标签库:
struct with_error_handling {
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const&) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::cout << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
return error_handler_result::fail;
}
};
现在,我们所要做的就是从 with_error_handling
和 BAM 中导出我们的规则 ID!我们不必编写任何异常处理程序,规则将简单地 "fail" 进行适当的诊断.更重要的是,一些输入可以导致多个(希望有用的)诊断:
auto const eh = [](auto p) {
struct _ : with_error_handling {};
return rule<_> {} = p;
};
const auto quoted_string = eh(lexeme['"' > *(print - '"') > '"']);
const auto bare_string = eh(lexeme[alpha > *alnum] > ';');
const auto two_ints = eh(int_ > int_);
const auto main = quoted_string | bare_string | two_ints;
using main_type = std::remove_cv_t<decltype(main)>;
const auto entry_point = skip(space)[ eh(expect[main] > eoi) ];
现在,main
变成:
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expecations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
if (parse(iter, end, square::peg::entry_point)) {
std::cout << "Parsed successfully\n";
} else {
std::cout << "Parsing failed\n";
}
}
程序打印:
====== " -89 0038 "
Parsed successfully
====== " \"-89 0038\" "
Parsed successfully
====== " something123123 ;"
Parsed successfully
====== ""
Expecting quoted string, bare string or integer number pair at
^
Parsing failed
====== " -89 oops "
Expecting integral number at
-89 oops
-------^
Expecting quoted string, bare string or integer number pair at
-89 oops
^
Parsing failed
====== " \"-89 0038 "
Expecting '"' at
"-89 0038
--------------^
Expecting quoted string, bare string or integer number pair at
"-89 0038
^
Parsing failed
====== " bareword "
Expecting ';' at
bareword
------------^
Expecting quoted string, bare string or integer number pair at
bareword
^
Parsing failed
====== " -89 3.14 "
Expecting eoi at
-89 3.14
--------^
Parsing failed
属性传播,on_success
当解析器实际上不解析任何东西时,它们就不是很有用,所以让我们添加一些建设性的值处理,同时展示 on_success
:
定义一些 AST 类型来接收属性:
struct quoted : std::string {};
struct bare : std::string {};
using two_i = std::pair<int, int>;
using Value = boost::variant<quoted, bare, two_i>;
确保我们可以打印 Value
s:
static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
struct {
std::ostream& _os;
void operator()(quoted const& v) const { _os << "quoted(" << std::quoted(v) << ")"; }
void operator()(bare const& v) const { _os << "bare(" << v << ")"; }
void operator()(two_i const& v) const { _os << "two_i(" << v.first << ", " << v.second << ")"; }
} vis{os};
boost::apply_visitor(vis, v);
return os;
}
现在,使用旧的 as<>
技巧来强制属性类型,这次使用 error-handling:
作为锦上添花,让我们在with_error_handling
中演示on_success
:
template<typename It, typename Ctx>
void on_success(It f, It l, two_i const& v, Ctx const&) const {
std::cout << "Parsed " << std::quoted(std::string(f,l)) << " as integer pair " << v.first << ", " << v.second << "\n";
}
现在主要未修改主程序(也只打印结果值):
It iter = input.begin(), end = input.end();
Value v;
if (parse(iter, end, square::peg::entry_point, v)) {
std::cout << "Result value: " << v << "\n";
} else {
std::cout << "Parsing failed\n";
}
版画
====== " -89 0038 "
Parsed "-89 0038" as integer pair -89, 38
Result value: two_i(-89, 38)
====== " \"-89 0038\" "
Result value: quoted("-89 0038")
====== " something123123 ;"
Result value: bare(something123123)
====== ""
Expecting quoted string, bare string or integer number pair at
^
Parsing failed
====== " -89 oops "
Expecting integral number at
-89 oops
-------^
Expecting quoted string, bare string or integer number pair at
-89 oops
^
Parsing failed
====== " \"-89 0038 "
Expecting '"' at
"-89 0038
--------------^
Expecting quoted string, bare string or integer number pair at
"-89 0038
^
Parsing failed
====== " bareword "
Expecting ';' at
bareword
------------^
Expecting quoted string, bare string or integer number pair at
bareword
^
Parsing failed
====== " -89 3.14 "
Parsed "-89 3" as integer pair -89, 3
Expecting eoi at
-89 3.14
--------^
Parsing failed
实在是太过分了
我不了解你,但我讨厌这样做 side-effects,更不用说从解析器打印到控制台了。让我们改用 x3::with
。
我们想通过 Ctx&
参数附加到诊断而不是写入
到 on_error
处理程序中的 std::cout
:
struct with_error_handling {
struct diags;
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const& ctx) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::ostringstream oss;
oss << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^";
x3::get<diags>(ctx).push_back(oss.str());
return error_handler_result::fail;
}
};
并且在调用站点上,我们可以传递上下文:
std::vector<std::string> diags;
if (parse(iter, end, x3::with<D>(diags) [square::peg::entry_point], v)) {
std::cout << "Result value: " << v;
} else {
std::cout << "Parsing failed";
}
std::cout << " with " << diags.size() << " diagnostics messages: \n";
完整程序还打印诊断信息:
完整列表
//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <iomanip>
namespace x3 = boost::spirit::x3;
struct quoted : std::string {};
struct bare : std::string {};
using two_i = std::pair<int, int>;
using Value = boost::variant<quoted, bare, two_i>;
static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
struct {
std::ostream& _os;
void operator()(quoted const& v) const { _os << "quoted(" << std::quoted(v) << ")"; }
void operator()(bare const& v) const { _os << "bare(" << v << ")"; }
void operator()(two_i const& v) const { _os << "two_i(" << v.first << ", " << v.second << ")"; }
} vis{os};
boost::apply_visitor(vis, v);
return os;
}
namespace square::peg {
using namespace x3;
struct with_error_handling {
struct diags;
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const& ctx) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::ostringstream oss;
oss << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^";
x3::get<diags>(ctx).push_back(oss.str());
return error_handler_result::fail;
}
};
template <typename T = x3::unused_type> auto const as = [](auto p) {
struct _ : with_error_handling {};
return rule<_, T> {} = p;
};
const auto quoted_string = as<quoted>(lexeme['"' > *(print - '"') > '"']);
const auto bare_string = as<bare>(lexeme[alpha > *alnum] > ';');
const auto two_ints = as<two_i>(int_ > int_);
const auto main = quoted_string | bare_string | two_ints;
using main_type = std::remove_cv_t<decltype(main)>;
const auto entry_point = skip(space)[ as<Value>(expect[main] > eoi) ];
} // namespace square::peg
namespace boost::spirit::x3 {
template <> struct get_info<int_type> {
typedef std::string result_type;
std::string operator()(int_type const&) const { return "integral number"; }
};
template <> struct get_info<square::peg::main_type> {
typedef std::string result_type;
std::string operator()(square::peg::main_type const&) const { return "quoted string, bare string or integer number pair"; }
};
}
int main() {
using It = std::string::const_iterator;
using D = square::peg::with_error_handling::diags;
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expecations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
Value v;
std::vector<std::string> diags;
if (parse(iter, end, x3::with<D>(diags) [square::peg::entry_point], v)) {
std::cout << "Result value: " << v;
} else {
std::cout << "Parsing failed";
}
std::cout << " with " << diags.size() << " diagnostics messages: \n";
for(auto& msg: diags) {
std::cout << " - " << msg << "\n";
}
}
}
¹ 您可以改用带有名称的规则,避免这个更复杂的技巧
² 在旧版本的库中,您可能需要努力获取 with<>
数据的引用语义:Live On Coliru
在阅读 error handling 上的 Spirit X3 教程和一些实验之后。我得出了一个结论。
我认为 X3 中的错误处理主题还有一些改进空间。从我的角度来看,一个重要的目标是提供有意义的错误消息。首先也是最重要的是,添加将 _pass(ctx)
成员设置为 false 的语义操作不会这样做,因为 X3 会尝试匹配其他内容。仅抛出 x3::expectation_failure
会过早退出解析函数,即不尝试匹配任何其他内容。所以剩下的就是解析器指令 expect[a]
和解析器 operator>
以及从语义操作中手动抛出 x3::expectation_failure
。我相信关于这个错误处理的词汇太有限了。请考虑以下 X3 PEG 语法行:
const auto a = a1 >> a2 >> a3;
const auto b = b1 >> b2 >> b3;
const auto c = c1 >> c2 >> c3;
const auto main_rule__def =
(
a |
b |
c );
现在对于表达式 a
我不能使用 expect[]
或 operator>
,因为其他替代方法可能有效。我可能是错的,但我认为 X3 要求我拼出可以匹配的替代错误表达式,如果它们匹配,他们可以抛出 x3::expectation_failure
,这很麻烦。
问题是,是否有一种好的方法可以使用当前的 X3 工具检查我的 PEG 构造中的错误条件以及 a、b 和 c 的有序替代项?
如果答案是否定的,我想提出我的想法,为此提供一个合理的解决方案。我相信为此我需要一个新的解析器指令。这个指令应该做什么?它应该在解析 失败 时调用附加的语义操作。该属性显然未使用,但我需要在第一次出现解析不匹配时将 _where
成员设置在迭代器位置上。所以如果a2
失败,_where
应该在a1
结束后置1。让我们调用解析指令 neg_sa
。这意味着否定语义动作。
pseudocode
// semantic actions
auto a_sa = [&](auto& ctx)
{
// add _where to vector v
};
auto b_sa = [&](auto& ctx)
{
// add _where to vector v
};
auto c_sa = [&](auto& ctx)
{
// add _where to vector v
// now we know we have a *real* error.
// find the peak iterator value in the vector v
// the position tells whether it belongs to a, b or c.
// now we can formulate an error message like: “cannot make sense of b upto this position.”
// lastly throw x3::expectation_failure
};
// PEG
const auto a = a1 >> a2 >> a3;
const auto b = b1 >> b2 >> b3;
const auto c = c1 >> c2 >> c3;
const auto main_rule__def =
(
neg_sa[a][a_sa] |
neg_sa[b][b_sa] |
neg_sa[c][c_sa] );
我希望我清楚地表达了这个想法。如果我需要进一步解释,请在评论部分告诉我。
Now for expression a I cannot use expect[] or operator>, as other alternatives might be valid. I could be wrong but I think X3 requires me to spell out alternate wrong expressions that can match and if they match they can throw x3::expectation_failure which is cumbersome.
很简单:
const auto main_rule__def = x3::expect [
a |
b |
c ];
或者,甚至:
const auto main_rule__def = x3::eps > (
a |
b |
c );
If the answer is no, I would like to present my idea to provide a reasonable solution for this. I believe I would need a new parser directive for that. What should this directive do? It should call the attached semantic action when the parse fails instead.
现有的 x3::on_error 功能已经知道如何执行此操作。请注意:它有点复杂,但同样也非常灵活。
基本上,它需要您在 ID 类型上实现一个静态接口(x3::rule<ID, Attr>
,可能 main_rule_class
在您选择的约定中)。存储库中有编译器示例,展示了如何使用它。
Side note: there's both
on_success
andon_error
using this paradigm
on_error
成员将在 ID 类型的 default-constructed 副本上调用,参数为 ID().on_error(first, last, expectation_failure_object, context)
。
const auto main_rule__def = ( neg_sa[a][a_sa] | neg_sa[b][b_sa] | neg_sa[c][c_sa] );
老实说,我认为您是在为自己的困惑铺平道路。有 3 个独立的错误操作有什么好处?您如何确定发生了哪个错误?
真的只有两种可能:
- 要么你确实知道需要一个特定的分支并且它失败了(这是一个期望失败,你可以根据定义编码作为[=19中的一个期望点=]、
b
或c
). 或者您不知道隐含了哪个分支(例如,分支可以从类似的作品开始,但在其中失败了)。在那种情况下,没有人能知道应该调用哪个错误处理程序,所以有多个错误处理程序是无关紧要的。
实际上正确的做法是在更高级别上使
main_rule
失败,这意味着 "none of the possible branches succeeded"。这是
expect[ a | b | c ]
的处理方式。
好吧,冒着把太多东西混为一谈的风险,这里是:
namespace square::peg {
using namespace x3;
const auto quoted_string = lexeme['"' > *(print - '"') > '"'];
const auto bare_string = lexeme[alpha > *alnum] > ';';
const auto two_ints = int_ > int_;
const auto main = quoted_string | bare_string | two_ints;
const auto entry_point = skip(space)[ expect[main] > eoi ];
} // namespace square::peg
应该可以。关键是唯一该期待的东西 points 是使相应分支失败的原因 无疑是正确的分支。 (否则,字面上不会有 很难期望).
有两个次要的 get_info
更漂亮消息的专业化¹,这可能会导致
即使在手动捕获异常时也能显示正确的错误消息:
int main() {
using It = std::string::const_iterator;
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expecations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
try {
if (parse(iter, end, square::peg::entry_point)) {
std::cout << "Parsed successfully\n";
} else {
std::cout << "Parsing failed\n";
}
} catch (x3::expectation_failure<It> const& ef) {
auto pos = std::distance(input.begin(), ef.where());
std::cout << "Expect " << ef.which() << " at "
<< "\n\t" << input
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
}
}
}
版画
====== " -89 0038 "
Parsed successfully
====== " \"-89 0038\" "
Parsed successfully
====== " something123123 ;"
Parsed successfully
====== ""
Expect quoted string, bare string or integer number pair at
^
====== " -89 oops "
Expect integral number at
-89 oops
-------^
====== " \"-89 0038 "
Expect '"' at
"-89 0038
--------------^
====== " bareword "
Expect ';' at
bareword
------------^
====== " -89 3.14 "
Expect eoi at
-89 3.14
--------^
这已经超出了大多数人对解析器的期望。
但是:自动化,而且更灵活
我们可能不满足于只报告一个期望和救助。实际上,您可以报告并继续解析,因为只是存在常规不匹配:这就是 on_error
的用武之地。
让我们创建一个标签库:
struct with_error_handling {
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const&) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::cout << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
return error_handler_result::fail;
}
};
现在,我们所要做的就是从 with_error_handling
和 BAM 中导出我们的规则 ID!我们不必编写任何异常处理程序,规则将简单地 "fail" 进行适当的诊断.更重要的是,一些输入可以导致多个(希望有用的)诊断:
auto const eh = [](auto p) {
struct _ : with_error_handling {};
return rule<_> {} = p;
};
const auto quoted_string = eh(lexeme['"' > *(print - '"') > '"']);
const auto bare_string = eh(lexeme[alpha > *alnum] > ';');
const auto two_ints = eh(int_ > int_);
const auto main = quoted_string | bare_string | two_ints;
using main_type = std::remove_cv_t<decltype(main)>;
const auto entry_point = skip(space)[ eh(expect[main] > eoi) ];
现在,main
变成:
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expecations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
if (parse(iter, end, square::peg::entry_point)) {
std::cout << "Parsed successfully\n";
} else {
std::cout << "Parsing failed\n";
}
}
程序打印:
====== " -89 0038 "
Parsed successfully
====== " \"-89 0038\" "
Parsed successfully
====== " something123123 ;"
Parsed successfully
====== ""
Expecting quoted string, bare string or integer number pair at
^
Parsing failed
====== " -89 oops "
Expecting integral number at
-89 oops
-------^
Expecting quoted string, bare string or integer number pair at
-89 oops
^
Parsing failed
====== " \"-89 0038 "
Expecting '"' at
"-89 0038
--------------^
Expecting quoted string, bare string or integer number pair at
"-89 0038
^
Parsing failed
====== " bareword "
Expecting ';' at
bareword
------------^
Expecting quoted string, bare string or integer number pair at
bareword
^
Parsing failed
====== " -89 3.14 "
Expecting eoi at
-89 3.14
--------^
Parsing failed
属性传播,on_success
当解析器实际上不解析任何东西时,它们就不是很有用,所以让我们添加一些建设性的值处理,同时展示 on_success
:
定义一些 AST 类型来接收属性:
struct quoted : std::string {};
struct bare : std::string {};
using two_i = std::pair<int, int>;
using Value = boost::variant<quoted, bare, two_i>;
确保我们可以打印 Value
s:
static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
struct {
std::ostream& _os;
void operator()(quoted const& v) const { _os << "quoted(" << std::quoted(v) << ")"; }
void operator()(bare const& v) const { _os << "bare(" << v << ")"; }
void operator()(two_i const& v) const { _os << "two_i(" << v.first << ", " << v.second << ")"; }
} vis{os};
boost::apply_visitor(vis, v);
return os;
}
现在,使用旧的 as<>
技巧来强制属性类型,这次使用 error-handling:
作为锦上添花,让我们在with_error_handling
中演示on_success
:
template<typename It, typename Ctx>
void on_success(It f, It l, two_i const& v, Ctx const&) const {
std::cout << "Parsed " << std::quoted(std::string(f,l)) << " as integer pair " << v.first << ", " << v.second << "\n";
}
现在主要未修改主程序(也只打印结果值):
It iter = input.begin(), end = input.end();
Value v;
if (parse(iter, end, square::peg::entry_point, v)) {
std::cout << "Result value: " << v << "\n";
} else {
std::cout << "Parsing failed\n";
}
版画
====== " -89 0038 "
Parsed "-89 0038" as integer pair -89, 38
Result value: two_i(-89, 38)
====== " \"-89 0038\" "
Result value: quoted("-89 0038")
====== " something123123 ;"
Result value: bare(something123123)
====== ""
Expecting quoted string, bare string or integer number pair at
^
Parsing failed
====== " -89 oops "
Expecting integral number at
-89 oops
-------^
Expecting quoted string, bare string or integer number pair at
-89 oops
^
Parsing failed
====== " \"-89 0038 "
Expecting '"' at
"-89 0038
--------------^
Expecting quoted string, bare string or integer number pair at
"-89 0038
^
Parsing failed
====== " bareword "
Expecting ';' at
bareword
------------^
Expecting quoted string, bare string or integer number pair at
bareword
^
Parsing failed
====== " -89 3.14 "
Parsed "-89 3" as integer pair -89, 3
Expecting eoi at
-89 3.14
--------^
Parsing failed
实在是太过分了
我不了解你,但我讨厌这样做 side-effects,更不用说从解析器打印到控制台了。让我们改用 x3::with
。
我们想通过 Ctx&
参数附加到诊断而不是写入
到 on_error
处理程序中的 std::cout
:
struct with_error_handling {
struct diags;
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const& ctx) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::ostringstream oss;
oss << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^";
x3::get<diags>(ctx).push_back(oss.str());
return error_handler_result::fail;
}
};
并且在调用站点上,我们可以传递上下文:
std::vector<std::string> diags;
if (parse(iter, end, x3::with<D>(diags) [square::peg::entry_point], v)) {
std::cout << "Result value: " << v;
} else {
std::cout << "Parsing failed";
}
std::cout << " with " << diags.size() << " diagnostics messages: \n";
完整程序还打印诊断信息:
完整列表
//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <iomanip>
namespace x3 = boost::spirit::x3;
struct quoted : std::string {};
struct bare : std::string {};
using two_i = std::pair<int, int>;
using Value = boost::variant<quoted, bare, two_i>;
static inline std::ostream& operator<<(std::ostream& os, Value const& v) {
struct {
std::ostream& _os;
void operator()(quoted const& v) const { _os << "quoted(" << std::quoted(v) << ")"; }
void operator()(bare const& v) const { _os << "bare(" << v << ")"; }
void operator()(two_i const& v) const { _os << "two_i(" << v.first << ", " << v.second << ")"; }
} vis{os};
boost::apply_visitor(vis, v);
return os;
}
namespace square::peg {
using namespace x3;
struct with_error_handling {
struct diags;
template<typename It, typename Ctx>
x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const& ctx) const {
std::string s(f,l);
auto pos = std::distance(f, ef.where());
std::ostringstream oss;
oss << "Expecting " << ef.which() << " at "
<< "\n\t" << s
<< "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^";
x3::get<diags>(ctx).push_back(oss.str());
return error_handler_result::fail;
}
};
template <typename T = x3::unused_type> auto const as = [](auto p) {
struct _ : with_error_handling {};
return rule<_, T> {} = p;
};
const auto quoted_string = as<quoted>(lexeme['"' > *(print - '"') > '"']);
const auto bare_string = as<bare>(lexeme[alpha > *alnum] > ';');
const auto two_ints = as<two_i>(int_ > int_);
const auto main = quoted_string | bare_string | two_ints;
using main_type = std::remove_cv_t<decltype(main)>;
const auto entry_point = skip(space)[ as<Value>(expect[main] > eoi) ];
} // namespace square::peg
namespace boost::spirit::x3 {
template <> struct get_info<int_type> {
typedef std::string result_type;
std::string operator()(int_type const&) const { return "integral number"; }
};
template <> struct get_info<square::peg::main_type> {
typedef std::string result_type;
std::string operator()(square::peg::main_type const&) const { return "quoted string, bare string or integer number pair"; }
};
}
int main() {
using It = std::string::const_iterator;
using D = square::peg::with_error_handling::diags;
for (std::string const input : {
" -89 0038 ",
" \"-89 0038\" ",
" something123123 ;",
// undecidable
"",
// violate expecations, no successful parse
" -89 oops ", // not an integer
" \"-89 0038 ", // missing "
" bareword ", // missing ;
// trailing debris, successful "main"
" -89 3.14 ", // followed by .14
})
{
std::cout << "====== " << std::quoted(input) << "\n";
It iter = input.begin(), end = input.end();
Value v;
std::vector<std::string> diags;
if (parse(iter, end, x3::with<D>(diags) [square::peg::entry_point], v)) {
std::cout << "Result value: " << v;
} else {
std::cout << "Parsing failed";
}
std::cout << " with " << diags.size() << " diagnostics messages: \n";
for(auto& msg: diags) {
std::cout << " - " << msg << "\n";
}
}
}
¹ 您可以改用带有名称的规则,避免这个更复杂的技巧
² 在旧版本的库中,您可能需要努力获取 with<>
数据的引用语义:Live On Coliru