Incorrect rule matched in boost spirit x3 grammar
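I am new to Spirit.
I am trying to build an AST from a simple "Excel" formula using Spirit X3. The grammar supports the typical operators (+, -, *, /), functions (myfunc(myparam1, myparam2)) and cell references (e.g. A1, AA234).
So an example expression to be parsed might be A1 + sin(A2+3).
The problem is that the xlreference rule below never gets matched, because the xlfunction rule takes precedence and that rule doesn't backtrack. I have experimented with expect, but I'm lacking good examples to get it working.
I guess this leads to another question: what is the best way to debug X3? I've seen the BOOST_SPIRIT_X3_DEBUG define, but I couldn't find any examples that demonstrate its usage. I also wrote an on_error method on expression_class, but that doesn't provide a good trace. I tried using the position tag and the with statement, but that doesn't provide enough information either.
Any help would be appreciated!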
x3::rule<class xlreference, ast::xlreference> const xlreference{"xlreference"};
auto const xlreference_def = +alpha > x3::uint_ > !x3::expect[char('(')];
BOOST_SPIRIT_DEFINE(xlreference);
struct identifier_class;
typedef x3::rule<identifier_class, std::string> identifier_type;
identifier_type const identifier = "identifier";
auto const identifier_def = x3::char_("a-zA-Z") > *(x3::char_("a-zA-Z") | x3::char_('_')) > !x3::expect[char('0-9')];
BOOST_SPIRIT_DEFINE(identifier);
auto const expression_def = // constadditive_expr_def
term [print_action()]
>> *( (char_('+') > term)
| (char_('-') > term)
)
;
x3::rule<xlfunction_class, ast::xlfunction> const xlfunction("xlfunction");
auto const xlfunction_def = identifier > '(' > *(expression > *(',' > expression)) > ')';
BOOST_SPIRIT_DEFINE(xlfunction);
auto const term_def = //constmultiplicative_expr_def
factor
>> *( (char_('*') > factor)
| (char_('/') > factor)
)
;
auto const factor_def = // constunary_expr_def
xlfunction [print_action()]
| '(' > expression > ')'
| (char_('-') > factor)
| (char_('+') > factor)
| x3::double_ [print_action()] | xlreference [print_action()]
;
Error handler:
struct expression_class //: x3::annotate_on_success
{
// Our error handler
template <typename Iterator, typename Exception, typename Context>
x3::error_handler_result
on_error(Iterator& q, Iterator const& last, Exception const& x, Context const& context)
{
std::cout
<< "Error! Expecting: "
<< x.which()
<< " here: \""
<< std::string(x.where(), last)
<< "\""
<< std::endl
;
return x3::error_handler_result::fail;
}
};
Position tagging:
with<position_cache_tag>(std::ref(positions))
[
client::calculator_grammar::expression
];
client::ast::program ast;
bool r = phrase_parse(iter, (iterator_type const ) str.end(), parser, x3::space, ast);
if (!r) {
std::cout << "failed:" << str << "\n";
}
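Okay, one step at a time. Making up the AST types along the way (because you didn't care to show them).
1. This is invalid code: char('0-9'). Is that a wide-character literal? Enable your compiler warnings! You probably meant x3::char_("0-9") (two important differences!).
2. !x3::expect[] is a contradiction. You can never pass that condition, because ! asserts that the lookahead does NOT match, while expect[] REQUIRES a match. So, best case, ! fails because the expect[]-ation was matched. Worst case, expect[] throws an exception because you asked it to.
3. operator > is already an expectation point. For the same reason as before, > !p is a contradiction. Make it >> !p
4. Replace char_("0-9") with x3::digit
5. Replace char_("a-zA-Z") with x3::alpha
6. Some (many) rules need to be lexemes. That's because you invoke the grammar in a skipper context (phrase_parse with x3::space). Your identifiers will silently eat whitespace, because you didn't make them lexemes. See "Boost spirit skipper issues".
7. Negative lookahead assertions don't expose attributes, so ! char_('(') could (should?) be ! lit('(')
8. The presence of semantic actions (by default) suppresses attribute propagation, so print_action() will cause attribute propagation to stop
9. Expectation points (operator>) by definition cannot backtrack. That's what makes them expectation points.
10. Use the list operator instead of composing it with a Kleene star: p >> *(',' >> p) -> p % ','
11. That extra Kleene star is bogus. Did you mean to make the argument list optional? That's -(expression % ',')
12. The chained operator rules make it a bit cumbersome to get the AST right
13. Simplify
factor >> *((x3::char_('*') > factor) //
| (x3::char_('/') > factor));
to just factor >> *(x3::char_("*/") >> factor);
14. factor_def logically matches an expression?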
A first review pass yields:
auto const xlreference_def =
x3::lexeme[+x3::alpha >> x3::uint_] >> !x3::char_('(');
auto const identifier_def =
x3::raw[x3::lexeme[x3::alpha >> *(x3::alpha | '_') >> !x3::digit]];
auto const xlfunction_def = identifier >> '(' >> -(expression % ',') >> ')';
auto const term_def = factor >> *(x3::char_("*/") >> factor);
auto const factor_def = xlfunction //
| '(' >> expression >> ')' //
| x3::double_ //
| xlreference;
auto const expression_def = term >> *(x3::char_("-+") >> term);
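More observations:
1. (iterator_type const)str.end()?? Never use C-style casts. In fact, just use str.cend(), or str.end() if str is suitably const anyway.
2. phrase_parse - consider not making the skipper the caller's decision, since it is logically part of your grammar
3. Many kinds of Excel expressions are not parsed: A:A, $A4, B$4, all cell ranges; I think R1C1 is often supported as well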
Time For Magic Fairy Dust
So, drawing on plenty of experience, I'm going to Crystal Ball™ some AST®:
namespace client::ast {
using identifier = std::string;
//using identifier = boost::iterator_range<std::string::const_iterator>;
struct string_literal : std::string {
using std::string::string;
using std::string::operator=;
friend std::ostream& operator<<(std::ostream& os, string_literal const& sl) {
return os << std::quoted(sl) ;
}
};
struct xlreference {
std::string colname;
size_t rownum;
};
struct xlfunction; // fwd
struct binary_op; // fwd
using expression = boost::variant< //
double, //
string_literal, //
identifier, //
xlreference, //
boost::recursive_wrapper<xlfunction>, //
boost::recursive_wrapper<binary_op> //
>;
struct xlfunction{
identifier name;
std::vector<expression> args;
friend std::ostream& operator<<(std::ostream& os, xlfunction const& xlf)
{
os << xlf.name << "(";
char const* sep = "";
for (auto& arg : xlf.args)
os << std::exchange(sep, ", ") << arg;
return os;
}
};
struct binary_op {
struct chained_t {
char op;
expression e;
};
expression lhs;
std::vector<chained_t> chained;
friend std::ostream& operator<<(std::ostream& os, binary_op const& bop)
{
os << "(" << bop.lhs;
for (auto& rhs : bop.chained)
os << rhs.op << rhs.e;
return os << ")";
}
};
using program = expression;
using boost::fusion::operator<<;
} // namespace client::ast
We duly adapt them:
BOOST_FUSION_ADAPT_STRUCT(client::ast::xlfunction, name, args)
BOOST_FUSION_ADAPT_STRUCT(client::ast::xlreference, colname, rownum)
BOOST_FUSION_ADAPT_STRUCT(client::ast::binary_op, lhs, chained)
BOOST_FUSION_ADAPT_STRUCT(client::ast::binary_op::chained_t, op, e)
Next, let's declare suitable rules:
x3::rule<struct identifier_class, ast::identifier> const identifier{"identifier"};
x3::rule<struct xlreference, ast::xlreference> const xlreference{"xlreference"};
x3::rule<struct xlfunction_class, ast::xlfunction> const xlfunction{"xlfunction"};
x3::rule<struct factor_class, ast::expression> const factor{"factor"};
x3::rule<struct expression_class, ast::binary_op> const expression{"expression"};
x3::rule<struct term_class, ast::binary_op> const term{"term"};
Which need definitions:
auto const xlreference_def =
x3::lexeme[+x3::alpha >> x3::uint_] /*>> !x3::char_('(')*/;
Note how the look-ahead assertion (!) doesn't actually change the parse result, as any cell reference isn't a valid identifier anyway, so the () would remain unparsed.
auto const identifier_def =
x3::raw[x3::lexeme[x3::alpha >> *(x3::alpha | '_') /*>> !x3::digit*/]];
Same here. Remaining input will be checked with x3::eoi later.
I threw in a string literal, because any Excel clone would have one:
auto const string_literal =
x3::rule<struct _, ast::string_literal>{"string_literal"} //
= x3::lexeme['"' > *('\\' >> x3::char_ | ~x3::char_('"')) > '"'];
Note that this demonstrates that non-recursive, locally-defined rules
don't need separate definitions.
Then come the expression rules:
auto const factor_def = //
xlfunction //
| '(' >> expression >> ')' //
| x3::double_ //
| string_literal //
| xlreference //
| identifier //
;
I'd usually call this "simple expression" instead of factor.
auto const term_def = factor >> *(x3::char_("*/") >> factor);
auto const expression_def = term >> *(x3::char_("-+") >> term);
auto const xlfunction_def = identifier >> '(' >> -(expression % ',') >> ')';
These map directly onto the AST.
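Since each precedence level becomes a flat node (an lhs followed by operator/operand pairs), whoever consumes the AST has to apply left-associativity themselves. A minimal sketch of that fold follows, using a numeric-only stand-in for binary_op. The names flat_chain and fold_left are hypothetical and only mirror the shape of the AST above.

```cpp
#include <stdexcept>
#include <vector>

namespace sketch {
    // Numeric-only stand-ins mirroring binary_op and binary_op::chained_t.
    struct chained_t {
        char   op;
        double operand;
    };
    struct flat_chain {
        double                 lhs;
        std::vector<chained_t> chained;
    };

    // Folds the chain left to right, so 10-3-2 means (10-3)-2.
    double fold_left(flat_chain const& chain) {
        double acc = chain.lhs;
        for (auto const& step : chain.chained) {
            switch (step.op) {
                case '+': acc += step.operand; break;
                case '-': acc -= step.operand; break;
                case '*': acc *= step.operand; break;
                case '/': acc /= step.operand; break;
                default: throw std::invalid_argument("unknown operator");
            }
        }
        return acc;
    }
} // namespace sketch
```

For the real AST the same loop would sit inside a visitor over ast::expression, with each operand evaluated recursively first.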
BOOST_SPIRIT_DEFINE(xlreference)
BOOST_SPIRIT_DEFINE(identifier)
BOOST_SPIRIT_DEFINE(xlfunction)
BOOST_SPIRIT_DEFINE(term)
BOOST_SPIRIT_DEFINE(factor)
BOOST_SPIRIT_DEFINE(expression)
'Nuff said. Now comes a bit of cargo cult: remnants of code that was not shown and is not used, which I mostly just accept and ignore here:
int main() {
std::vector<int> positions; // TODO
auto parser = x3::with<struct position_cache_tag /*TODO*/> //
(std::ref(positions)) //
[ //
x3::skip(x3::space)[ //
client::calculator_grammar::expression >> x3::eoi //
] //
];
DO NOTE though that x3::eoi makes it so the rule doesn't match if the end of input (modulo skipper) isn't reached.
Now, let's add some test cases!
struct {
std::string category;
std::vector<std::string> cases;
} test_table[] = {
{
"xlreference",
{"A1", "A1111", "AbCdZ9876543", "i9", "i0"},
},
{
"identifier",
{"i", "id", "id_entifier"},
},
{
"number",
{"123", "inf", "-inf", "NaN", ".99e34", "1e-8", "1.e-8", "+9"},
},
{
"binaries",
{ //
"3+4", "3*4", //
"3+4+5", "3*4*5", "3+4*5", "3*4+5", "3*4+5", //
"3+(4+5)", "3*(4*5)", "3+(4*5)", "3*(4+5)", "3*(4+5)", //
"(3+4)+5", "(3*4)*5", "(3+4)*5", "(3*4)+5", "(3*4)+5"},
},
{
"xlfunction",
{
"pi()",
"sin(4)",
R"--(IIF(A1, "Red", "Green"))--",
},
},
{
"invalid",
{
"A9()", // an xlreference may not be followed by ()
"", // you didn't specify
},
},
{
"other",
{
"A-9", // 1-letter identifier and binary operation
"1 + +9", // unary plus accepted in number rule
},
},
{
"question",
{
"myfunc(myparam1, myparam2)",
"A1",
"AA234",
"A1 + sin(A2+3)",
},
},
};
And run them:
for (auto& [cat, cases] : test_table) {
for (std::string const& str : cases) {
auto iter = begin(str), last(end(str));
std::cout << std::setw(12) << cat << ": ";
client::ast::program ast;
if (parse(iter, last, parser, ast)) {
std::cout << "parsed: " << ast;
} else {
std::cout << "failed: " << std::quoted(str);
}
if (iter == last) {
std::cout << "\n";
} else {
std::cout << " unparsed: "
<< std::quoted(std::string_view(iter, last)) << "\n";
}
}
}
Live demo
//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/position_tagged.hpp>
#include <boost/spirit/home/x3/support/utility/annotate_on_success.hpp>
#include <iostream>
#include <iomanip>
#include <map>
#include <string>
#include <string_view>
#include <utility>
#include <vector>
namespace x3 = boost::spirit::x3;
namespace client::ast {
using identifier = std::string;
//using identifier = boost::iterator_range<std::string::const_iterator>;
struct string_literal : std::string {
using std::string::string;
using std::string::operator=;
friend std::ostream& operator<<(std::ostream& os, string_literal const& sl) {
return os << std::quoted(sl) ;
}
};
struct xlreference {
std::string colname;
size_t rownum;
};
struct xlfunction; // fwd
struct binary_op; // fwd
using expression = boost::variant< //
double, //
string_literal, //
identifier, //
xlreference, //
boost::recursive_wrapper<xlfunction>, //
boost::recursive_wrapper<binary_op> //
>;
struct xlfunction{
identifier name;
std::vector<expression> args;
friend std::ostream& operator<<(std::ostream& os, xlfunction const& xlf)
{
os << xlf.name << "(";
char const* sep = "";
for (auto& arg : xlf.args)
os << std::exchange(sep, ", ") << arg;
return os;
}
};
struct binary_op {
struct chained_t {
char op;
expression e;
};
expression lhs;
std::vector<chained_t> chained;
friend std::ostream& operator<<(std::ostream& os, binary_op const& bop)
{
os << "(" << bop.lhs;
for (auto& rhs : bop.chained)
os << rhs.op << rhs.e;
return os << ")";
}
};
using program = expression;
using boost::fusion::operator<<;
} // namespace client::ast
BOOST_FUSION_ADAPT_STRUCT(client::ast::xlfunction, name, args)
BOOST_FUSION_ADAPT_STRUCT(client::ast::xlreference, colname, rownum)
BOOST_FUSION_ADAPT_STRUCT(client::ast::binary_op, lhs, chained)
BOOST_FUSION_ADAPT_STRUCT(client::ast::binary_op::chained_t, op, e)
namespace client::calculator_grammar {
struct expression_class //: x3::annotate_on_success
{
// Our error handler
template <typename Iterator, typename Exception, typename Context>
x3::error_handler_result on_error(Iterator& q, Iterator const& last,
Exception const& x,
Context const& context)
{
std::cout //
<< "Error! Expecting: " << x.which() //
<< " here: \"" << std::string(x.where(), last) //
<< "\"" << std::endl;
return x3::error_handler_result::fail;
}
};
x3::rule<struct identifier_class, ast::identifier> const identifier{"identifier"};
x3::rule<struct xlreference, ast::xlreference> const xlreference{"xlreference"};
x3::rule<struct xlfunction_class, ast::xlfunction> const xlfunction{"xlfunction"};
x3::rule<struct factor_class, ast::expression> const factor{"factor"};
x3::rule<struct expression_class, ast::binary_op> const expression{"expression"};
x3::rule<struct term_class, ast::binary_op> const term{"term"};
auto const xlreference_def =
x3::lexeme[+x3::alpha >> x3::uint_] /*>> !x3::char_('(')*/;
auto const identifier_def =
x3::raw[x3::lexeme[x3::alpha >> *(x3::alpha | '_') /*>> !x3::digit*/]];
auto const string_literal =
x3::rule<struct _, ast::string_literal>{"string_literal"} //
= x3::lexeme['"' > *('\\' >> x3::char_ | ~x3::char_('"')) > '"'];
auto const factor_def = //
xlfunction //
| '(' >> expression >> ')' //
| x3::double_ //
| string_literal //
| xlreference //
| identifier //
;
auto const term_def = factor >> *(x3::char_("*/") >> factor);
auto const expression_def = term >> *(x3::char_("-+") >> term);
auto const xlfunction_def = identifier >> '(' >> -(expression % ',') >> ')';
BOOST_SPIRIT_DEFINE(xlreference)
BOOST_SPIRIT_DEFINE(identifier)
BOOST_SPIRIT_DEFINE(xlfunction)
BOOST_SPIRIT_DEFINE(term)
BOOST_SPIRIT_DEFINE(factor)
BOOST_SPIRIT_DEFINE(expression)
} // namespace client::calculator_grammar
int main() {
std::vector<int> positions; // TODO
auto parser = x3::with<struct position_cache_tag /*TODO*/> //
(std::ref(positions)) //
[ //
x3::skip(x3::space)[ //
client::calculator_grammar::expression >> x3::eoi //
] //
];
struct {
std::string category;
std::vector<std::string> cases;
} test_table[] = {
{
"xlreference",
{"A1", "A1111", "AbCdZ9876543", "i9", "i0"},
},
{
"identifier",
{"i", "id", "id_entifier"},
},
{
"number",
{"123", "inf", "-inf", "NaN", ".99e34", "1e-8", "1.e-8", "+9"},
},
{
"binaries",
{ //
"3+4", "3*4", //
"3+4+5", "3*4*5", "3+4*5", "3*4+5", "3*4+5", //
"3+(4+5)", "3*(4*5)", "3+(4*5)", "3*(4+5)", "3*(4+5)", //
"(3+4)+5", "(3*4)*5", "(3+4)*5", "(3*4)+5", "(3*4)+5"},
},
{
"xlfunction",
{
"pi()",
"sin(4)",
R"--(IIF(A1, "Red", "Green"))--",
},
},
{
"invalid",
{
"A9()", // an xlreference may not be followed by ()
"", // you didn't specify
},
},
{
"other",
{
"A-9", // 1-letter identifier and binary operation
"1 + +9", // unary plus accepted in number rule
},
},
{
"question",
{
"myfunc(myparam1, myparam2)",
"A1",
"AA234",
"A1 + sin(A2+3)",
},
},
};
for (auto& [cat, cases] : test_table) {
for (std::string const& str : cases) {
auto iter = begin(str), last(end(str));
std::cout << std::setw(12) << cat << ": ";
client::ast::program ast;
if (parse(iter, last, parser, ast)) {
std::cout << "parsed: " << ast;
} else {
std::cout << "failed: " << std::quoted(str);
}
if (iter == last) {
std::cout << "\n";
} else {
std::cout << " unparsed: "
<< std::quoted(std::string_view(iter, last)) << "\n";
}
}
}
}
Prints
xlreference: parsed: (((A 1)))
xlreference: parsed: (((A 1111)))
xlreference: parsed: (((AbCdZ 9876543)))
xlreference: parsed: (((i 9)))
xlreference: parsed: (((i 0)))
identifier: parsed: ((i))
identifier: parsed: ((id))
identifier: parsed: ((id_entifier))
number: parsed: ((123))
number: parsed: ((inf))
number: parsed: ((-inf))
number: parsed: ((nan))
number: parsed: ((9.9e+33))
number: parsed: ((1e-08))
number: parsed: ((1e-08))
number: parsed: ((9))
binaries: parsed: ((3)+(4))
binaries: parsed: ((3*4))
binaries: parsed: ((3)+(4)+(5))
binaries: parsed: ((3*4*5))
binaries: parsed: ((3)+(4*5))
binaries: parsed: ((3*4)+(5))
binaries: parsed: ((3*4)+(5))
binaries: parsed: ((3)+(((4)+(5))))
binaries: parsed: ((3*((4*5))))
binaries: parsed: ((3)+(((4*5))))
binaries: parsed: ((3*((4)+(5))))
binaries: parsed: ((3*((4)+(5))))
binaries: parsed: ((((3)+(4)))+(5))
binaries: parsed: ((((3*4))*5))
binaries: parsed: ((((3)+(4))*5))
binaries: parsed: ((((3*4)))+(5))
binaries: parsed: ((((3*4)))+(5))
xlfunction: parsed: ((pi())
xlfunction: parsed: ((sin(((4))))
xlfunction: parsed: ((IIF((((A 1))), (("Red")), (("Green"))))
invalid: failed: "A9()" unparsed: "A9()"
invalid: failed: ""
other: parsed: ((A)-(9))
other: parsed: ((1)+(9))
question: parsed: ((myfunc((((myparam 1))), (((myparam 2)))))
question: parsed: (((A 1)))
question: parsed: (((AA 234)))
question: parsed: (((A 1))+(sin((((A 2))+(3))))
The only two lines marked failed are the expected ones.
Debugging?
Simply uncomment
#define BOOST_SPIRIT_X3_DEBUG
and get slammed with the extra noise:
question: <expression>
<try>A1 + sin(A2+3)</try>
<term>
<try>A1 + sin(A2+3)</try>
<factor>
<try>A1 + sin(A2+3)</try>
<xlfunction>
<try>A1 + sin(A2+3)</try>
<identifier>
<try>A1 + sin(A2+3)</try>
<success>1 + sin(A2+3)</success>
<attributes>[A]</attributes>
</identifier>
<fail/>
</xlfunction>
<string_literal>
<try>A1 + sin(A2+3)</try>
<fail/>
</string_literal>
<xlreference>
<try>A1 + sin(A2+3)</try>
<success> + sin(A2+3)</success>
<attributes>[[A], 1]</attributes>
</xlreference>
<success> + sin(A2+3)</success>
<attributes>[[A], 1]</attributes>
</factor>
<success> + sin(A2+3)</success>
<attributes>[[[A], 1], []]</attributes>
</term>
<term>
<try> sin(A2+3)</try>
<factor>
<try> sin(A2+3)</try>
<xlfunction>
<try> sin(A2+3)</try>
<identifier>
<try> sin(A2+3)</try>
<success>(A2+3)</success>
<attributes>[s, i, n]</attributes>
</identifier>
<expression>
<try>A2+3)</try>
<term>
<try>A2+3)</try>
<factor>
<try>A2+3)</try>
<xlfunction>
<try>A2+3)</try>
<identifier>
<try>A2+3)</try>
<success>2+3)</success>
<attributes>[A]</attributes>
</identifier>
<fail/>
</xlfunction>
<string_literal>
<try>A2+3)</try>
<fail/>
</string_literal>
<xlreference>
<try>A2+3)</try>
<success>+3)</success>
<attributes>[[A], 2]</attributes>
</xlreference>
<success>+3)</success>
<attributes>[[A], 2]</attributes>
</factor>
<success>+3)</success>
<attributes>[[[A], 2], []]</attributes>
</term>
<term>
<try>3)</try>
<factor>
<try>3)</try>
<xlfunction>
<try>3)</try>
<identifier>
<try>3)</try>
<fail/>
</identifier>
<fail/>
</xlfunction>
<success>)</success>
<attributes>3</attributes>
</factor>
<success>)</success>
<attributes>[3, []]</attributes>
</term>
<success>)</success>
<attributes>[[[[A], 2], []], [[+, [3, []]]]]</attributes>
</expression>
<success></success>
<attributes>[[s, i, n], [[[[[A], 2], []], [[+, [3, []]]]]]]</attributes>
</xlfunction>
<success></success>
<attributes>[[s, i, n], [[[[[A], 2], []], [[+, [3, []]]]]]]</attributes>
</factor>
<success></success>
<attributes>[[[s, i, n], [[[[[A], 2], []], [[+, [3, []]]]]]], []]</attributes>
</term>
<success></success>
<attributes>[[[[A], 1], []], [[+, [[[s, i, n], [[[[[A], 2], []], [[+, [3, []]]]]]], []]]]]</attributes>
</expression>
parsed: (((A 1))+(sin((((A 2))+(3))))