Spirit-Qi:如何编写非终端解析器?
Spirit-Qi: How can I write a nonterminal parser?
我想写一个可以使用的解析器(作为 qi 扩展)
via my_parser(p1, p2, ...)
其中 p1, p2, ...
是 qi 解析器表达式。
实际上,我想实现一个 best_match
解析器,它的工作方式类似于 qi 替代方案,但 select 不是第一个匹配规则,而是 'explains' 大部分输入的规则。
给定两个规则 simple_id = +(qi::alpha)
和 complex_id = simple_id >> *(qi::string("::") > simple_id)
它将 select complex_id
输入 willy::anton
。并且这样做不会产生中间属性。这些好处需要 运行 时间来支付,因为需要先行解析。
在我看来,这种解析器构造有多个用例。 benchmark(p1, ...)
作为优化的辅助解析器,举个例子。一旦我知道该怎么做,我会提供它。
这个解析器将是一个非终端。我尝试了(努力),但我无法在 qi 中找到问题的答案。我的猜测是,C++ 机制与 c qi 紧密集成,以至于没有可以理解的入口点,至少对我而言。
引入一个运算符就很容易实现我想要的东西。我在下面附加了当前的解决方案。它按预期编译但不完整。
qi 操作员得到一个融合::vector/sequence(无论什么)现成的并对其进行操作。似乎没有图书馆产品可以解决我的问题。甚至 make_nary_composite
已经期望将 args 编译为 Elements
.
我尝试了很多,但都失败了,所以我不想让你厌烦。
我可以想象的解决方法是提供一个状态运算符 ,
,这将使 p1, ...
成为合法的 qi 表达式。然后实现 unary_parser best_match
或指令来处理该表达式。
,
将获得两种模式:获取当前(成功的)解析器消耗的输入长度,并实际解析来自阶段 1 的 selected 一个。包装器首先是 运行 逗号解析器在模式 1 中,然后在模式 2 中。它可能很难看,但可以工作。
运算符实现 p1 |= p2 |= ...
在 n > 2 的 运行 时间方面是最昂贵的。我很乐意解决这个问题。
最小(但仍然合理)的开销为 best_match(p1, ...)
。
这可能吗?
因为我对 boost::fusion 没有太多经验,所以我也在代码中添加了一些问题(关于融合的方法)。
将实际上是 nary 非终端解析器的东西强加到 unary 解析器方案中,感觉不对。但是由于我的理解不足,这似乎是完成它的唯一方法。如果能帮助我摆脱困境,我将不胜感激。
我的环境:Boost 1.61,MSVC 2015 Upd 2,目标 win32console。
进度:
我想,我正在缓慢但肯定地得到它。我很高兴看到 error_invalid_expression
。由于以下原型表达式,编译失败:
boost::proto::exprns_::expr<
boost::proto::tagns_::tag::function,boost::proto::argsns_::list5<
const boost::proto::exprns_::expr<
boost::proto::tagns_::tag::terminal,boost::proto::argsns_::term<
mxc::qitoo::tag::best_match
>
,0> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,boost::spirit::unused_type> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
, const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
>
,5>
这实际上是描述我的 best_match 解析器的函数式用法的表达式。我的测试规则看起来像
qitoo::best_match(id, qualified_id, id, qualified_id)
其中 id
和 qualified_id
是上面提到的规则。我想要的一切。
发生错误是因为此表达式没有成员 parse
。好的。深入挖掘,我发现 qi 的元编译器确实支持一元、二进制和 nary(即可变参数)解析器。但它不支持函数式语法。 Qi 似乎只对运算符使用一元、二元和三元。而且虽然支持 nary 类型,但它或多或少已经过时了,因为 qi operators are binary max.
结束进度编辑
#include <boost/spirit/home/qi.hpp>
namespace boost {
namespace spirit
{
///////////////////////////////////////////////////////////////////////////
// Enablers
///////////////////////////////////////////////////////////////////////////
template <>
struct use_operator<qi::domain, proto::tag::bitwise_or_assign> // enables |=
: mpl::true_ {};
template <>
struct flatten_tree<qi::domain, proto::tag::bitwise_or_assign> // flattens |=
: mpl::true_ {};
}
}
namespace mxc {
namespace qitoo {
namespace spirit = boost::spirit;
namespace qi = spirit::qi;
namespace fusion = boost::fusion;
template <typename Elements>
struct best_match_parser : qi::nary_parser<mxc::qitoo::best_match_parser<Elements>>
{
// This one does a lookahead parse of each qi expression to find out which rule matches
// most of the input
template <typename Iterator, typename Context, typename Skipper>
struct best_function
{
best_function(Iterator& first, Iterator const& last, Context& context,
Skipper const& skipper, int& best, int& idx, int& size)
: first(first), last(last), context(context), skipper(skipper), best(best),
idx(idx), size(size) {};
template <typename Component>
void operator()(Component& component) const
{
Iterator f = first;
if (component.parse(f, last, context, skipper, qi::unused)) {
// please have a look on how I transport best, idx and size
// into the parser context
//
// I guess, this is a nasty hack and could be done better
// Any advice would be highliy appreciated
int l = f - first;
if (l > size) {
size = l;
best = idx++;
}
idx++;
}
}
private:
int& best;
int& idx;
int& size;
Iterator& first;
Iterator const& last;
Context& context;
Skipper const& skipper;
};
template <typename Context, typename Iterator>
struct attribute
{
// Put all the element attributes in a tuple
typedef typename spirit::traits::build_attribute_sequence <
Elements, Context, spirit::traits::alternative_attribute_transform
, Iterator, qi::domain
>::type all_attributes;
// Ok, now make a variant over the attribute sequence. Note that
// build_variant makes sure that 1) all attributes in the variant
// are unique 2) puts the unused attribute, if there is any, to
// the front and 3) collapses single element variants, variant<T>
// to T.
typedef typename
spirit::traits::build_variant<all_attributes>::type
type;
};
best_match_parser(Elements const& elements_) : elements(elements_), size(0), idx(0), best(-1) {}
template <typename Iterator, typename Context, typename Skipper, typename Attribute>
bool parse(Iterator& first, Iterator const& last
, Context& context, Skipper const& skipper
, Attribute& attr_) const
{
best_function<Iterator, Context, Skipper> f(first, last, context, skipper, best, idx, size);
// find out which parser matches most of the input
fusion::for_each(elements, f);
// best >= 0 if at least one parser was successful
if (best >= 0) {
// now that I have the best parser identified, how do I access it?
// I understand that the next line can not work, but I'm looking for something like that
// --> auto v = fusion::at<boost::mpl::int_<best>>(elements);
};
return false;
}
template <typename Context>
qi::info what(Context& context) const
{
qi::info result("best_match");
fusion::for_each(elements,
spirit::detail::what_function<Context>(result, context));
return result;
}
Elements elements;
mutable int best;
mutable int idx;
mutable int size;
};
}
}
namespace boost {
namespace spirit {
namespace qi {
///////////////////////////////////////////////////////////////////////////
// Parser generators: make_xxx function (objects)
///////////////////////////////////////////////////////////////////////////
template <typename Elements, typename Modifiers>
struct make_composite<proto::tag::bitwise_or_assign, Elements, Modifiers>
: make_nary_composite < Elements, mxc::qitoo::best_match_parser >
{};
}
namespace traits {
///////////////////////////////////////////////////////////////////////////
template <typename Elements>
struct has_semantic_action<mxc::qitoo::best_match_parser<Elements> >
: nary_has_semantic_action<Elements> {};
///////////////////////////////////////////////////////////////////////////
template <typename Elements, typename Attribute, typename Context
, typename Iterator>
struct handles_container<mxc::qitoo::best_match_parser<Elements>, Attribute, Context
, Iterator>
: nary_handles_container<Elements, Attribute, Context, Iterator> {};
}
}
}
namespace qi = boost::spirit::qi;
namespace qitoo = mxc::qitoo;
using iterator_type = std::string::const_iterator;
using result_type = std::string;
template<typename Parser>
void parse(const std::string message, const std::string& input, const std::string& rule, const Parser& parser) {
iterator_type iter = input.begin(), end = input.end();
std::vector<result_type> parsed_result;
std::cout << "-------------------------\n";
std::cout << message << "\n";
std::cout << "Rule: " << rule << std::endl;
std::cout << "Parsing: \"" << input << "\"\n";
bool result = qi::phrase_parse(iter, end, parser, qi::space, parsed_result);
if (result)
{
std::cout << "Parser succeeded.\n";
std::cout << "Parsed " << parsed_result.size() << " elements:";
for (const auto& str : parsed_result)
std::cout << "[" << str << "]";
std::cout << std::endl;
}
else
{
std::cout << "Parser failed" << std::endl;
}
if (iter == end) {
std::cout << "EOI reached." << std::endl;
}
else {
std::cout << "EOI not reached. Unparsed: \"" << std::string(iter, end) << "\"" << std::endl;
}
std::cout << "-------------------------\n";
}
int main()
{
namespace qi = boost::spirit::qi;
qi::rule < iterator_type, std::string(), qi::space_type>
id = (qi::alpha | qi::char_('_')) >> *(qi::alnum | qi::char_('_'));
qi::rule < iterator_type, std::string(), qi::space_type>
qualified_id = id >> *(qi::string("::") > id);
namespace qitoo = mxc::qitoo;
namespace qi = boost::spirit::qi;
parse("best match operator, select second rule"
, "willy::anton::lutz"
, "id |= qualified_id"
, id |= qualified_id);
}
你的例子
我看不出您的样本如何需要这些。只需重新排序您的分支,然后意识到短分支只是 n=1 的合格案例的特例:Live On Coliru¹ (or using X3 version(如果您愿意)。
一般情况
现在,提到 x3,它有能力让您的生活更轻松!
在一般情况下,这是我认为您想要的:
namespace parser {
template <typename... Parsers>
struct longest_parser : x3::parser_base {
longest_parser(Parsers... sub) : _alternatives {sub...} { }
template <typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr) const {
auto const saved = f;
//// To exclude pre-skip from length comparisons, do pre-skip here:
// x3::skip_over(f, l, ctx);
auto seq = std::make_index_sequence<sizeof...(Parsers)>();
auto best = select_best(f, l, ctx, seq);
//std::cout << "Longest match at index #" << best << "\n";
bool ok = dispatch(f, l, ctx, other, attr, best, seq);
if (!ok)
f = saved;
return ok;
}
private:
template <typename It, typename Ctx, typename P>
size_t length_of(It f, It l, Ctx ctx, P const& p) const {
boost::iterator_range<It> matched;
return x3::raw[p].parse(f, l, ctx, x3::unused, matched)? boost::size(matched) : 0;
}
template <typename It, typename Ctx, size_t... I>
size_t select_best(It f, It l, Ctx& ctx, std::index_sequence<I...>) const {
std::array<size_t, sizeof...(I)> lengths { length_of(f, l, ctx, std::get<I>(_alternatives))... };
return std::distance(lengths.begin(), std::max_element(lengths.begin(), lengths.end()));
}
template <typename It, typename Ctx, typename Other, typename Attr, size_t... I>
bool dispatch(It& f, It l, Ctx& ctx, Other const& other, Attr& attr, size_t targetIdx, std::index_sequence<I...>) const {
//return (real_parse<I>(f, l, ctx, other, attr, targetIdx) || ...);
std::array<bool, sizeof...(I)> b = { real_parse<I>(f, l, ctx, other, attr, targetIdx)... };
return std::accumulate(b.begin(), b.end(), false, std::logical_or<bool>());
}
template <size_t Idx, typename It, typename Ctx, typename Other, typename Attr>
bool real_parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr, size_t targetIdx) const {
if (targetIdx != Idx)
return false;
return std::get<Idx>(_alternatives).parse(f, l, ctx, other, attr);
}
std::tuple<Parsers...> _alternatives;
};
template <typename... Ps>
longest_parser<Ps...> longest(Ps... p) { return {x3::as_parser(p)...}; }
}
如果您的编译器支持,请注意您可以在 dispatch
中使用的折叠表达式(Coliru 支持,edit it to see!)。
Note also the subtle choice regarding skippable (probably whitespace); if it's not significant for the length comparisons, uncomment the pre-skip.
现场演示
#include <boost/spirit/home/x3.hpp>
#include <type_traits>
#include <iostream>
#include <numeric>
namespace x3 = boost::spirit::x3;
namespace std {
template <typename T> // just for easy debug printing; hack
static std::ostream& operator<<(std::ostream& os, std::vector<T> const& v) {
for (auto& el : v) std::cout << '[' << el << ']';
return os;
}
}
using string_vec = std::vector<std::string>;
using result_type = boost::variant<std::string, double, string_vec>;
template <typename Parser>
void parse(const std::string message, const std::string &input, const std::string &rule, const Parser &parser) {
auto iter = input.begin(), end = input.end();
std::cout << "-------------------------\n";
std::cout << message << "\n";
std::cout << "Rule: " << rule << "\n";
std::cout << "Parsing: '" << input << "'\n";
result_type parsed_result;
bool result = phrase_parse(iter, end, parser, x3::space, parsed_result);
if (result) {
std::cout << "Parsed " << parsed_result << "\n";
} else {
std::cout << "Parser failed\n";
}
if (iter != end)
std::cout << "EOI not reached. Unparsed: '" << std::string(iter, end) << "'\n";
}
namespace parser {
template <typename... Parsers>
struct longest_parser : x3::parser_base {
longest_parser(Parsers... sub) : _alternatives {sub...} { }
template <typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr) const {
auto const saved = f;
//// To exclude pre-skip from length comparisons, do pre-skip here:
// x3::skip_over(f, l, ctx);
auto seq = std::make_index_sequence<sizeof...(Parsers)>();
auto best = select_best(f, l, ctx, seq);
//std::cout << "Longest match at index #" << best << "\n";
bool ok = dispatch(f, l, ctx, other, attr, best, seq);
if (!ok)
f = saved;
return ok;
}
private:
template <typename It, typename Ctx, typename P>
size_t length_of(It f, It l, Ctx ctx, P const& p) const {
boost::iterator_range<It> matched;
return x3::raw[p].parse(f, l, ctx, x3::unused, matched)? boost::size(matched) : 0;
}
template <typename It, typename Ctx, size_t... I>
size_t select_best(It f, It l, Ctx& ctx, std::index_sequence<I...>) const {
std::array<size_t, sizeof...(I)> lengths { length_of(f, l, ctx, std::get<I>(_alternatives))... };
return std::distance(lengths.begin(), std::max_element(lengths.begin(), lengths.end()));
}
template <typename It, typename Ctx, typename Other, typename Attr, size_t... I>
bool dispatch(It& f, It l, Ctx& ctx, Other const& other, Attr& attr, size_t targetIdx, std::index_sequence<I...>) const {
//return (real_parse<I>(f, l, ctx, other, attr, targetIdx) || ...);
std::array<bool, sizeof...(I)> b = { real_parse<I>(f, l, ctx, other, attr, targetIdx)... };
return std::accumulate(b.begin(), b.end(), false, std::logical_or<bool>());
}
template <size_t Idx, typename It, typename Ctx, typename Other, typename Attr>
bool real_parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr, size_t targetIdx) const {
if (targetIdx != Idx)
return false;
return std::get<Idx>(_alternatives).parse(f, l, ctx, other, attr);
}
std::tuple<Parsers...> _alternatives;
};
template <typename... Ps>
longest_parser<Ps...> longest(Ps... p) { return {x3::as_parser(p)...}; }
}
int main() {
auto id = x3::rule<void, std::string> {} = x3::lexeme [ x3::char_("a-zA-Z_") >> *x3::char_("a-zA-Z0-9_") ];
auto qualified = x3::rule<void, string_vec> {} = id % "::";
#define TEST_CASE(label, input, rule) parse(label, input, #rule, rule)
TEST_CASE("unqualified" , "willy" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("unqualified with whitespace", " willy \t" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("integral or number" , "123.78::anton::lutz" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("qualified" , "willy anton::lutz" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("qualified with whitespace" , "willy \tanton::lutz" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("unqualified" , "willy" , parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("unqualified with whitespace", " willy \t" , parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("integral or number" , "123.78::anton::lutz" , parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("qualified" , "willy::anton::lutz" , parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("qualified with whitespace" , "willy ::\tanton::lutz", parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("unqualified" , "willy" , parser::longest(x3::int_, x3::double_, qualified));
TEST_CASE("unqualified with whitespace", " willy \t" , parser::longest(x3::int_, x3::double_, qualified));
TEST_CASE("integral or number" , "123.78::anton::lutz" , parser::longest(x3::int_, x3::double_, qualified));
TEST_CASE("qualified" , "willy::anton::lutz" , parser::longest(x3::int_, x3::double_, qualified));
TEST_CASE("qualified with whitespace" , "willy ::\tanton::lutz", parser::longest(x3::int_, x3::double_, qualified));
}
版画
-------------------------
unqualified
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: 'willy'
Parsed willy
-------------------------
unqualified with whitespace
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: ' willy '
Parsed willy
-------------------------
integral or number
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: '123.78::anton::lutz'
Parsed 123.78
EOI not reached. Unparsed: '::anton::lutz'
-------------------------
qualified
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: 'willy anton::lutz'
Parsed willy
EOI not reached. Unparsed: 'anton::lutz'
-------------------------
qualified with whitespace
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: 'willy anton::lutz'
Parsed willy
EOI not reached. Unparsed: 'anton::lutz'
-------------------------
unqualified
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: 'willy'
Parsed willy
-------------------------
unqualified with whitespace
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: ' willy '
Parsed willy
-------------------------
integral or number
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: '123.78::anton::lutz'
Parsed 123.78
EOI not reached. Unparsed: '::anton::lutz'
-------------------------
qualified
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: 'willy::anton::lutz'
Parsed [willy][anton][lutz]
-------------------------
qualified with whitespace
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: 'willy :: anton::lutz'
Parsed [willy][anton][lutz]
-------------------------
unqualified
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: 'willy'
Parsed [willy]
-------------------------
unqualified with whitespace
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: ' willy '
Parsed [willy]
-------------------------
integral or number
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: '123.78::anton::lutz'
Parsed 123.78
EOI not reached. Unparsed: '::anton::lutz'
-------------------------
qualified
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: 'willy::anton::lutz'
Parsed [willy][anton][lutz]
-------------------------
qualified with whitespace
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: 'willy :: anton::lutz'
Parsed [willy][anton][lutz]
请注意不同的结果取决于备选方案中的解析器表达式。
我想写一个可以使用的解析器(作为 qi 扩展)
via my_parser(p1, p2, ...)
其中 p1, p2, ...
是 qi 解析器表达式。
实际上,我想实现一个 best_match
解析器,它的工作方式类似于 qi 替代方案,但 select 不是第一个匹配规则,而是 'explains' 大部分输入的规则。
给定两个规则 simple_id = +(qi::alpha)
和 complex_id = simple_id >> *(qi::string("::") > simple_id)
它将 select complex_id
输入 willy::anton
。并且这样做不会产生中间属性。这些好处需要 运行 时间来支付,因为需要先行解析。
在我看来,这种解析器构造有多个用例。 benchmark(p1, ...)
作为优化的辅助解析器,举个例子。一旦我知道该怎么做,我会提供它。
这个解析器将是一个非终端。我尝试了(努力),但我无法在 qi 中找到问题的答案。我的猜测是,C++ 机制与 c qi 紧密集成,以至于没有可以理解的入口点,至少对我而言。
引入一个运算符就很容易实现我想要的东西。我在下面附加了当前的解决方案。它按预期编译但不完整。
qi 操作员得到一个融合::vector/sequence(无论什么)现成的并对其进行操作。似乎没有图书馆产品可以解决我的问题。甚至 make_nary_composite
已经期望将 args 编译为 Elements
.
我尝试了很多,但都失败了,所以我不想让你厌烦。
我可以想象的解决方法是提供一个状态运算符 ,
,这将使 p1, ...
成为合法的 qi 表达式。然后实现 unary_parser best_match
或指令来处理该表达式。
,
将获得两种模式:获取当前(成功的)解析器消耗的输入长度,并实际解析来自阶段 1 的 selected 一个。包装器首先是 运行 逗号解析器在模式 1 中,然后在模式 2 中。它可能很难看,但可以工作。
运算符实现 p1 |= p2 |= ...
在 n > 2 的 运行 时间方面是最昂贵的。我很乐意解决这个问题。
最小(但仍然合理)的开销为 best_match(p1, ...)
。
这可能吗?
因为我对 boost::fusion 没有太多经验,所以我也在代码中添加了一些问题(关于融合的方法)。
将实际上是 nary 非终端解析器的东西强加到 unary 解析器方案中,感觉不对。但是由于我的理解不足,这似乎是完成它的唯一方法。如果能帮助我摆脱困境,我将不胜感激。
我的环境:Boost 1.61,MSVC 2015 Upd 2,目标 win32console。
进度:
我想,我正在缓慢但肯定地得到它。我很高兴看到 error_invalid_expression
。由于以下原型表达式,编译失败:
boost::proto::exprns_::expr<
boost::proto::tagns_::tag::function,boost::proto::argsns_::list5<
const boost::proto::exprns_::expr<
boost::proto::tagns_::tag::terminal,boost::proto::argsns_::term<
mxc::qitoo::tag::best_match
>
,0> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,boost::spirit::unused_type> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
, const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
>
,5>
这实际上是描述我的 best_match 解析器的函数式用法的表达式。我的测试规则看起来像
qitoo::best_match(id, qualified_id, id, qualified_id)
其中 id
和 qualified_id
是上面提到的规则。我想要的一切。
发生错误是因为此表达式没有成员 parse
。好的。深入挖掘,我发现 qi 的元编译器确实支持一元、二进制和 nary(即可变参数)解析器。但它不支持函数式语法。 Qi 似乎只对运算符使用一元、二元和三元。而且虽然支持 nary 类型,但它或多或少已经过时了,因为 qi operators are binary max.
结束进度编辑
#include <boost/spirit/home/qi.hpp>
namespace boost {
namespace spirit
{
///////////////////////////////////////////////////////////////////////////
// Enablers
///////////////////////////////////////////////////////////////////////////
template <>
struct use_operator<qi::domain, proto::tag::bitwise_or_assign> // enables |=
: mpl::true_ {};
template <>
struct flatten_tree<qi::domain, proto::tag::bitwise_or_assign> // flattens |=
: mpl::true_ {};
}
}
namespace mxc {
namespace qitoo {
namespace spirit = boost::spirit;
namespace qi = spirit::qi;
namespace fusion = boost::fusion;
template <typename Elements>
struct best_match_parser : qi::nary_parser<mxc::qitoo::best_match_parser<Elements>>
{
// This one does a lookahead parse of each qi expression to find out which rule matches
// most of the input
template <typename Iterator, typename Context, typename Skipper>
struct best_function
{
best_function(Iterator& first, Iterator const& last, Context& context,
Skipper const& skipper, int& best, int& idx, int& size)
: first(first), last(last), context(context), skipper(skipper), best(best),
idx(idx), size(size) {};
template <typename Component>
void operator()(Component& component) const
{
Iterator f = first;
if (component.parse(f, last, context, skipper, qi::unused)) {
// please have a look on how I transport best, idx and size
// into the parser context
//
// I guess, this is a nasty hack and could be done better
// Any advice would be highliy appreciated
int l = f - first;
if (l > size) {
size = l;
best = idx++;
}
idx++;
}
}
private:
int& best;
int& idx;
int& size;
Iterator& first;
Iterator const& last;
Context& context;
Skipper const& skipper;
};
template <typename Context, typename Iterator>
struct attribute
{
// Put all the element attributes in a tuple
typedef typename spirit::traits::build_attribute_sequence <
Elements, Context, spirit::traits::alternative_attribute_transform
, Iterator, qi::domain
>::type all_attributes;
// Ok, now make a variant over the attribute sequence. Note that
// build_variant makes sure that 1) all attributes in the variant
// are unique 2) puts the unused attribute, if there is any, to
// the front and 3) collapses single element variants, variant<T>
// to T.
typedef typename
spirit::traits::build_variant<all_attributes>::type
type;
};
best_match_parser(Elements const& elements_) : elements(elements_), size(0), idx(0), best(-1) {}
template <typename Iterator, typename Context, typename Skipper, typename Attribute>
bool parse(Iterator& first, Iterator const& last
, Context& context, Skipper const& skipper
, Attribute& attr_) const
{
best_function<Iterator, Context, Skipper> f(first, last, context, skipper, best, idx, size);
// find out which parser matches most of the input
fusion::for_each(elements, f);
// best >= 0 if at least one parser was successful
if (best >= 0) {
// now that I have the best parser identified, how do I access it?
// I understand that the next line can not work, but I'm looking for something like that
// --> auto v = fusion::at<boost::mpl::int_<best>>(elements);
};
return false;
}
template <typename Context>
qi::info what(Context& context) const
{
qi::info result("best_match");
fusion::for_each(elements,
spirit::detail::what_function<Context>(result, context));
return result;
}
Elements elements;
mutable int best;
mutable int idx;
mutable int size;
};
}
}
namespace boost {
namespace spirit {
namespace qi {
///////////////////////////////////////////////////////////////////////////
// Parser generators: make_xxx function (objects)
///////////////////////////////////////////////////////////////////////////
template <typename Elements, typename Modifiers>
struct make_composite<proto::tag::bitwise_or_assign, Elements, Modifiers>
: make_nary_composite < Elements, mxc::qitoo::best_match_parser >
{};
}
namespace traits {
///////////////////////////////////////////////////////////////////////////
template <typename Elements>
struct has_semantic_action<mxc::qitoo::best_match_parser<Elements> >
: nary_has_semantic_action<Elements> {};
///////////////////////////////////////////////////////////////////////////
template <typename Elements, typename Attribute, typename Context
, typename Iterator>
struct handles_container<mxc::qitoo::best_match_parser<Elements>, Attribute, Context
, Iterator>
: nary_handles_container<Elements, Attribute, Context, Iterator> {};
}
}
}
namespace qi = boost::spirit::qi;
namespace qitoo = mxc::qitoo;
using iterator_type = std::string::const_iterator;
using result_type = std::string;
template<typename Parser>
void parse(const std::string message, const std::string& input, const std::string& rule, const Parser& parser) {
iterator_type iter = input.begin(), end = input.end();
std::vector<result_type> parsed_result;
std::cout << "-------------------------\n";
std::cout << message << "\n";
std::cout << "Rule: " << rule << std::endl;
std::cout << "Parsing: \"" << input << "\"\n";
bool result = qi::phrase_parse(iter, end, parser, qi::space, parsed_result);
if (result)
{
std::cout << "Parser succeeded.\n";
std::cout << "Parsed " << parsed_result.size() << " elements:";
for (const auto& str : parsed_result)
std::cout << "[" << str << "]";
std::cout << std::endl;
}
else
{
std::cout << "Parser failed" << std::endl;
}
if (iter == end) {
std::cout << "EOI reached." << std::endl;
}
else {
std::cout << "EOI not reached. Unparsed: \"" << std::string(iter, end) << "\"" << std::endl;
}
std::cout << "-------------------------\n";
}
int main()
{
namespace qi = boost::spirit::qi;
qi::rule < iterator_type, std::string(), qi::space_type>
id = (qi::alpha | qi::char_('_')) >> *(qi::alnum | qi::char_('_'));
qi::rule < iterator_type, std::string(), qi::space_type>
qualified_id = id >> *(qi::string("::") > id);
namespace qitoo = mxc::qitoo;
namespace qi = boost::spirit::qi;
parse("best match operator, select second rule"
, "willy::anton::lutz"
, "id |= qualified_id"
, id |= qualified_id);
}
你的例子
我看不出您的样本如何需要这些。只需重新排序您的分支,然后意识到短分支只是 n=1 的合格案例的特例:Live On Coliru¹ (or using X3 version(如果您愿意)。
一般情况
现在,提到 x3,它有能力让您的生活更轻松!
在一般情况下,这是我认为您想要的:
namespace parser {
template <typename... Parsers>
struct longest_parser : x3::parser_base {
longest_parser(Parsers... sub) : _alternatives {sub...} { }
template <typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr) const {
auto const saved = f;
//// To exclude pre-skip from length comparisons, do pre-skip here:
// x3::skip_over(f, l, ctx);
auto seq = std::make_index_sequence<sizeof...(Parsers)>();
auto best = select_best(f, l, ctx, seq);
//std::cout << "Longest match at index #" << best << "\n";
bool ok = dispatch(f, l, ctx, other, attr, best, seq);
if (!ok)
f = saved;
return ok;
}
private:
template <typename It, typename Ctx, typename P>
size_t length_of(It f, It l, Ctx ctx, P const& p) const {
boost::iterator_range<It> matched;
return x3::raw[p].parse(f, l, ctx, x3::unused, matched)? boost::size(matched) : 0;
}
template <typename It, typename Ctx, size_t... I>
size_t select_best(It f, It l, Ctx& ctx, std::index_sequence<I...>) const {
std::array<size_t, sizeof...(I)> lengths { length_of(f, l, ctx, std::get<I>(_alternatives))... };
return std::distance(lengths.begin(), std::max_element(lengths.begin(), lengths.end()));
}
template <typename It, typename Ctx, typename Other, typename Attr, size_t... I>
bool dispatch(It& f, It l, Ctx& ctx, Other const& other, Attr& attr, size_t targetIdx, std::index_sequence<I...>) const {
//return (real_parse<I>(f, l, ctx, other, attr, targetIdx) || ...);
std::array<bool, sizeof...(I)> b = { real_parse<I>(f, l, ctx, other, attr, targetIdx)... };
return std::accumulate(b.begin(), b.end(), false, std::logical_or<bool>());
}
template <size_t Idx, typename It, typename Ctx, typename Other, typename Attr>
bool real_parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr, size_t targetIdx) const {
if (targetIdx != Idx)
return false;
return std::get<Idx>(_alternatives).parse(f, l, ctx, other, attr);
}
std::tuple<Parsers...> _alternatives;
};
template <typename... Ps>
longest_parser<Ps...> longest(Ps... p) { return {x3::as_parser(p)...}; }
}
如果您的编译器支持,请注意您可以在 dispatch
中使用的折叠表达式(Coliru 支持,edit it to see!)。
Note also the subtle choice regarding skippable (probably whitespace); if it's not significant for the length comparisons, uncomment the pre-skip.
现场演示
#include <boost/spirit/home/x3.hpp>
#include <type_traits>
#include <iostream>
#include <numeric>
namespace x3 = boost::spirit::x3;
namespace std {
template <typename T> // just for easy debug printing; hack
static std::ostream& operator<<(std::ostream& os, std::vector<T> const& v) {
for (auto& el : v) std::cout << '[' << el << ']';
return os;
}
}
using string_vec = std::vector<std::string>;
using result_type = boost::variant<std::string, double, string_vec>;
template <typename Parser>
void parse(const std::string message, const std::string &input, const std::string &rule, const Parser &parser) {
auto iter = input.begin(), end = input.end();
std::cout << "-------------------------\n";
std::cout << message << "\n";
std::cout << "Rule: " << rule << "\n";
std::cout << "Parsing: '" << input << "'\n";
result_type parsed_result;
bool result = phrase_parse(iter, end, parser, x3::space, parsed_result);
if (result) {
std::cout << "Parsed " << parsed_result << "\n";
} else {
std::cout << "Parser failed\n";
}
if (iter != end)
std::cout << "EOI not reached. Unparsed: '" << std::string(iter, end) << "'\n";
}
namespace parser {
template <typename... Parsers>
struct longest_parser : x3::parser_base {
longest_parser(Parsers... sub) : _alternatives {sub...} { }
template <typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr) const {
auto const saved = f;
//// To exclude pre-skip from length comparisons, do pre-skip here:
// x3::skip_over(f, l, ctx);
auto seq = std::make_index_sequence<sizeof...(Parsers)>();
auto best = select_best(f, l, ctx, seq);
//std::cout << "Longest match at index #" << best << "\n";
bool ok = dispatch(f, l, ctx, other, attr, best, seq);
if (!ok)
f = saved;
return ok;
}
private:
template <typename It, typename Ctx, typename P>
size_t length_of(It f, It l, Ctx ctx, P const& p) const {
boost::iterator_range<It> matched;
return x3::raw[p].parse(f, l, ctx, x3::unused, matched)? boost::size(matched) : 0;
}
template <typename It, typename Ctx, size_t... I>
size_t select_best(It f, It l, Ctx& ctx, std::index_sequence<I...>) const {
std::array<size_t, sizeof...(I)> lengths { length_of(f, l, ctx, std::get<I>(_alternatives))... };
return std::distance(lengths.begin(), std::max_element(lengths.begin(), lengths.end()));
}
template <typename It, typename Ctx, typename Other, typename Attr, size_t... I>
bool dispatch(It& f, It l, Ctx& ctx, Other const& other, Attr& attr, size_t targetIdx, std::index_sequence<I...>) const {
//return (real_parse<I>(f, l, ctx, other, attr, targetIdx) || ...);
std::array<bool, sizeof...(I)> b = { real_parse<I>(f, l, ctx, other, attr, targetIdx)... };
return std::accumulate(b.begin(), b.end(), false, std::logical_or<bool>());
}
template <size_t Idx, typename It, typename Ctx, typename Other, typename Attr>
bool real_parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr, size_t targetIdx) const {
if (targetIdx != Idx)
return false;
return std::get<Idx>(_alternatives).parse(f, l, ctx, other, attr);
}
std::tuple<Parsers...> _alternatives;
};
template <typename... Ps>
longest_parser<Ps...> longest(Ps... p) { return {x3::as_parser(p)...}; }
}
int main() {
auto id = x3::rule<void, std::string> {} = x3::lexeme [ x3::char_("a-zA-Z_") >> *x3::char_("a-zA-Z0-9_") ];
auto qualified = x3::rule<void, string_vec> {} = id % "::";
#define TEST_CASE(label, input, rule) parse(label, input, #rule, rule)
TEST_CASE("unqualified" , "willy" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("unqualified with whitespace", " willy \t" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("integral or number" , "123.78::anton::lutz" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("qualified" , "willy anton::lutz" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("qualified with whitespace" , "willy \tanton::lutz" , parser::longest(id, x3::int_, x3::double_));
TEST_CASE("unqualified" , "willy" , parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("unqualified with whitespace", " willy \t" , parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("integral or number" , "123.78::anton::lutz" , parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("qualified" , "willy::anton::lutz" , parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("qualified with whitespace" , "willy ::\tanton::lutz", parser::longest(id, x3::int_, x3::double_, qualified));
TEST_CASE("unqualified" , "willy" , parser::longest(x3::int_, x3::double_, qualified));
TEST_CASE("unqualified with whitespace", " willy \t" , parser::longest(x3::int_, x3::double_, qualified));
TEST_CASE("integral or number" , "123.78::anton::lutz" , parser::longest(x3::int_, x3::double_, qualified));
TEST_CASE("qualified" , "willy::anton::lutz" , parser::longest(x3::int_, x3::double_, qualified));
TEST_CASE("qualified with whitespace" , "willy ::\tanton::lutz", parser::longest(x3::int_, x3::double_, qualified));
}
版画
-------------------------
unqualified
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: 'willy'
Parsed willy
-------------------------
unqualified with whitespace
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: ' willy '
Parsed willy
-------------------------
integral or number
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: '123.78::anton::lutz'
Parsed 123.78
EOI not reached. Unparsed: '::anton::lutz'
-------------------------
qualified
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: 'willy anton::lutz'
Parsed willy
EOI not reached. Unparsed: 'anton::lutz'
-------------------------
qualified with whitespace
Rule: parser::longest(id, x3::int_, x3::double_)
Parsing: 'willy anton::lutz'
Parsed willy
EOI not reached. Unparsed: 'anton::lutz'
-------------------------
unqualified
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: 'willy'
Parsed willy
-------------------------
unqualified with whitespace
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: ' willy '
Parsed willy
-------------------------
integral or number
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: '123.78::anton::lutz'
Parsed 123.78
EOI not reached. Unparsed: '::anton::lutz'
-------------------------
qualified
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: 'willy::anton::lutz'
Parsed [willy][anton][lutz]
-------------------------
qualified with whitespace
Rule: parser::longest(id, x3::int_, x3::double_, qualified)
Parsing: 'willy :: anton::lutz'
Parsed [willy][anton][lutz]
-------------------------
unqualified
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: 'willy'
Parsed [willy]
-------------------------
unqualified with whitespace
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: ' willy '
Parsed [willy]
-------------------------
integral or number
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: '123.78::anton::lutz'
Parsed 123.78
EOI not reached. Unparsed: '::anton::lutz'
-------------------------
qualified
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: 'willy::anton::lutz'
Parsed [willy][anton][lutz]
-------------------------
qualified with whitespace
Rule: parser::longest(x3::int_, x3::double_, qualified)
Parsing: 'willy :: anton::lutz'
Parsed [willy][anton][lutz]
请注意不同的结果取决于备选方案中的解析器表达式。