Broken std::cout output when using a combined immediate = string|float|int rule with qi::double_ and qi::uint_

I am trying to get an immediate rule for strings, ints and floats so that I can parse the following tests:

 //strings
 "\"hello\"",
 "   \"  hello \"  ",
 "  \"  hello \"\"stranger\"\" \"  ",
 //ints
 "1",
 "23",
 "456",
 //floats
 "3.3",
 "34.35"

Try it online: http://coliru.stacked-crooked.com/a/26fbd691876d9a8f

Using

qi::rule<std::string::const_iterator, std::string()> 
  double_quoted_string = '"' >> *("\"\"" >> qi::attr('"') | ~qi::char_('"')) >> '"';

qi::rule<std::string::const_iterator, std::string()> 
  number = (+qi::ascii::digit >> *(qi::char_('.') >> +qi::ascii::digit));

qi::rule<std::string::const_iterator, std::string()>
  immediate = double_quoted_string | number;

gives me the correct results, but I need to use the double_ parser because I want to support exponent notation, NaN, etc.

But using

qi::rule<std::string::const_iterator, std::string()>
  immediate = double_quoted_string | qi::uint_ | qi::double_;

prints this for the integer values:

"1" OK: ''
----
"23" OK: ''
----
"456" OK: '�'

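and doubles fail to parse at all.

Tested on Coliru, on Win7 x64 with the latest VS2017, and with LLVM clang-cl.

Sometimes Coliru emits so many warnings that compilation stops.

Any idea what is happening here?

Do warnings in Spirit usually mean: stop here, something is seriously broken?

UPDATE: this also happens if I only use double_. Before I tested that, the behaviour changed with/without the uint_ parser. Try it: https://wandbox.org/permlink/UqgItWkfC2I8tkNF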

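Use qi::raw on the integer and double parsers so that the numbers are captured lexically, as text: qi::raw[qi::uint_] and qi::raw[qi::double_].

But the order of the alternatives also matters. If the uint_ parser comes before double_, as in: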

immediate = double_quoted_string | qi::raw[qi::uint_] | qi::raw[qi::double_];
BOOST_SPIRIT_DEBUG_NODES((immediate)); // for debug output

then the uint_ parser consumes only the integer part of a double, and the parse as a whole fails:

<immediate>
  <try>34.35</try>
  <success>.35</success> //<----- this is what is left after uint_ parsed
  <attributes>[[3, 4]]</attributes> // <---- what uint_ parser successfully parsed
</immediate>
"34.35" Failed
Remaining unparsed: "34.35"

After swapping the order of uint_ and double_:

immediate = double_quoted_string | qi::raw[qi::double_] | qi::raw[qi::uint_];

Result:

"\"hello\"" OK: 'hello'
----
"   \"  hello \"  " OK: '  hello '
----
"  \"  hello \"\"stranger\"\" \"  " OK: '  hello "stranger" '
----
"1" OK: '1'
----
"64" OK: '64'
----
"456" OK: '456'
----
"3.3" OK: '3.3'
----
"34.35" OK: '34.35'
----

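A loose definition of "parsing" is converting a textual representation into "another" (often more native) representation.

It makes no sense to "parse" a number into a std::string. What you are seeing is automatic attribute propagation trying very hard to make sense of it (by sticking the parsed number into the string as a character).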

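That is not what you want. Instead, you want to parse an integer value or a double value. For that, you can simply declare a variant attribute type: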

using V = boost::variant<std::string, double, unsigned int>;
qi::rule<std::string::const_iterator, V()>
    immediate = double_quoted_string | qi::double_ | qi::uint_;

That's it. Here is a live demo, which adds a type check on the result:

Live On Coliru

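#include <iostream>
#include <iomanip>
#include <string>
#include <typeinfo>
#include <utility>
#include <vector>
#include <boost/spirit/include/qi.hpp>
#include <boost/variant.hpp>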

namespace qi = boost::spirit::qi;
using namespace std::string_literals;

int main() {
    for (auto&& [str, type] : std::vector {
        std::pair("\"hello\""s,                typeid(std::string).name()),
        {"   \"  hello \"  "s,                 typeid(std::string).name()},
        {"  \"  hello \"\"stranger\"\" \"  "s, typeid(std::string).name()},
        {"1"s,                                 typeid(unsigned int).name()},
        {"23"s,                                typeid(unsigned int).name()},
        {"456"s,                               typeid(unsigned int).name()},
        {"3.3"s,                               typeid(double).name()},
        {"34.35"s,                             typeid(double).name()},
    }) {
        auto iter = str.cbegin(), end = str.cend();

        qi::rule<std::string::const_iterator, std::string()> double_quoted_string
            = '"' >> *("\"\"" >> qi::attr('"') | ~qi::char_('"')) >> '"';

        using V = boost::variant<std::string, double, unsigned int>;
        qi::rule<std::string::const_iterator, V()> immediate
            = double_quoted_string | qi::double_ | qi::uint_;

        std::cout << std::quoted(str) << " ";

        V res;
        bool r = qi::phrase_parse(iter, end, immediate, qi::blank, res);
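        // Compare the type names as strings; comparing the raw char pointers is not guaranteed to work.
        bool typecheck = (std::string(type) == res.type().name());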

        if (r) {
            std::cout << "OK: " << res << " typecheck " << (typecheck?"MATCH":"MISMATCH") << "\n";
        } else {
            std::cout << "Failed\n";
        }
        if (iter != end) {
            std::cout << "Remaining unparsed: " << std::quoted(std::string(iter, end)) << "\n";
        }
        std::cout << "----\n";
    }
}

Prints

"\"hello\"" OK: hello typecheck MATCH
----
"   \"  hello \"  " OK:   hello  typecheck MATCH
----
"  \"  hello \"\"stranger\"\" \"  " OK:   hello "stranger"  typecheck MATCH
----
"1" OK: 1 typecheck MISMATCH
----
"23" OK: 23 typecheck MISMATCH
----
"456" OK: 456 typecheck MISMATCH
----
"3.3" OK: 3.3 typecheck MATCH
----
"34.35" OK: 34.35 typecheck MATCH
----

Note the re-ordering of uint_ after double_. If you parse integers first, uint_ will consume the integer part of a double up to the decimal separator, and the rest will then fail to parse. With double_ first, however, plain integers are parsed as doubles too, which is why the integer cases above report MISMATCH. To be more accurate, you may want to use a strict real parser so that only numbers that actually have a fraction get parsed as doubles. This does limit the range for integral numbers, because unsigned int has a far smaller range than double.

See Parse int or double using boost spirit (longest_d)

Live On Coliru

    qi::rule<std::string::const_iterator, V()> immediate
        = double_quoted_string
        | qi::real_parser<double, qi::strict_real_policies<double> >{}
        | qi::uint_;

Prints

"\"hello\"" OK: hello typecheck MATCH
----
"   \"  hello \"  " OK:   hello  typecheck MATCH
----
"  \"  hello \"\"stranger\"\" \"  " OK:   hello "stranger"  typecheck MATCH
----
"1" OK: 1 typecheck MATCH
----
"23" OK: 23 typecheck MATCH
----
"456" OK: 456 typecheck MATCH
----
"3.3" OK: 3.3 typecheck MATCH
----
"34.35" OK: 34.35 typecheck MATCH
----