How does the new range-based for loop in C++17 help the Ranges TS?

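The committee changed the range-based for loop so that __begin and __end are declared in separate statements instead of in a single declaration.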

People say that this will make implementing the Ranges TS easier. Can you give me some examples?

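The new specification allows __begin and __end to be of different types, as long as __end can be compared to __begin for inequality. __end doesn't even need to be an iterator; it can be a predicate. Here is a silly example with a struct defining begin and end members, the latter being a predicate instead of an iterator: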

#include <iostream>
#include <string>

// a struct to get the first word of a string

struct FirstWord {
    std::string data;

    // declare a predicate to make ' ' a string ender

    struct EndOfString {
        bool operator()(std::string::iterator it) { return (*it) != '\0' && (*it) != ' '; }
    };

    std::string::iterator begin() { return data.begin(); }
    EndOfString end() { return EndOfString(); }
};

// declare the comparison operator

bool operator!=(std::string::iterator it, FirstWord::EndOfString p) { return p(it); }

// test

int main() {
    for (auto c : {"Hello World !!!"})
        std::cout << c;
    std::cout << std::endl; // print "Hello World !!!"

    for (auto c : FirstWord{"Hello World !!!"}) // works with gcc with C++17 enabled
        std::cout << c;
    std::cout << std::endl; // print "Hello"
}

C++11/14 range-for was over-constrained...

The WG21 paper for this is P0184R0, which gives the following motivation:

The existing range-based for loop is over-constrained. The end iterator is never incremented, decremented, or dereferenced. Requiring it to be an iterator serves no practical purpose.

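As you can see from the Standardese that you posted, the end iterator of a range is only used in the loop condition __begin != __end;. Hence end only needs to be equality-comparable to begin; it does not need to be dereferenceable or incrementable.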
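The relaxed wording can be illustrated by expanding the loop by hand. The sketch below is not the Standard's text: NulSentinel, expand_cxx14 and expand_cxx17 are hypothetical names chosen for illustration. The first function mirrors the C++11/14 shape, where both loop variables come from one declaration and therefore share one type; the second mirrors the C++17 shape, where they are declared separately and only first != last has to compile.

```cpp
#include <string>
#include <vector>

// Hypothetical sentinel type: an empty tag meaning "stop at the terminating NUL".
struct NulSentinel {};

// The only operation the loop ever applies to the end object.
inline bool operator!=(const char* it, NulSentinel) { return *it != '\0'; }

// C++11/14 shape: a single declaration, so both variables deduce to the same type.
inline std::string expand_cxx14(const std::vector<char>& range) {
    std::string out;
    for (auto first = range.begin(), last = range.end(); first != last; ++first) {
        char c = *first;   // plays the role of the range declaration
        out += c;          // plays the role of the loop body
    }
    return out;
}

// C++17 shape: two declarations, so the types may differ.
inline std::string expand_cxx17(const char* text) {
    std::string out;
    auto first = text;           // const char*
    auto last  = NulSentinel{};  // not an iterator at all
    for (; first != last; ++first) {
        char c = *first;
        out += c;
    }
    return out;
}
```

Calling expand_cxx17("Hello") walks the C string up to its terminator without ever computing its length, and the end object is never incremented or dereferenced.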

...which distorts operator== for delimited iterators

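So what disadvantage does this have? Well, if you have a sentinel-delimited range (a C string, a line of text, etc.), then you have to shoehorn the loop condition into the iterator's operator==, essentially like this: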

#include <iostream>

template <char Delim = 0>
struct StringIterator
{
    char const* ptr = nullptr;   

    friend auto operator==(StringIterator lhs, StringIterator rhs) {
        return lhs.ptr ? (rhs.ptr ? lhs.ptr == rhs.ptr : *lhs.ptr == Delim) : (!rhs.ptr || (*rhs.ptr == Delim));
    }

    friend auto operator!=(StringIterator lhs, StringIterator rhs) {
        return !(lhs == rhs);
    }

    auto& operator*()  {        return *ptr;  }
    auto& operator++() { ++ptr; return *this; }
};

template <char Delim = 0>
class StringRange
{
    StringIterator<Delim> it;
public:
    StringRange(char const* ptr) : it{ptr} {}
    auto begin() { return it;                      }
    auto end()   { return StringIterator<Delim>{}; }
};

int main()
{
    // "Hello World", no exclamation mark
    for (auto const& c : StringRange<'!'>{"Hello World!"})
        std::cout << c;
}

Live Example with g++ -std=c++14 (assembly using gcc.godbolt.org)

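The above operator== for StringIterator<> is symmetric in its arguments and does not rely on whether the range-for condition is begin != end or end != begin (otherwise you could cheat and cut the code in half).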

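For simple iteration patterns, the compiler is able to optimize the convoluted logic inside operator==. Indeed, for the above example, operator== is reduced to a single comparison. But will this continue to work for long pipelines of ranges and filters? Who knows. It is likely to require heroic optimization levels.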

C++17 will relax the constraints, which will simplify delimited ranges...

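So where exactly does the simplification show up? In operator==, which now has extra overloads taking an iterator/sentinel pair (in both orders, for symmetry). So the run-time logic becomes compile-time logic.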

#include <iostream>

template <char Delim = 0>
struct StringSentinel {};

struct StringIterator
{
    char const* ptr = nullptr;   

    template <char Delim>
    friend auto operator==(StringIterator lhs, StringSentinel<Delim> rhs) {
        return *lhs.ptr == Delim;
    }

    template <char Delim>
    friend auto operator==(StringSentinel<Delim> lhs, StringIterator rhs) {
        return rhs == lhs;
    }

    template <char Delim>
    friend auto operator!=(StringIterator lhs, StringSentinel<Delim> rhs) {
        return !(lhs == rhs);
    }

    template <char Delim>
    friend auto operator!=(StringSentinel<Delim> lhs, StringIterator rhs) {
        return !(lhs == rhs);
    }

    auto& operator*()  {        return *ptr;  }
    auto& operator++() { ++ptr; return *this; }
};

template <char Delim = 0>
class StringRange
{
    StringIterator it;
public:
    StringRange(char const* ptr) : it{ptr} {}
    auto begin() { return it;                      }
    auto end()   { return StringSentinel<Delim>{}; }
};

int main()
{
    // "Hello World", no exclamation mark
    for (auto const& c : StringRange<'!'>{"Hello World!"})
        std::cout << c;
}

Live Example using g++ -std=c++1z (assembly using gcc.godbolt.org, which is almost identical to that of the previous example).
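Because the delimiter now lives in the sentinel's type, one iterator type can be paired with different sentinels without storing or branching on any extra state. The following is a minimal sketch of that idea, assuming a C++17 compiler; Cursor, Until, DelimitedRange and collect are hypothetical names, not part of the code above or of any library.

```cpp
#include <string>
#include <type_traits>

// Hypothetical iterator: a thin wrapper around a character pointer.
struct Cursor {
    const char* ptr = nullptr;
    const char& operator*() const { return *ptr; }
    Cursor& operator++() { ++ptr; return *this; }
};

// Hypothetical sentinel: the delimiter is a template argument,
// so the type has no data members.
template <char Delim>
struct Until {};

// The delimiter test is selected at compile time by overload resolution.
template <char Delim>
bool operator!=(Cursor it, Until<Delim>) { return *it.ptr != Delim; }

// Hypothetical range pairing the same Cursor with whichever sentinel is requested.
template <char Delim>
struct DelimitedRange {
    const char* text;
    Cursor begin() const { return Cursor{text}; }
    Until<Delim> end() const { return {}; }
};

static_assert(std::is_empty<Until<'!'>>::value, "the sentinel carries no run-time state");
static_assert(!std::is_same<Until<'!'>, Until<' '>>::value, "each delimiter is a distinct type");

// Collect the characters visited by the C++17 range-based for loop.
template <char Delim>
std::string collect(const char* text) {
    std::string out;
    for (char c : DelimitedRange<Delim>{text})
        out += c;
    return out;
}
```

Here collect<'!'>("Hello World!") and collect<' '>("Hello World!") reuse the same Cursor; only the sentinel type, and with it the comparison chosen by the compiler, changes.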

...and will in fact support fully general, primitive "D-style" ranges.

WG21 paper N4382 has the following suggestion:

C.6 Range Facade and Adaptor Utilities [future.facade]

1 Until it becomes trivial for users to create their own iterator types, the full potential of iterators will remain unrealized. The range abstraction makes that achievable. With the right library components, it should be possible for users to define a range with a minimal interface (e.g., current, done, and next members), and have iterator types automatically generated. Such a range facade class template is left as future work.

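In essence, this is equivalent to D-style ranges (where these primitives are called empty, front and popFront). A delimited string range with only these primitives would look something like this: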

template <char Delim = 0>
class PrimitiveStringRange
{
    char const* ptr;
public:    
    PrimitiveStringRange(char const* c) : ptr{c} {}
    auto& current()    { return *ptr;          }
    auto  done() const { return *ptr == Delim; }
    auto  next()       { ++ptr;                }
};
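Consuming such a range directly needs nothing more than the three primitives. The sketch below illustrates that with a hand-written loop; Countdown and drain are hypothetical names introduced here, and a counting range is used instead of the string range only to keep the example independent of the code above.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical primitive range yielding n, n-1, ..., 1.
class Countdown {
    int value;
public:
    explicit Countdown(int n) : value{n} {}
    int  current() const { return value;      }
    bool done()    const { return value <= 0; }
    void next()          { --value;           }
};

// Consume any range exposing current/done/next; no iterators are involved.
template <class PrimitiveRange>
auto drain(PrimitiveRange rng) {
    using Value = std::decay_t<decltype(rng.current())>;
    std::vector<Value> out;
    while (!rng.done()) {
        out.push_back(rng.current());
        rng.next();
    }
    return out;
}
```

The loop works, but it is not a range-based for loop, which is why an adaptor that turns the primitives into an iterator/sentinel pair is still useful.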

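Without knowing the underlying representation of a primitive range, how do you extract iterators from it? How do you adapt it into a range that can be used with range-for? Here is one way (see also the series of blog posts by @EricNiebler, and the comments from @T.C.):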

#include <iostream>

// adapt any primitive range with current/done/next to Iterator/Sentinel pair with begin/end
template <class Derived>
struct RangeAdaptor : private Derived
{      
    using Derived::Derived;

    struct Sentinel {};

    struct Iterator
    {
        Derived*  rng;

        friend auto operator==(Iterator it, Sentinel) { return it.rng->done(); }
        friend auto operator==(Sentinel, Iterator it) { return it.rng->done(); }

        friend auto operator!=(Iterator lhs, Sentinel rhs) { return !(lhs == rhs); }
        friend auto operator!=(Sentinel lhs, Iterator rhs) { return !(lhs == rhs); }

        auto& operator*()  {              return rng->current(); }
        auto& operator++() { rng->next(); return *this;          }
    };

    auto begin() { return Iterator{this}; }
    auto end()   { return Sentinel{};     }
};

int main()
{
    // "Hello World", no exclamation mark
    for (auto const& c : RangeAdaptor<PrimitiveStringRange<'!'>>{"Hello World!"})
        std::cout << c;
}

Live Example using g++ -std=c++1z (assembly using gcc.godbolt.org)

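Conclusion: sentinels are not just a cute mechanism for pressing delimiters into the type system. They are general enough to support primitive "D-style" ranges (which themselves may have no notion of iterators) as a zero-overhead abstraction for the new C++1z range-for.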