How the new range-based for loop in C++17 helps Ranges TS?
C++C++11For LoopC++17C++ Problem Overview
The committee changed the range-based for loop from:
-
C++11:
{ auto && __range = range_expression ; for (auto __begin = begin_expr, __end = end_expr; __begin != __end; ++__begin) { range_declaration = *__begin; loop_statement } }
-
to C++17 :
{ auto && __range = range_expression ; auto __begin = begin_expr ; auto __end = end_expr ; for ( ; __begin != __end; ++__begin) { range_declaration = *__begin; loop_statement } }
And people said that this will make implementing Ranges TS easier. Can you give me some examples?
C++ Solutions
Solution 1 - C++
for
was overconstrained...
C++11/14 range-The WG21 paper for this is P0184R0 which has the following motivation:
> The existing range-based for loop is over-constrained. The end > iterator is never incremented, decremented, or dereferenced. Requiring > it to be an iterator serves no practical purpose.
As you can see from the Standardese that you posted, the end
iterator of a range is only used in the loop-condition __begin != __end;
. Hence end
only needs to be equality comparable to begin
, and it does not need to be dereferenceable or incrementable.
operator==
for delimited iterators.
...which distorts So what disadvantage does this have? Well, if you have a sentinel-delimited range (C-string, line of text, etc.), then you have to shoehorn the loop-condition into the iterator's operator==
, essentially like this
#include <iostream>
template <char Delim = 0>
struct StringIterator
{
char const* ptr = nullptr;
friend auto operator==(StringIterator lhs, StringIterator rhs) {
return lhs.ptr ? (rhs.ptr || (*lhs.ptr == Delim)) : (!rhs.ptr || (*rhs.ptr == Delim));
}
friend auto operator!=(StringIterator lhs, StringIterator rhs) {
return !(lhs == rhs);
}
auto& operator*() { return *ptr; }
auto& operator++() { ++ptr; return *this; }
};
template <char Delim = 0>
class StringRange
{
StringIterator<Delim> it;
public:
StringRange(char const* ptr) : it{ptr} {}
auto begin() { return it; }
auto end() { return StringIterator<Delim>{}; }
};
int main()
{
// "Hello World", no exclamation mark
for (auto const& c : StringRange<'!'>{"Hello World!"})
std::cout << c;
}
Live Example with g++ -std=c++14, (assembly using gcc.godbolt.org)
The above operator==
for StringIterator<>
is symmetric in its arguments and does not rely on whether the range-for is begin != end
or end != begin
(otherwise you could cheat and cut the code in half).
For simple iteration patterns, the compiler is able to optimize the convoluted logic inside operator==
. Indeed, for the above example, the operator==
is reduced to a single comparison. But will this continue to work for long pipelines of ranges and filters? Who knows. It is likely to require heroic optimization levels.
C++17 will relax the constraints which will simplify delimited ranges...
So where exactly does the simplification manifest itself? In operator==
, which now has extra overloads taking an iterator/sentinel pair (in both orders, for symmetry). So the run time logic becomes compile time logic.
#include <iostream>
template <char Delim = 0>
struct StringSentinel {};
struct StringIterator
{
char const* ptr = nullptr;
template <char Delim>
friend auto operator==(StringIterator lhs, StringSentinel<Delim> rhs) {
return *lhs.ptr == Delim;
}
template <char Delim>
friend auto operator==(StringSentinel<Delim> lhs, StringIterator rhs) {
return rhs == lhs;
}
template <char Delim>
friend auto operator!=(StringIterator lhs, StringSentinel<Delim> rhs) {
return !(lhs == rhs);
}
template <char Delim>
friend auto operator!=(StringSentinel<Delim> lhs, StringIterator rhs) {
return !(lhs == rhs);
}
auto& operator*() { return *ptr; }
auto& operator++() { ++ptr; return *this; }
};
template <char Delim = 0>
class StringRange
{
StringIterator it;
public:
StringRange(char const* ptr) : it{ptr} {}
auto begin() { return it; }
auto end() { return StringSentinel<Delim>{}; }
};
int main()
{
// "Hello World", no exclamation mark
for (auto const& c : StringRange<'!'>{"Hello World!"})
std::cout << c;
}
Live Example using g++ -std=c++1z (assembly using gcc.godbolt.org, which is almost identical to the previous example).
...and will in fact support fully general, primitive "D-style" ranges.
WG21 paper N4382 has the following suggestion:
> C.6 Range Facade and Adaptor Utilities [future.facade]
>
> 1 Until it
> becomes trivial for users to create their own iterator types, the full
> potential of iterators will remain unrealized. The range abstraction
> makes that achievable. With the right library components, it should be
> possible for users to define a range with a minimal interface (e.g.,
> current
, done
, and next
members), and have iterator types
> automatically generated. Such a range facade class template is left as
> future work.
Essentially, this is equal to D-style ranges (where these primitives are called empty
, front
and popFront
). A delimited string range with only these primitives would look something like this:
template <char Delim = 0>
class PrimitiveStringRange
{
char const* ptr;
public:
PrimitiveStringRange(char const* c) : ptr{c} {}
auto& current() { return *ptr; }
auto done() const { return *ptr == Delim; }
auto next() { ++ptr; }
};
If one does not know the underlying representation of a primitive range, how to extract iterators from it? How to adapt this to a range that can be used with range-for
? Here's one way (see also the series of blog posts by @EricNiebler) and the comments from @T.C.:
#include <iostream>
// adapt any primitive range with current/done/next to Iterator/Sentinel pair with begin/end
template <class Derived>
struct RangeAdaptor : private Derived
{
using Derived::Derived;
struct Sentinel {};
struct Iterator
{
Derived* rng;
friend auto operator==(Iterator it, Sentinel) { return it.rng->done(); }
friend auto operator==(Sentinel, Iterator it) { return it.rng->done(); }
friend auto operator!=(Iterator lhs, Sentinel rhs) { return !(lhs == rhs); }
friend auto operator!=(Sentinel lhs, Iterator rhs) { return !(lhs == rhs); }
auto& operator*() { return rng->current(); }
auto& operator++() { rng->next(); return *this; }
};
auto begin() { return Iterator{this}; }
auto end() { return Sentinel{}; }
};
int main()
{
// "Hello World", no exclamation mark
for (auto const& c : RangeAdaptor<PrimitiveStringRange<'!'>>{"Hello World!"})
std::cout << c;
}
Live Example using g++ -std=c++1z (assembly using gcc.godbolt.org)
Conclusion: sentinels are not just a cute mechanism to press delimiters into the type system, they are general enough to support primitive "D-style" ranges (which themselves may have no notion of iterators) as a zero-overhead abstraction for the new C++1z range-for.
Solution 2 - C++
The new specification allows __begin
and __end
to be of different type, as long as __end
can be compared to __begin
for inequality. __end
doesn't even need to be an iterator and can be a predicate. Here is a silly example with a struct defining begin
and end
members, the latter being a predicate instead of an iterator:
#include <iostream>
#include <string>
// a struct to get the first word of a string
struct FirstWord {
std::string data;
// declare a predicate to make ' ' a string ender
struct EndOfString {
bool operator()(std::string::iterator it) { return (*it) != '\0' && (*it) != ' '; }
};
std::string::iterator begin() { return data.begin(); }
EndOfString end() { return EndOfString(); }
};
// declare the comparison operator
bool operator!=(std::string::iterator it, FirstWord::EndOfString p) { return p(it); }
// test
int main() {
for (auto c : {"Hello World !!!"})
std::cout << c;
std::cout << std::endl; // print "Hello World !!!"
for (auto c : FirstWord{"Hello World !!!"}) // works with gcc with C++17 enabled
std::cout << c;
std::cout << std::endl; // print "Hello"
}