Is this a known pitfall of C++11 for loops?

C++For LoopC++11Language LawyerForeach

C++ Problem Overview


Let's imagine we have a struct for holding 3 doubles with some member functions:

struct Vector {
  double x, y, z;
  // ...
  Vector &negate() {
    x = -x; y = -y; z = -z;
    return *this;
  }
  Vector &normalize() {
     double s = 1./sqrt(x*x+y*y+z*z);
     x *= s; y *= s; z *= s;
     return *this;
  }
  // ...
};

This is a little contrived for simplicity, but I'm sure you agree that similar code is out there. The methods allow you to conveniently chain, for example:

Vector v = ...;
v.normalize().negate();

Or even:

Vector v = Vector{1., 2., 3.}.normalize().negate();

Now if we provided begin() and end() functions, we could use our Vector in a new-style for loop, say to loop over the 3 coordinates x, y, and z (you can no doubt construct more "useful" examples by replacing Vector with e.g. String):

Vector v = ...;
for (double x : v) { ... }

We can even do:

Vector v = ...;
for (double x : v.normalize().negate()) { ... }

and also:

for (double x : Vector{1., 2., 3.}) { ... }

However, the following (it seems to me) is broken:

for (double x : Vector{1., 2., 3.}.normalize()) { ... }

While it seems like a logical combination of the previous two usages, I think this last usage creates a dangling reference while the previous two are completely fine.

  • Is this correct and Widely appreciated?
  • Which part of the above is the "bad" part, that should be avoided?
  • Would the language be improved by changing the definition of the range-based for loop such that temporaries constructed in the for-expression exist for the duration of the loop?

C++ Solutions


Solution 1 - C++

> Is this correct and Widely appreciated?

Yes, your understanding of things is correct.

> Which part of the above is the "bad" part, that should be avoided?

The bad part is taking an l-value reference to a temporary returned from a function, and binding it to an r-value reference. It is just as bad as this:

auto &&t = Vector{1., 2., 3.}.normalize();

The temporary Vector{1., 2., 3.}'s lifetime cannot be extended because the compiler has no idea that the return value from normalize references it.

> Would the language be improved by changing the definition of the range-based for loop such that temporaries constructed in the for-expression exist for the duration of the loop?

That would be highly inconsistent with how C++ works.

Would it prevent certain gotchas made by people using chained expressions on temporaries or various lazy-evaluation methods for expressions? Yes. But it would also be require special-case compiler code, as well as be confusing as to why it doesn't work with other expression constructs.

A much more reasonable solution would be some way to inform the compiler that the return value of a function is always a reference to this, and therefore if the return value is bound to a temporary-extending construct, then it would extend the correct temporary. That's a language-level solution though.

Presently (if the compiler supports it), you can make it so that normalize cannot be called on a temporary:

struct Vector {
  double x, y, z;
  // ...
  Vector &normalize() & {
     double s = 1./sqrt(x*x+y*y+z*z);
     x *= s; y *= s; z *= s;
     return *this;
  }
  Vector &normalize() && = delete;
};

This will cause Vector{1., 2., 3.}.normalize() to give a compile error, while v.normalize() will work fine. Obviously you won't be able to do correct things like this:

Vector t = Vector{1., 2., 3.}.normalize();

But you also won't be able to do incorrect things.

Alternatively, as suggested in the comments, you can make the rvalue reference version return a value rather than a reference:

struct Vector {
  double x, y, z;
  // ...
  Vector &normalize() & {
     double s = 1./sqrt(x*x+y*y+z*z);
     x *= s; y *= s; z *= s;
     return *this;
  }
  Vector normalize() && {
     Vector ret = *this;
     ret.normalize();
     return ret;
  }
};

If Vector was a type with actual resources to move, you could use Vector ret = std::move(*this); instead. Named return value optimization makes this reasonably optimal in terms of performance.

Solution 2 - C++

> for (double x : Vector{1., 2., 3.}.normalize()) { ... }

That is not a limitation of the language, but a problem with your code. The expression Vector{1., 2., 3.} creates a temporary, but the normalize function returns an lvalue-reference. Because the expression is an lvalue, the compiler assumes that the object will be alive, but because it is a reference to a temporary, the object dies after the full expression is evaluated, so you are left with a dangling reference.

Now, if you change your design to return a new object by value rather than a reference to the current object, then there would be no issue and the code would work as expected.

Solution 3 - C++

IMHO, the second example is already flawed. That the modifying operators return *this is convenient in the way you mentioned: it allows chaining of modifiers. It can be used for simply handing on the result of the modification, but doing this is error-prone because it can easily be overlooked. If I see something like

Vector v{1., 2., 3.};
auto foo = somefunction1(v, 17);
auto bar = somefunction2(true, v, 2, foo);
auto baz = somefunction3(bar.quun(v), 93.2, v.qwarv(foo));

I wouldn't automatically suspect that the functions modify v as a side-effect. Of course, they could, but it would be confusing. So if I was to write something like this, I would make sure that v stays constant. For your example, I would add free functions

auto normalized(Vector v) -> Vector {return v.normalize();}
auto negated(Vector v) -> Vector {return v.negate();}

and then write the loops

for( double x : negated(normalized(v)) ) { ... }

and

for( double x : normalized(Vector{1., 2., 3}) ) { ... }

That is IMO better readable, and it's safer. Of course, it requires an extra copy, however for heap-allocated data this could likely be done in a cheap C++11 move operation.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionndkrempelView Question on Stackoverflow
Solution 1 - C++Nicol BolasView Answer on Stackoverflow
Solution 2 - C++David Rodríguez - dribeasView Answer on Stackoverflow
Solution 3 - C++leftaroundaboutView Answer on Stackoverflow