In practice, why would different compilers compute different values of int x = ++i + ++i;?

C++ Undefined Behavior

C++ Problem Overview


Consider this code:

int i = 1;
int x = ++i + ++i;

We have some guesses for what a compiler might do for this code, assuming it compiles.

  1. both ++i return 2, resulting in x=4.
  2. one ++i returns 2 and the other returns 3, resulting in x=5.
  3. both ++i return 3, resulting in x=6.

To me, the second seems most likely. One of the two ++ operators is executed with i = 1, the i is incremented, and the result 2 is returned. Then the second ++ operator is executed with i = 2, the i is incremented, and the result 3 is returned. Then 2 and 3 are added together to give 5.

However, I ran this code in Visual Studio, and the result was 6. I'm trying to understand compilers better, and I'm wondering what could possibly lead to a result of 6. My only guess is that the code could be executed with some "built-in" concurrency. The two ++ operators were called, each incremented i before the other returned, and then they both returned 3. This would contradict my understanding of the call stack, and would need to be explained away.

What (reasonable) things could a C++ compiler do that would lead to a result of 4 or a result of 6?

Note

This example appeared as an example of undefined behavior in Bjarne Stroustrup's Programming: Principles and Practice using C++ (C++ 14).

See cinnamon's comment.

C++ Solutions


Solution 1 - C++

The compiler takes your code, splits it into very simple instructions, and then recombines and arranges them in a way that it thinks optimal.

The code

int i = 1;
int x = ++i + ++i;

consists of the following instructions:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
5. read i as tmp2
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

But despite this being a numbered list the way I wrote it, there are only a few ordering dependencies here: 1->2->3->4->5->10->11 and 1->6->7->8->9->10->11 must stay in their relative order. Other than that the compiler can freely reorder, and perhaps eliminate redundancy.

For example, you could order the list like this:

1. store 1 in i
2. read i as tmp1
6. read i as tmp3
3. add 1 to tmp1
7. add 1 to tmp3
4. store tmp1 in i
8. store tmp3 in i
5. read i as tmp2
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

Why can the compiler do this? Because there's no sequencing to the side effects of the increment. But now the compiler can simplify: for example, there's a dead store in 4: the value is immediately overwritten. Also, tmp2 and tmp4 are really the same thing.

1. store 1 in i
2. read i as tmp1
6. read i as tmp3
3. add 1 to tmp1
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

And now everything to do with tmp1 is dead code: it's never used. And the re-read of i can be eliminated too:

1. store 1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
10. add tmp3 and tmp3, as tmp5
11. store tmp5 in x

Look, this code is much shorter. The optimizer is happy. The programmer is not, because i was only incremented once. Oops.
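
As a rough C++-level sketch of that reduced instruction list (the temporary names are hypothetical, just to make the end state concrete):

int i = 1;
int tmp3 = i;        // read i (1)
tmp3 += 1;           // tmp3 = 2
i = tmp3;            // i is incremented only once: i = 2
int x = tmp3 + tmp3; // x = 4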

Let's look at something else the compiler can do instead: let's go back to the original version.

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
5. read i as tmp2
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

The compiler could reorder it like this:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

and then notice again that i is read twice, so eliminate one of them:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

That's nice, but it can go further: it can reuse tmp1:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp1
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Then it can eliminate the re-read of i in 6:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Now 4 is a dead store:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

and now 3 and 7 can be merged into one instruction:

1. store 1 in i
2. read i as tmp1
3+7. add 2 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Eliminate the last temporary:

1. store 1 in i
2. read i as tmp1
3+7. add 2 to tmp1
8. store tmp1 in i
10. add tmp1 and tmp1, as tmp5
11. store tmp5 in x

And now you get the result that Visual C++ is giving you.
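
Again as a rough C++-level sketch of that final instruction list (hypothetical temporaries, illustrating the end state only):

int i = 1;
int tmp1 = i;        // read i (1)
tmp1 += 2;           // the two increments merged into a single +2: tmp1 = 3
i = tmp1;            // i = 3
int x = tmp1 + tmp1; // x = 6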

Note that in both optimization paths, the important ordering dependencies were preserved, insofar as instructions weren't removed for having no observable effect.

Solution 2 - C++

While this is UB, as the OP implied, the following are hypothetical ways a compiler could arrive at the three results. All three would give the same, correct x if they operated on two different variables int i = 1, j = 1; instead of one and the same i.

> 1. both ++i return 2, resulting in x=4.

int i = 1;
int i1 = i, i2 = i;   // i1 = i2 = 1
++i1;                 // i1 = 2
++i2;                 // i2 = 2
int x = i1 + i2;      // x = 4

> 2. one ++i returns 2 and the other returns 3, resulting in x=5.

int i = 1;
int i1 = ++i;           // i1 = 2
int i2 = ++i;           // i2 = 3
int x = i1 + i2;        // x = 5

> 3. both ++i return 3, resulting in x=6.

int i = 1;
int &i1 = i, &i2 = i;
++i1;                   // i = 2
++i2;                   // i = 3
int x = i1 + i2;        // x = 6

Solution 3 - C++

> To me, the second seems most likely.

I am going for option #4: Both ++i happen concurrently.

Newer processors are moving toward some interesting optimizations, and parallel code evaluation, where it is allowed (as it is here), is another way compilers keep producing faster code. As a practical matter, I can see compilers moving toward parallelism.

I could readily see a race condition causing non-deterministic behavior, or a bus fault due to contention on the same memory location - all allowed, since the coder violated the C++ contract - hence UB.

> My question is: what (reasonable) things could a C++ compiler do that would lead to a result of 4 or a result of 6?

It could, but do not count on it.

Don't use ++i + ++i nor expect sensible results.

Solution 4 - C++

I think that a simple and straightforward interpretation (without any appeal to compiler optimizations or multithreading) would be just:

  1. Increment i
  2. Increment i
  3. Add i + i

With i incremented twice, its value is 3, and when added together, the sum is 6.

For inspection, consider this as a C++ function:

int dblInc ()
{
	int i = 1;
	int x = ++i + ++i;
	return x;	
}

Now, here's the assembly code I get from compiling that function, using an old version of the GNU C++ compiler (win32, gcc version 3.4.2 (mingw-special)). There are no fancy optimizations or multithreading happening here:

__Z6dblIncv:
	push	ebp
	mov	ebp, esp
	sub	esp, 8
	mov	DWORD PTR [ebp-4], 1
	lea	eax, [ebp-4]
	inc	DWORD PTR [eax]
	lea	eax, [ebp-4]
	inc	DWORD PTR [eax]
	mov	eax, DWORD PTR [ebp-4]
	add	eax, DWORD PTR [ebp-4]
	mov	DWORD PTR [ebp-8], eax
	mov	eax, DWORD PTR [ebp-8]
	leave
	ret

Note that local variable i is sitting on the stack in just one single place: address [ebp-4]. That location is incremented twice (in the 5th-8th lines of the assembly function; including apparently redundant loads of that address into eax). Then on the 9th-10th lines, that value is loaded into eax, and then added into eax (that is, computes the current i + i). Then it's redundantly copied to the stack and back to eax as the return value (which will obviously be 6).

It may be of interest to look at the C++ standard (here, an old one: ISO/IEC 14882:1998(E)), which says for Expressions (clause 5, paragraph 4):

> Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.

With the footnote:

> The precedence of operators is not directly specified, but it can be derived from the syntax.

Two examples of unspecified behavior are given at that point, both involving the increment operator (one of which is: i = ++i + 1).

Now, if one wished, one could make an integer wrapper class (like a Java Integer), overload operator+ and operator++ such that they return the intermediate value objects, and thereby write ++iObj + ++iObj and get it to return an object holding 5. (I haven't included full code here for the sake of brevity.)
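
A minimal sketch of such a wrapper, assuming a hypothetical class name IntBox (this code is not from the original answer):

#include <iostream>

class IntBox {
    int value;
public:
    explicit IntBox(int v) : value(v) {}
    // Pre-increment returns a copy holding the intermediate value,
    // so each ++ records the value it produced.
    IntBox operator++() { ++value; return IntBox(value); }
    IntBox operator+(const IntBox& other) const { return IntBox(value + other.value); }
    int get() const { return value; }
};

int main() {
    IntBox iObj(1);
    // Overloaded operators are function calls, so the two increments are
    // indeterminately sequenced rather than unsequenced: no UB, and the
    // intermediate values 2 and 3 are captured, giving 5.
    std::cout << (++iObj + ++iObj).get() << '\n';
}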

Personally, I'd be intrigued if there were an example of a well-known compiler that did the job any other way than the sequence seen above. It seems to me that the most straightforward implementation is simply to do two assembly-code incs on the primitive type before the addition is performed.

Solution 5 - C++

The reasonable thing that a compiler can do is Common Subexpression Elimination. This is already a common optimisation in compilers: if a subexpression like (x+1) occurs more than once in a larger expression, it only needs to be calculated once. E.g. in a/(x+1) + b*(x+1) the x+1 sub-expression can be calculated once.

Of course, the compiler has to know which sub-expressions can be optimised that way. Calling rand() twice should give two random numbers. Non-inlined function calls must therefore be exempt from CSE. As you note, there is no rule that says how two occurrences of ++i should be handled, so there's no reason to exempt them from CSE.

The result may indeed be that int x = ++i + ++i; is optimised to int __cse = ++i; int x = __cse << 1; (CSE, followed by strength reduction of the repeated addition to a shift).
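
Spelled out with concrete values, that hypothetical transformation looks like this (illustrative only; no compiler is obliged to do it, and __cse follows the naming above):

int i = 1;
int __cse = ++i;     // the common sub-expression ++i evaluated once: __cse = 2, i = 2
int x = __cse << 1;  // __cse + __cse strength-reduced to a shift: x = 4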

Solution 6 - C++

There is no reasonable thing a compiler could do to get a result of 6, but it's possible and legitimate. A result of 4 is entirely reasonable, and I'd consider a result of 5 borderline reasonable. All of them are perfectly legal.

Hey, wait! Isn't it clear what must happen? The addition needs the results of the two increments, so obviously those must happen first. And we go left to right, so... argh! If only it were so simple. Unfortunately, that's not the case: we do not go left to right, and that's the problem.

Reading the memory location into two registers (or initializing them both from the same literal, optimizing out the round trip to memory) is a very reasonable thing for the compiler to do. The effect is that there are covertly two different variables, each with a value of 2, which are finally added to give a result of 4. This is "reasonable" because it's fast and efficient, and it is in accordance with both the standard and with the code.

Similarly, the memory location could be read once (or the variable initialized from the literal) and incremented once, and a shadow copy in another register could be incremented after that, which would result in 2 and 3 being added together. This is, I would say, borderline reasonable, although perfectly legal. I deem it borderline reasonable because it isn't one or the other. It's neither the "reasonable" optimized way, nor is it the "reasonable" exactly-pedantic way. It's somewhat in the middle.

Incrementing the memory location twice (resulting in a value of 3) and then adding that value to itself for a final result of 6 is legitimate, but not quite reasonable as doing memory round trips isn't precisely efficient. Although on a processor with good store forwarding, it might as well be "reasonable" to do it, since the store should be mostly invisible...
As the compiler "knows" that it's the same location, it might as well choose to increment the value twice within a register, and then add it to itself, too. Either approach would give you the result of 6.

The compiler is, by the wording of the standard, allowed to give you any such result, although I would personally consider 6 pretty much a "fuck you" memo from the Obnoxious Department, as it is a rather unexpected thing (legal or not, trying to always give the least amount of surprises is a good thing to do!). Though, seeing how Undefined Behavior is involved, sadly one cannot really argue about "unexpected", eh.

So, actually, what is the code that you have there, to the compiler? Let's ask clang, which will show us if we ask nicely (invoking with -ast-dump -fsyntax-only):

ast.cpp:4:9: warning: multiple unsequenced modifications to 'i' [-Wunsequenced]
int x = ++i + ++i;
        ^     ~~
(some lines omitted)
`-CompoundStmt 0x2b3e628 <line:2:1, line:5:1>
  |-DeclStmt 0x2b3e4b8 <line:3:1, col:10>
  | `-VarDecl 0x2b3e430 <col:1, col:9> col:5 used i 'int' cinit
  |   `-IntegerLiteral 0x2b3e498 <col:9> 'int' 1
  `-DeclStmt 0x2b3e610 <line:4:1, col:18>
    `-VarDecl 0x2b3e4e8 <col:1, col:17> col:5 x 'int' cinit
      `-BinaryOperator 0x2b3e5f0 <col:9, col:17> 'int' '+'
        |-ImplicitCastExpr 0x2b3e5c0 <col:9, col:11> 'int' <LValueToRValue>
        | `-UnaryOperator 0x2b3e570 <col:9, col:11> 'int' lvalue prefix '++'
        |   `-DeclRefExpr 0x2b3e550 <col:11> 'int' lvalue Var 0x2b3e430 'i' 'int'
        `-ImplicitCastExpr 0x2b3e5d8 <col:15, col:17> 'int' <LValueToRValue>
          `-UnaryOperator 0x2b3e5a8 <col:15, col:17> 'int' lvalue prefix '++'
            `-DeclRefExpr 0x2b3e588 <col:17> 'int' lvalue Var 0x2b3e430 'i' 'int'

As you can see, the same lvalue Var 0x2b3e430 has prefix ++ applied at two locations, and these two are below the same node in the tree, which happens to be a very non-special operator (+) that has nothing special being said about sequencing or such. Why is this important? Well, read on.

Note the warning: "multiple unsequenced modifications to 'i'". Oh oh, that doesn't sound good. What does it mean? [basic.exec] tells us about side effects and sequencing, and it tells us (paragraph 10) that by default, unless explicitly said otherwise, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. Well, darn, that's the case with operator+ -- nothing is being said otherwise, so...

But do we care about sequenced-before, indeterminately-sequenced, or unsequenced? Who wants to know, anyway?

That same paragraph also tells us that unsequenced evaluations may overlap, and that when they refer to the same memory location (which is the case here!) and they are not potentially concurrent, the behavior is undefined. This is where it really gets ugly, because it means you know nothing, and you have no guarantee of anything "reasonable" whatsoever. The unreasonable thing is actually perfectly allowable and "reasonable".

Solution 7 - C++

In practice, you are invoking undefined behavior. Anything can happen, not just things that you consider "reasonable", and often things do happen that you don't consider reasonable. Everything is by definition "reasonable".

A very reasonable compilation is that the compiler observes that executing a statement will invoke undefined behavior, therefore the statement cannot be executed, therefore it is translated to an instruction that intentionally crashes your application. That is very reasonable.


Solution 8 - C++

There is a rule:

> Between the previous and next sequence point a scalar object must have its stored value modified at most once by the evaluation of an expression, otherwise the behavior is undefined.

Thus even x = 100 is a possible valid result.

For me the most logical result in the example is 6, because we are increasing the value of i twice and then adding it to itself. It is difficult to do the addition before the values on both sides of "+" have been calculated.

But compiler developers can implement any other logic.

Solution 9 - C++

It looks like ++i returns an lvalue but i++ returns an rvalue.
So this code is ok:

int i = 1;
++i = 10;
cout << i << endl;

This one is not:

int i = 1;
i++ = 10;
cout << i << endl;

The above two statements behave consistently in Visual C++, GCC 7.1.1, Clang, and Embarcadero.
That is why, in Visual C++ and GCC 7.1.1, your code behaves like the following one:

int i = 1;
... do something there for instance: ++i; ++i; ...
int x = i + i;

Looking at the disassembly, it first increments i and writes it back. When evaluating the addition it does the same thing: it increments i and writes it back again. Then it adds i to i.
I've noticed that Clang and Embarcadero act differently, so they are not consistent with the first statement: after the first ++i they store the result in an rvalue and then add it to the second ++i.

Solution 10 - C++

I personally would never have expected a compiler to output 6 in your example. There are already good and detailed answers to your question. I will try a short version.

Basically, ++i is a 2-step process in this context:

  1. Increment the value of i
  2. Read the value of i

In the context of ++i + ++i the two sides of the addition may be evaluated in any order according to the standard. This means the two increments are considered independent. Also, there is no dependency between the two terms. The increment and read of i may therefore be interleaved. This gives the potential order:

  1. Increment i for the left operand
  2. Increment i for the right operand
  3. Read back i for the left operand
  4. Read back i for the right operand
  5. Sum the two: yields 6
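
Written out as separate, well-defined C++ statements (a hypothetical lowering; the actual single expression is UB), that interleaving is:

int i = 1;
++i;                  // increment for the left operand (i = 2)
++i;                  // increment for the right operand (i = 3)
int left = i;         // read back for the left operand: 3
int right = i;        // read back for the right operand: 3
int x = left + right; // 6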

Now that I think about it, 6 makes the most sense according to the standard. For a result of 4 we would need a CPU that first reads i independently for both operands, then increments and writes the value back into the same location; basically a race condition. For a value of 5 we would need a compiler that introduces temporaries.

But the standard says that ++i increments the variable before returning it, i.e. before the rest of the current line executes. The sum operator + then needs to compute i + i after both increments have been applied. I would say that C++ here works on the variables themselves and not with value semantics. Hence, to me 6 now makes the most sense, as it relies on the semantics of the language and not on the execution model of CPUs.

Solution 11 - C++

#include <stdio.h>


// a1: splitting the expression into separate statements adds sequence points,
// so the behavior is well-defined: x becomes 2, then x = 2 + 3 = 5, and i ends at 3.
void a1(void)
{
    int i = 1;
    int x = ++i;
    printf("i=%d\n",i);
    printf("x=%d\n",x);
    x = x + ++i;    // Here
    printf("i=%d\n",i);
    printf("x=%d\n",x);
}


// b2: "x = i + ++i;" still reads i and modifies it without sequencing,
// so this variant remains undefined behavior despite the split.
void b2(void)
{
    int i = 1;
    int x = ++i;
    printf("i=%d\n",i);
    printf("x=%d\n",x);
    x = i + ++i;    // Here
    printf("i=%d\n",i);
    printf("x=%d\n",x);
}


int main(void)
{
    a1();
    // b2();
}

Solution 12 - C++

Well, it depends on the design of the compiler; the answer will depend on the way the compiler decodes the statements. Using two different variables, ++x and ++y, to build the logic instead would be a better choice. Note: the output also depends on the language version used by MS Visual Studio, if it has been updated. So if the rules have changed, so will the output.

Solution 13 - C++

Try This

int i = 1;
int i1 = i, i2 = i;   // i1 = i2 = 1
++i1;                 // i1 = 2
++i2;                 // i2 = 2
int x = i1 + i2;      // x = 4

Solution 14 - C++

From this link on order of evaluation:

> Order of evaluation of the operands of any C operator, including the order of evaluation of function arguments in a function-call expression, and the order of evaluation of the subexpressions within any expression is unspecified (except where noted below). The compiler will evaluate them in any order, and may choose another order when the same expression is evaluated again.

From the quote, it is clear that the order of evaluation is unspecified by the C standard. Different compilers implement different orders of evaluation; the compiler is free to evaluate such expressions in any order. That's why different compilers give different output for the expression mentioned in the question.

But, if a sequence point is present between the subexpressions Exp1 and Exp2, then both value computation and side effects of Exp1 are sequenced-before every value computation and side effect of Exp2.
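
As an illustrative sketch (not from the original answer), introducing sequence points by splitting the expression into separate statements makes the result well-defined:

int i = 1;
int a = ++i;   // sequence point at the end of this full expression: i = 2
int b = ++i;   // i = 3
int x = a + b; // well-defined: x = 5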

Solution 15 - C++

In practice, you are invoking undefined behaviour. Anything can happen, not just things that you consider "reasonable", and often things do happen that you don't consider reasonable. Everything is by definition "reasonable".

A very reasonable compilation is that the compiler observes that executing a statement will invoke undefined behaviour, therefore the statement can never be executed, therefore it is translated to an instruction that intentionally crashes your application. That is very reasonable. After all, the compiler knows that this crash can never happen.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type: Original Author (original content on Stackoverflow)
Question: cinnamon (View Question on Stackoverflow)
Solution 1 - C++: Sebastian Redl (View Answer on Stackoverflow)
Solution 2 - C++: dxiv (View Answer on Stackoverflow)
Solution 3 - C++: chux - Reinstate Monica (View Answer on Stackoverflow)
Solution 4 - C++: Daniel R. Collins (View Answer on Stackoverflow)
Solution 5 - C++: MSalters (View Answer on Stackoverflow)
Solution 6 - C++: Damon (View Answer on Stackoverflow)
Solution 7 - C++: gnasher729 (View Answer on Stackoverflow)
Solution 8 - C++: Slavenskij (View Answer on Stackoverflow)
Solution 9 - C++: armagedescu (View Answer on Stackoverflow)
Solution 10 - C++: Simon (View Answer on Stackoverflow)
Solution 11 - C++: John Linq (View Answer on Stackoverflow)
Solution 12 - C++: sam (View Answer on Stackoverflow)
Solution 13 - C++: Ali Chraghi (View Answer on Stackoverflow)
Solution 14 - C++: Krishna Kanth Yenumula (View Answer on Stackoverflow)
Solution 15 - C++: gnasher729 (View Answer on Stackoverflow)