For { A=a; B=b; }, will "A=a" be strictly executed before "B=b"?

Tags: C++, C, Optimization, Compiler Construction, Standards

C++ Problem Overview


Suppose A, B, a, and b are all variables, and the addresses of A, B, a, and b are all different. Then, for the following code:

A = a;
B = b;

Do the C and C++ standards explicitly require that A=a be strictly executed before B=b? Given that the addresses of A, B, a, and b are all different, are compilers allowed to swap the execution order of the two statements for some purpose such as optimization?

If the answer to my question is different in C and C++, I would like to know both.

Edit: The background of the question is the following. In board game AI design, people use a lock-less shared hash table for optimization, and its correctness strongly depends on the execution order if we do not add a volatile restriction.

C++ Solutions


Solution 1 - C++

Both standards allow these instructions to be performed out of order, so long as that does not change observable behaviour. This is known as the as-if rule.

Note that, as pointed out in the comments, "observable behaviour" here means the observable behaviour of a program with defined behaviour. If your program has undefined behaviour, the compiler is excused from reasoning about it.
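Since the question mentions a volatile restriction, here is a minimal sketch of how volatile interacts with the as-if rule. Accesses to volatile objects count as observable behaviour, so the compiler has to keep the two stores in program order. The function name `store_both` is hypothetical, and the globals simply mirror the question's A and B. This constrains only the compiler: volatile does not make the accesses atomic and does not stop the processor from reordering them, so on its own it is not a substitute for proper synchronization between threads.

```cpp
#include <cassert>

// Hypothetical globals mirroring the question's A and B.
volatile int A = 0;
volatile int B = 0;

// Hypothetical helper: performs the two assignments from the question.
void store_both(int a, int b) {
    A = a;  // volatile store: counts as observable behaviour
    B = b;  // the compiler must not move this volatile store above the one to A
}
```

Calling `store_both(1, 2)` leaves A equal to 1 and B equal to 2, exactly as with non-volatile variables. The only difference is that the compiler is no longer free to swap the two stores.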

Solution 2 - C++

The compiler is only obligated to emulate the observable behavior of a program, so a reordering is allowed as long as it does not violate that principle. This assumes the behavior is well defined. If your program contains undefined behavior, such as a data race, then its behavior will be unpredictable and, as noted in the comments, you would need some form of synchronization to protect the critical section.
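As a rough illustration of "some form of synchronization", the sketch below protects both assignments with a `std::mutex`. The names `ab_mutex`, `write_pair`, and `read_pair` are hypothetical and not from the question. Any thread that takes the lock after the writer released it sees both stores, regardless of how the compiler or processor ordered them inside the critical section.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

int A = 0;
int B = 0;
std::mutex ab_mutex;  // hypothetical mutex guarding A and B together

// Writes both values inside one critical section.
void write_pair(int a, int b) {
    std::lock_guard<std::mutex> lock(ab_mutex);
    A = a;
    B = b;  // order inside the lock is invisible to other lock holders
}

// Reads both values inside one critical section.
std::pair<int, int> read_pair() {
    std::lock_guard<std::mutex> lock(ab_mutex);
    return {A, B};
}
```

A reader that calls `read_pair()` can never observe a half-updated pair, which is the property a shared hash table entry usually needs. The cost is that this is no longer lock-free.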

A Useful reference

An interesting article that covers this is Memory Ordering at Compile Time and it says:

> The cardinal rule of memory reordering, which is universally followed by compiler developers and CPU vendors, could be phrased as follows:
>
> > Thou shalt not modify the behavior of a single-threaded program.

An Example

The article provides a simple program where we can see this reordering:

int A, B;  // Note: static storage duration so initialized to zero

void foo()
{
    A = B + 1;
    B = 0;
}

The article shows that at higher optimization levels the store B = 0 is performed before the store to A. We can reproduce this result using godbolt, which with -O3 produces the following (see it live):

movl	$0, B(%rip)	#, B
addl	$1, %eax	#, D.1624
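One possible way to keep the compiler from hoisting the store to B is a compiler-only fence between the two statements. The sketch below assumes C++11 and uses `std::atomic_signal_fence`, which suppresses compiler reordering across the fence but emits no CPU fence instruction. The function name `foo_with_barrier` is hypothetical. This does not by itself make the code safe to share between threads.

```cpp
#include <atomic>
#include <cassert>

int A, B;  // static storage duration, so initialized to zero

// Hypothetical variant of foo() with a compiler-level barrier.
void foo_with_barrier() {
    A = B + 1;
    // Compiler-only fence: memory accesses are not moved across this
    // point by the compiler; no hardware fence instruction is emitted.
    std::atomic_signal_fence(std::memory_order_seq_cst);
    B = 0;
}
```

The single-threaded result is unchanged: if B holds 5 before the call, A holds 6 and B holds 0 afterwards. Only the order of the generated stores is constrained.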

Why?

Why does the compiler reorder? The article explains it is exactly the same reason the processor does so, because of complexity of the architecture:

> As I mentioned at the start, the compiler modifies the order of memory interactions for the same reason that the processor does it – performance optimization. Such optimizations are a direct consequence of modern CPU complexity.

Standards

In the draft C++ standard this is covered in section 1.9 Program execution which says (emphasis mine going forward):

> The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.[5]

Footnote 5 tells us this is also known as the as-if rule:

> This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance, an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the observable behavior of the program are produced.
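The last sentence of the footnote can be illustrated with a small sketch. The function name `scaled` and the variable `unused` are hypothetical. The value of `unused` is never read and computing it has no side effects, so an implementation is free to skip that computation entirely. The result of the function is the same either way.

```cpp
#include <cassert>

// Hypothetical function with a computation whose value is never used.
int scaled(int x) {
    int unused = x + 7;  // no side effects and never read: may be dropped
    (void)unused;        // silences unused-variable warnings only
    return x * 2;        // the only part that affects the result
}
```

Whether the compiler actually removes the dead computation depends on the optimization level, which is exactly the freedom the as-if rule grants.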

The draft C99 and draft C11 standards cover this in section 5.1.2.3 Program execution, although we have to go to the index to see that it is called the as-if rule in the C standard as well:

> as-if rule, 5.1.2.3

Update on Lock-Free considerations

The article An Introduction to Lock-Free Programming covers this topic well, and for the OP's concerns about a lock-less shared hash table implementation this section is probably the most relevant:

> Memory Ordering
>
> As the flowchart suggests, any time you do lock-free programming for multicore (or any symmetric multiprocessor), and your environment does not guarantee sequential consistency, you must consider how to prevent memory reordering.
>
> On today’s architectures, the tools to enforce correct memory ordering generally fall into three categories, which prevent both compiler reordering and processor reordering:
>
> - A lightweight sync or fence instruction, which I’ll talk about in future posts;
> - A full memory fence instruction, which I’ve demonstrated previously;
> - Memory operations which provide acquire or release semantics.
>
> Acquire semantics prevent memory reordering of operations which follow it in program order, and release semantics prevent memory reordering of operations preceding it. These semantics are particularly suitable in cases when there’s a producer/consumer relationship, where one thread publishes some information and the other reads it. I’ll also talk about this more in a future post.
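The producer/consumer relationship described in the quote can be sketched with C++11 atomics. All names here (`payload`, `ready`, `producer`, `consumer`, `publish_and_read`) are hypothetical. `payload` plays the role of A and `ready` the role of B from the question. The release store keeps the write to `payload` from being reordered after the flag, and the acquire load keeps the read of `payload` from being reordered before it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain data, like A in the question
std::atomic<bool> ready{false};  // publication flag, like B in the question

void producer(int value) {
    payload = value;                               // (1) ordinary store
    ready.store(true, std::memory_order_release);  // (2) (1) cannot move below this
}

int consumer() {
    // Spin until the flag is published; later reads cannot move above this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // sees the value written in (1)
}

// Hypothetical driver: runs one producer and one consumer, returns what was read.
int publish_and_read(int value) {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer([value] { producer(value); });
    writer.join();
    reader.join();
    return seen;
}
```

With this pairing, `publish_and_read(42)` returns 42. Without the release/acquire pair (for example with plain `int` and `bool`), the program would contain a data race and the as-if rule would no longer protect the ordering.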

Solution 3 - C++

If there is no dependency between the instructions, they may be executed out of order, provided the final outcome is not affected. You can observe this while debugging code compiled at a higher optimization level.

Solution 4 - C++

Since A = a; and B = b; are independent in terms of data dependencies, the order should not matter. If the output of one instruction affected the input of a subsequent instruction, then the ordering would matter; otherwise it does not. Normally, execution is strictly sequential.

Solution 5 - C++

My reading is that this is required to work by the C++ standard. However, if you're trying to use this for multithreading control, it doesn't work in that context, because nothing here guarantees that the registers get written to memory in the right order.

As your edit indicates, you are trying to use it exactly where it will not work.

Solution 6 - C++

It may be of interest that if you do this:

{ A=a, B=b; /*etc*/ }

Note the comma in place of the semi-colon.

Then the C++ specification and any conforming compiler must treat the left operand as sequenced before the right one, because operands of the comma operator are always evaluated left to right. However, this ordering is defined in terms of the abstract machine, just as it is for two statements separated by a semicolon. Under the as-if rule the optimizer may still reorder the two stores when that does not change observable behavior, so the comma is not a barrier and cannot be relied on to protect thread synchronization.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | ACcreator | View Question on Stackoverflow |
| Solution 1 - C++ | David Heffernan | View Answer on Stackoverflow |
| Solution 2 - C++ | Shafik Yaghmour | View Answer on Stackoverflow |
| Solution 3 - C++ | Mohit Jain | View Answer on Stackoverflow |
| Solution 4 - C++ | Dr. Debasish Jana | View Answer on Stackoverflow |
| Solution 5 - C++ | Joshua | View Answer on Stackoverflow |
| Solution 6 - C++ | Matthew Faithfulll | View Answer on Stackoverflow |