The difference between asm, asm volatile and clobbering memory

C Problem Overview

When implementing lock-free data structures and timing code, it's often necessary to suppress the compiler's optimisations. Normally people do this using asm volatile with "memory" in the clobber list, but you sometimes see just asm volatile, or just a plain asm that clobbers memory.

What impact do these different statements have on code generation (particularly in GCC, since the behaviour is unlikely to be portable)?

Just for reference, these are the interesting variations:

asm ("");   // presumably this has no effect on code generation
asm volatile ("");
asm ("" ::: "memory");
asm volatile ("" ::: "memory");

C Solutions

Solution 1 - C

See the "Extended Asm" page in the GCC documentation.

> You can prevent an asm instruction from being deleted by writing the keyword volatile after the asm. [...] The volatile keyword indicates that the instruction has important side-effects. GCC will not delete a volatile asm if it is reachable.

and

> An asm instruction without any output operands will be treated identically to a volatile asm instruction.

None of your examples have output operands specified, so the asm and asm volatile forms behave identically: they create a point in the code which may not be deleted (unless it is proved to be unreachable).

This is not quite the same as doing nothing. See this question for an example of a dummy asm that changes code generation. In that example, a loop that runs 1000 times gets vectorised into code that calculates 16 iterations at once, but the presence of an asm inside the loop inhibits the optimisation, because the asm must be reached 1000 times.
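A minimal sketch of that effect, not the code from the linked question: the function names sum_plain and sum_with_asm are made up for illustration, and __asm__ is used instead of asm only so the snippet also compiles under strict -std modes. Both functions return the same value; the difference, if any, shows up in the generated assembly (compare the output of gcc -O3 -S), and whether the plain loop really is vectorised depends on the GCC version and target.

```c
#include <stddef.h>

/* Hypothetical example: a plain reduction loop. At -O3 GCC is free to
   vectorise or otherwise restructure it. */
unsigned sum_plain(const unsigned *a, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* The same loop with an empty asm in the body. The asm has no output
   operands, so GCC treats it as volatile and must reach it once per
   iteration, which typically prevents vectorisation. */
unsigned sum_with_asm(const unsigned *a, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) {
        __asm__ ("");
        total += a[i];
    }
    return total;
}
```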

The "memory" clobber makes GCC assume that any memory may be arbitrarily read or written by the asm block, so it prevents the compiler from reordering loads or stores across it:

> This will cause GCC to not keep memory values cached in registers across the assembler instruction and not optimize stores or loads to that memory.

(That does not prevent a CPU from reordering loads and stores with respect to another CPU, though; you need real memory barrier instructions for that.)
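The distinction between a compiler-only fence and a real barrier can be sketched as follows. The helper names (compiler_barrier, full_barrier, publish, publish_smp) and the two variables are hypothetical, and atomic_thread_fence from C11's <stdatomic.h> is just one way to request a hardware barrier; it is not something the answer itself prescribes.

```c
#include <stdatomic.h>

/* Compiler-only fence: emits no instruction, but GCC must assume all
   memory may have been read or written, so it will not move loads or
   stores across this point or keep memory values cached in registers. */
static inline void compiler_barrier(void)
{
    __asm__ volatile ("" ::: "memory");
}

/* Compiler and CPU fence: on targets that need one, this also emits a
   real barrier instruction, ordering the accesses as seen by other CPUs. */
static inline void full_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

static int payload;   /* hypothetical data being handed over */
static int ready;     /* hypothetical "data is valid" flag   */

/* Orders the two stores in the generated code only. */
void publish(int value)
{
    payload = value;
    compiler_barrier();
    ready = 1;
}

/* Same sequence with a hardware barrier between the stores. */
void publish_smp(int value)
{
    payload = value;
    full_barrier();
    ready = 1;
}
```

A single-threaded caller cannot observe any difference between the two; the sketch only shows where each kind of barrier goes. Code that really shares payload and ready between threads would normally use atomic types rather than plain int.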

Solution 2 - C

asm ("") does nothing (or at least, it's not supposed to do anything).

asm volatile ("") also does nothing.

asm ("" ::: "memory") is a simple compiler fence.

asm volatile ("" ::: "memory") is, AFAIK, the same as the previous one. The volatile keyword tells the compiler that it's not allowed to move this assembly block. Without volatile, for example, the block may be hoisted out of a loop if the compiler decides that the input values are the same in every invocation. I'm not really sure under what conditions the compiler will decide that it understands enough about the assembly to try to optimize its placement, but the volatile keyword suppresses that entirely. That said, I would be very surprised if the compiler attempted to move an asm statement that had no declared inputs or outputs.

Incidentally, volatile also prevents the compiler from deleting the expression if it decides that the output values are unused. This can only happen if there are output values though, so it doesn't apply to asm ("" ::: "memory").
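To make the output-operand case concrete, here is a small sketch with two hypothetical helpers, pass_through and pass_through_volatile. Each uses an empty asm with a "+r" operand, so the value is returned unchanged, but the compiler has to treat it as if the asm might have modified it. This is an illustration of the rule described above, assuming GCC-style extended asm; it is not code from the answer.

```c
/* Non-volatile asm with an output operand: if the caller ignores the
   result, GCC is allowed to delete the asm altogether, and it may merge
   or hoist repeated uses with identical inputs. */
static inline int pass_through(int x)
{
    __asm__ ("" : "+r"(x));
    return x;
}

/* volatile variant: the asm is kept wherever it is reachable, even if
   the result is never used. */
static inline int pass_through_volatile(int x)
{
    __asm__ volatile ("" : "+r"(x));
    return x;
}
```

In timing code, passing a computed value through the volatile variant is a common way to stop the compiler from discarding the computation being measured.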

Solution 3 - C

To complement Lily Ballard's answer: Visual Studio 2010 offers _ReadBarrier(), _WriteBarrier() and _ReadWriteBarrier() to do the same (VS2010 doesn't allow inline assembly for 64-bit apps).

These don't generate any instructions but affect the behaviour of the compiler. A nice example is here.

MemoryBarrier() generates lock or DWORD PTR [rsp], 0

Content Type	Original Author	Original Content on Stackoverflow
Question	jleahy	View Question on Stackoverflow
Solution 1 - C	Matthew Slattery	View Answer on Stackoverflow
Solution 2 - C	Lily Ballard	View Answer on Stackoverflow
Solution 3 - C	James	View Answer on Stackoverflow
