How do I make an infinite empty loop that won't be optimized away?

CClangLanguage LawyerCompiler Optimization

C Problem Overview


The C11 standard appears to imply that iteration statements with constant controlling expressions should not be optimized out. I'm taking my advice from this answer, which specifically quotes section 6.8.5 from the draft standard:

> An iteration statement whose controlling expression is not a constant expression ... may be assumed by the implementation to terminate.

In that answer it mentions that a loop like while(1) ; should not be subject to optimization.

So...why does Clang/LLVM optimize out the loop below (compiled with cc -O2 -std=c11 test.c -o test)?

#include <stdio.h>

static void die() {
    while(1)
        ;
}

int main() {
    printf("begin\n");
    die();
    printf("unreachable\n");
}

On my machine, this prints out begin, then crashes on an illegal instruction (a ud2 trap placed after die()). On godbolt, we can see that nothing is generated after the call to puts.

It's been a surprisingly difficult task to get Clang to output an infinite loop under -O2 - while I could repeatedly test a volatile variable, that involves a memory read that I don't want. And if I do something like this:

#include <stdio.h>

static void die() {
    while(1)
        ;
}

int main() {
    printf("begin\n");
    volatile int x = 1;
    if(x)
        die();
    printf("unreachable\n");
}

...Clang prints begin followed by unreachable as if the infinite loop never existed.

How do you get Clang to output a proper, no-memory-access infinite loop with optimizations turned on?

C Solutions


Solution 1 - C

The C11 standard says this, 6.8.5/6:

> An iteration statement whose controlling expression is not a constant expression,156) that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.157)

The two foot notes are not normative but provide useful information:

> 156) An omitted controlling expression is replaced by a nonzero constant, which is a constant expression.

> 157) This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.

In your case, while(1) is a crystal clear constant expression, so it may not be assumed by the implementation to terminate. Such an implementation would be hopelessly broken, since "for-ever" loops is a common programming construct.

What happens to the "unreachable code" after the loop is however, as far as I know, not well-defined. However, clang does indeed behave very strange. Comparing the machine code with gcc (x86):

gcc 9.2 -O3 -std=c11 -pedantic-errors

.LC0:
        .string "begin"
main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:.LC0
        call    puts
.L2:
        jmp     .L2

clang 9.0.0 -O3 -std=c11 -pedantic-errors

main:                                   # @main
        push    rax
        mov     edi, offset .Lstr
        call    puts
.Lstr:
        .asciz  "begin"

gcc generates the loop, clang just runs into the woods and exits with error 255.

I'm leaning towards this being non-compliant behavior of clang. Because I tried to expand your example further like this:

#include <stdio.h>
#include <setjmp.h>

static _Noreturn void die() {
    while(1)
        ;
}

int main(void) {
    jmp_buf buf;
    _Bool first = !setjmp(buf);

    printf("begin\n");
    if(first)
    {
      die();
      longjmp(buf, 1);
    }
    printf("unreachable\n");
}

I added C11 _Noreturn in an attempt to help the compiler further along. It should be clear that this function will hang up, from that keyword alone.

setjmp will return 0 upon first execution, so this program should just smash into the while(1) and stop there, only printing "begin" (assuming \n flushes stdout). This happens with gcc.

If the loop was simply removed, it should print "begin" 2 times then print "unreachable". On clang however (godbolt), it prints "begin" 1 time and then "unreachable" before returning exit code 0. That's just plain wrong no matter how you put it.

I can find no case for claiming undefined behavior here, so my take is that this is a bug in clang. At any rate, this behavior makes clang 100% useless for programs like embedded systems, where you simply must be able to rely on eternal loops hanging the program (while waiting for a watchdog etc).

Solution 2 - C

You need to insert an expression that may cause a side-effect.

The simplest solution:

static void die() {
    while(1)
       __asm("");
}

Godbolt link

Solution 3 - C

Other answers already covered ways to make Clang emit the infinite loop, with inline assembly language or other side effects. I just want to confirm that this was indeed a compiler bug. Specifically, it was a long-standing LLVM bug - it applied the C++ concept of "all loops without side-effects must terminate" to languages where it shouldn't, such as C. The bug was finally fixed in LLVM 12.

For example, the Rust programming language also allows infinite loops and uses LLVM as a backend, and it had this same issue.

LLVM 12 added a mustprogress attribute that frontends can omit to indicate when functions don't necessarily return, and clang 12 was updated to account for it. You can see that your example compiles correctly with clang 12.0.0 whereas it did not with clang 11.0.1

Solution 4 - C

This is a Clang bug

... when inlining a function containing an infinite loop. The behaviour is different when while(1); appears directly in main, which smells very buggy to me.

See @Arnavion's answer for a summary and links. The rest of this answer was written before I had confirmation that it was a bug, let alone a known bug.


To answer the title question: How do I make an infinite empty loop that won't be optimized away?? -
make die() a macro, not a function, to work around this bug in Clang 3.9 and later. (Earlier Clang versions either keeps the loop or emits a call to a non-inline version of the function with the infinite loop.) That appears to be safe even if the print;while(1);print; function inlines into its caller (Godbolt). -std=gnu11 vs. -std=gnu99 doesn't change anything.

If you only care about GNU C, P__J__'s __asm__(""); inside the loop also works, and shouldn't hurt optimization of any surrounding code for any compilers that understand it. GNU C Basic asm statements are implicitly volatile, so this counts as a visible side-effect that has to "execute" as many times as it would in the C abstract machine. (And yes, Clang implements the GNU dialect of C, as documented by the GCC manual.)


Some people have argued that it might be legal to optimize away an empty infinite loop. I don't agree1, but even if we accept that, it can't also be legal for Clang to assume statements after the loop are unreachable, and let execution fall off the end of the function into the next function, or into garbage that decodes as random instructions.

(That would be standards-compliant for Clang++ (but still not very useful); infinite loops without any side effects are UB in C++, but not C.
https://stackoverflow.com/questions/16436237/is-while1-undefined-behavior-in-c UB lets the compiler emit basically anything for code on a path of execution that will definitely encounter UB. An asm statement in the loop would avoid this UB for C++. But in practice, Clang compiling as C++ doesn't remove constant-expression infinite empty loops except when inlining, same as when compiling as C.)


Manually inlining while(1); changes how Clang compiles it: infinite loop present in asm. This is what we'd expect from a rules-lawyer POV.

#include <stdio.h>
int main() {
    printf("begin\n");
    while(1);
    //infloop_nonconst(1);
    //infloop();
    printf("unreachable\n");
}

On the Godbolt compiler explorer, Clang 9.0 -O3 compiling as C (-xc) for x86-64:

main:                                   # @main
        push    rax                       # re-align the stack by 16
        mov     edi, offset .Lstr         # non-PIE executable can use 32-bit absolute addresses
        call    puts
.LBB3_1:                                # =>This Inner Loop Header: Depth=1
        jmp     .LBB3_1                   # infinite loop


.section .rodata
 ...
.Lstr:
        .asciz  "begin"

The same compiler with the same options compiles a main that calls infloop() { while(1); } to the same first puts, but then just stops emitting instructions for main after that point. So as I said, execution just falls off the end of the function, into whatever function is next (but with the stack misaligned for function entry so it's not even a valid tailcall).

The valid options would be to

  • emit a label: jmp label infinite loop
  • or (if we accept that the infinite loop can be removed) emit another call to print the 2nd string, and then return 0 from main.

Crashing or otherwise continuing without printing "unreachable" is clearly not ok for a C11 implementation, unless there's UB that I haven't noticed.


Footnote 1:

For the record, I agree with @Lundin's answer which cites the standard for evidence that C11 doesn't allow assumption of termination for constant-expression infinite loops, even when they're empty (no I/O, volatile, synchronization, or other visible side-effects).

This is the set of conditions that would let a loop be compiled to an empty asm loop for a normal CPU. (Even if the body wasn't empty in the source, assignments to variables can't be visible to other threads or signal handlers without data-race UB while the loop is running. So a conforming implementation could remove such loop bodies if it wanted to. Then that leaves the question of whether the loop itself can be removed. ISO C11 explicitly says no.)

Given that C11 singles out that case as one where the implementation can't assume the loop terminates (and that it's not UB), it seems clear they intend the loop to be present at run-time. An implementation that targets CPUs with an execution model that can't do an infinite amount of work in finite time has no justification for removing an empty constant infinite loop. Or even in general, the exact wording is about whether they can be "assumed to terminate" or not. If a loop can't terminate, that means later code is not reachable, no matter what arguments you make about math and infinities and how long it takes to do an infinite amount of work on some hypothetical machine.

Further to that, Clang isn't merely an ISO C compliant DeathStation 9000, it's intended to be useful for real-world low-level systems programming, including kernels and embedded stuff. So whether or not you accept arguments about C11 allowing removal of while(1);, it doesn't make sense that Clang would want to actually do that. If you write while(1);, that probably wasn't an accident. Removal of loops that end up infinite by accident (with runtime variable control expressions) can be useful, and it makes sense for compilers to do that.

It's rare that you want to just spin until the next interrupt, but if you write that in C that's definitely what you expect to happen. (And what does happen in GCC and Clang, except for Clang when the infinite loop is inside a wrapper function).

For example, in a primitive OS kernel, when the scheduler has no tasks to run it might run the idle task. A first implementation of that might be while(1);.

Or for hardware without any power-saving idle feature, that might be the only implementation. (Until the early 2000s, that was I think not rare on x86. Although the hlt instruction did exist, IDK if it saved a meaningful amount of power until CPUs started having low-power idle states.)

Solution 5 - C

Just for the record, Clang also misbehaves with goto:

static void die() {
nasty:
    goto nasty;
}

int main() {
    int x; printf("begin\n");
    die();
    printf("unreachable\n");
}

It produces the same output as in the question, i.e.:

main: # @main
  push rax
  mov edi, offset .Lstr
  call puts
.Lstr:
  .asciz "begin"

I see don't see any way to read this as permitted in C11, which only says:

> 6.8.6.1(2) A goto statement causes an unconditional jump to the statement prefixed by the named label in the enclosing function.

As goto is not an "iteration statement" (6.8.5 lists while, do and for) nothing about the special "termination-assumed" indulgences apply, however you want to read them.

Per original question's Godbolt link compiler is x86-64 Clang 9.0.0 and flags are -g -o output.s -mllvm --x86-asm-syntax=intel -S --gcc-toolchain=/opt/compiler-explorer/gcc-9.2.0 -fcolor-diagnostics -fno-crash-diagnostics -O2 -std=c11 example.c

With others such as x86-64 GCC 9.2 you get the pretty well perfect:

.LC0:
  .string "begin"
main:
  sub rsp, 8
  mov edi, OFFSET FLAT:.LC0
  call puts
.L2:
  jmp .L2

Flags: -g -o output.s -masm=intel -S -fdiagnostics-color=always -O2 -std=c11 example.c

Solution 6 - C

I'll play the devil's advocate and argue that the standard does not explicitly forbid a compiler from optimizing out an infinite loop.

> An iteration statement whose controlling expression is not a constant > expression,156) that performs no input/output operations, does not > access volatile objects, and performs no synchronization or atomic > operations in its body, controlling expression, or (in the case of a > for statement) its expression-3, may be assumed by the implementation > to terminate.157)

Let's parse this. An iteration statement that satisfies certain criteria may be assumed to terminate:

if (satisfiesCriteriaForTerminatingEh(a_loop)) 
    if (whatever_reason_or_just_because_you_feel_like_it)
         assumeTerminates(a_loop);

This doesn't say anything about what happens if the criteria aren't satisfied and assuming that a loop may terminate even then isn't explicitly forbidden as long as other rules of the standard are observed.

do { } while(0) or while(0){} are after all iteration statements (loops) that don't satisfy the criteria that allow a compiler to just assume on a whim that they terminate and yet they obviously do terminate.

But can the compiler just optimize while(1){} out?

5.1.2.3p4 says:

> In the abstract machine, all expressions are evaluated as specified by > the semantics. An actual implementation need not evaluate part of an > expression if it can deduce that its value is not used and that no > needed side effects are produced (including any caused by calling a > function or accessing a volatile object).

This mentions expressions, not statements, so it's not 100% convincing, but it certainly allows calls like:

void loop(void){ loop(); }

int main()
{
    loop();
}

to be skipped. Interestingly, clang does skip it, and gcc doesn't.

Solution 7 - C

I have been convinced this is just a plain old bug. I leave the my tests below and in particular the reference to the discussion in the standard committee for some reasoning I previously had.


I think this is undefined behavior (see end), and Clang just has one implementation. GCC indeed works as you expect, optimizing out only the unreachable print statement but leaving the loop. Some how Clang is oddly making decisions when combining the in-lining and determining what it can do with the loop.

The behavior is extra weird - it removes the final print, so "seeing" the infinite loop, but then getting rid of the loop as well.

It's even worse as far as I can tell. Removing the inline we get:

die: # @die
.LBB0_1: # =>This Inner Loop Header: Depth=1
  jmp .LBB0_1
main: # @main
  push rax
  mov edi, offset .Lstr
  call puts
.Lstr:
  .asciz "begin"

so the function is created, and the call optimized out. This is even more resilient than expected:

#include <stdio.h>

void die(int x) {
    while(x);
}

int main() {
    printf("begin\n");
    die(1);
    printf("unreachable\n");
}

results in a very non-optimal assembly for the function, but the function call is again optimized out! Even worse:

void die(x) {
    while(x++);
}

int main() {
    printf("begin\n");
    die(1);
    printf("unreachable\n");
}

I made a bunch of other test with adding a local variable and increasing it, passing a pointer, using a goto etc... At this point I would give up. If you must use clang

static void die() {
    int volatile x = 1;
    while(x);
}

does the job. It sucks at optimizing (obviously), and leaves in the redundant final printf. At least the program does not halt. Maybe GCC after all?

Addendum

Following discussion with David, I yield that the standard does not say "if the condition is constant, you may not assume the loop terminates". As such, and granted under the standard there is no observable behavior (as defined in the standard), I would argue only for consistency - if a compiler is optimizing out a loop because it assume it terminates, it should not optimize out following statements.

Heck n1528 has these as undefined behavior if I read that right. Specifically

> A major issue for doing so is that it allows code to move across a potentially non-terminating loop

From here I think it can only devolve into a discussion of what we want (expected?) rather than what is allowed.

Solution 8 - C

It seems that this is a bug in the Clang compiler. If there isn't any compulsion on the die() function to be a static function, do away with static and make it inline:

#include <stdio.h>

inline void die(void) {
    while(1)
        ;
}

int main(void) {
    printf("begin\n");
    die();
    printf("unreachable\n");
}

It's working as expected when compiled with the Clang compiler and is portable as well.

Compiler Explorer (godbolt.org) - clang 9.0.0 -O3 -std=c11 -pedantic-errors

main:                                   # @main
        push    rax
        mov     edi, offset .Lstr
        call    puts
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        jmp     .LBB0_1
.Lstr:
        .asciz  "begin"

Solution 9 - C

The following appears to work for me:

#include <stdio.h>

__attribute__ ((optnone))
static void die(void) {
    while (1) ;
}

int main(void) {
    printf("begin\n");
    die();
    printf("unreachable\n");
}

at godbolt

Explicitly telling Clang not to optimize that one function causes an infinite loop to be emitted as expected. Hopefully there's a way to selectively disable particular optimizations instead of just turning them all off like that. Clang still refuses to emit code for the second printf, though. To force it to do that, I had to further modify the code inside main to:

volatile int x = 0;
if (x == 0)
    die();

It looks like you'll need to disable optimizations for your infinite loop function, then ensure that your infinite loop is called conditionally. In the real world, the latter is almost always the case anyway.

Solution 10 - C

A conforming implementation may, and many practical ones do, impose arbitrary limits on how long a program may execute or how many instructions it would execute, and behave in arbitrary fashion if those limits are violated or--under the "as-if" rule--if it determines that they will inevitably be violated. Provided that an implementation can successfully process at least one program that nominally exercises all the limits listed in N1570 5.2.4.1 without hitting any translation limits, the existence of limits, the extent to which they are documented, and the effects of exceeding them, are all Quality of Implementation issues outside the jurisdiction of the Standard.

I think the intention of the Standard is quite clear that compilers shouldn't assume that a while(1) {} loop with no side-effects nor break statements will terminate. Contrary to what some people might think, the authors of the Standard were not inviting compiler writers to be stupid or obtuse. A conforming implementation might usefully to decide to terminate any program which would, if not interrupted, execute more side-effect free instructions than there are atoms in the universe, but a quality implementation shouldn't perform such action on the basis of any assumption about termination but rather on the basis that doing so could be useful, and wouldn't (unlike clang's behavior) be worse than useless.

Solution 11 - C

The loop has no side-effects, and so can be optimized out. The loop is effectively an infinite number of iterations of zero units of work. This is undefined in mathematics and in logic and the standard doesn't say whether an implementation is permitted to complete an infinite number of things if each thing can be done in zero time. Clang's interpretation is perfectly reasonable in treating infinity times zero as zero rather than infinity. The standard doesn't say whether or not an infinite loop can end if all the work in the loops is in fact completed.

The compiler is permitted to optimize out anything that's not observable behavior as defined in the standard. That includes execution time. It is not required to preserve the fact that the loop, if not optimized, would take an infinite amount of time. It is permitted to change that to a much shorter run time -- in fact, that the point of most optimizations. Your loop was optimized.

Even if clang translated the code naively, you could imagine an optimizing CPU that can complete each iteration in half the time the previous iteration took. That would literally complete the infinite loop in a finite amount of time. Does such an optimizing CPU violate the standard? It seems quite absurd to say that an optimizing CPU would violate the standard if it's too good at optimizing. The same is true of a compiler.

Solution 12 - C

I'm sorry if this is absurdly not the case, I stumbled upon this post and I know because my years using Gentoo Linux distro that if you want the compiler to not optimize your code you should use -O0(Zero). I was curious about it, and compiled and ran the above code, and the loop do goes indefinitely. Compiled using clang-9:

cc -O0 -std=c11 test.c -o test

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionnneonneoView Question on Stackoverflow
Solution 1 - CLundinView Answer on Stackoverflow
Solution 2 - C0___________View Answer on Stackoverflow
Solution 3 - CArnavionView Answer on Stackoverflow
Solution 4 - CPeter CordesView Answer on Stackoverflow
Solution 5 - CjonathanjoView Answer on Stackoverflow
Solution 6 - CPSkocikView Answer on Stackoverflow
Solution 7 - CkabanusView Answer on Stackoverflow
Solution 8 - CH.S.View Answer on Stackoverflow
Solution 9 - CbtaView Answer on Stackoverflow
Solution 10 - CsupercatView Answer on Stackoverflow
Solution 11 - CDavid SchwartzView Answer on Stackoverflow
Solution 12 - CFellipe WenoView Answer on Stackoverflow