Violating of strict-aliasing in C, even without any casting?

CStrict Aliasing

C Problem Overview


How can *i and u.i print different numbers in this code, even though i is defined as int *i = &u.i;? I can only assuming that I'm triggering UB here, but I can't see how exactly.

(ideone demo replicates if I select 'C' as the language. But as @2501 pointed out, not if 'C99 strict' is the language. But then again, I get the problem with gcc-5.3.0 -std=c99!)

// gcc       -fstrict-aliasing -std=c99   -O2
union
{   
    int i;
    short s;
} u;

int     * i = &u.i;
short   * s = &u.s;

int main()
{   
    *i  = 2;
    *s  = 100;

    printf(" *i = %d\n",  *i); // prints 2
    printf("u.i = %d\n", u.i); // prints 100

    return 0;
}

(gcc 5.3.0, with -fstrict-aliasing -std=c99 -O2, also with -std=c11)

My theory is that 100 is the 'correct' answer, because the write to the union member through the short-lvalue *s is defined as such (for this platform/endianness/whatever). But I think that the optimizer doesn't realize that the write to *s can alias u.i, and therefore it thinks that *i=2; is the only line that can affect *i. Is this a reasonable theory?

If *s can alias u.i, and u.i can alias *i, then surely the compiler should think that *s can alias *i? Shouldn't aliasing be 'transitive'?

Finally, I always had this assumption that strict-aliasing problems were caused by bad casting. But there is no casting in this!

(My background is C++, I'm hoping I'm asking a reasonable question about C here. My (limited) understanding is that, in C99, it is acceptable to write through one union member and then reading through another member of a different type.)

C Solutions


Solution 1 - C

The disrepancy is issued by -fstrict-aliasing optimization option. Its behavior and possible traps are described in GCC documentation:

> Pay special attention to code like this: > > union a_union { > int i; > double d; > }; >
> int f() { > union a_union t; > t.d = 3.0; > return t.i; > } > > The practice of reading from a different union member than the one > most recently written to (called “type-punning”) is common. Even with > -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected. > See Structures unions enumerations and bit-fields implementation. However, this code might > not: > > int f() { > union a_union t; > int* ip; > t.d = 3.0; > ip = &t.i; > return *ip; > }

Note that conforming implementation is perfectly allowed to take advantage of this optimization, as second code example exhibits undefined behaviour. See Olaf's and others' answers for reference.

Solution 2 - C

C standard (i.e. C11, n1570), 6.5p7:

> An object shall have its stored value accessed only by an lvalue expression that has one of the following types: > > - ... > - an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or a character type.

The lvalue expressions of your pointers are not union types, thus this exception does not apply. The compiler is correct exploiting this undefined behaviour.

Make the pointers' types pointers to the union type and dereference with the respective member. That should work:

union {
    ...
} u, *i, *p;

Solution 3 - C

Strict aliasing is underspecified in the C Standard, but the usual interpretation is that union aliasing (which supersedes strict aliasing) is only permitted when the union members are directly accessed by name.

For rationale behind this consider:

void f(int *a, short *b) { 

The intent of the rule is that the compiler can assume a and b don't alias, and generate efficient code in f. But if the compiler had to allow for the fact that a and b might be overlapping union members, it actually couldn't make those assumptions.

Whether or not the two pointers are function parameters or not is immaterial, the strict aliasing rule doesn't differentiate based on that.

Solution 4 - C

This code indeed invokes UB, because you do not respect the strict aliasing rule. n1256 draft of C99 states in 6.5 Expressions §7:

> An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.

Between the *i = 2; and the printf(" *i = %d\n", *i); only a short object is modified. With the help of the strict aliasing rule, the compiler is free to assume that the int object pointed by i has not been changed, and it can directly use a cached value without reloading it from main memory.

It is blatantly not what a normal human being would expect, but the strict aliasing rule was precisely written to allow optimizing compilers to use cached values.

For the second print, unions are referenced in same standard in 6.2.6.1 Representations of types / General §7:

> When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.

So as u.s has been stored, u.i have taken a value unspecified by standard

But we can read later in 6.5.2.3 Structure and union members §3 note 82:

> If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

Although notes are not normative, they do allow better understanding of the standard. When u.s have been stored through the *s pointer, the bytes corresponding to a short have been changed to the 2 value. Assuming a little endian system, as 100 is smaller that the value of a short, the representation as an int should now be 2 as high order bytes were 0.

TL/DR: even if not normative, the note 82 should require that on a little endian system of the x86 or x64 families, printf("u.i = %d\n", u.i); prints 2. But per the strict aliasing rule, the compiler is still allowed to assumed that the value pointed by i has not changed and may print 100

Solution 5 - C

You are probing a somewhat controversial area of the C standard.

This is the strict aliasing rule:

> An object shall have its stored value accessed only by an lvalue > expression that has one of the following types: > > - a type compatible with the effective type of the object, > - a qualified version of a type compatible with the effective type of the object, > - a type that is the signed or unsigned type corresponding to > the effective type of the object, > - a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object, > - an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), > - a character type.

(C2011, 6.5/7)

The lvalue expression *i has type int. The lvalue expression *s has type short. These types are not compatible with each other, nor both compatible with any other particular type, nor does the strict aliasing rule afford any other alternative that allows both accesses to conform if the pointers are aliased.

If at least one of the accesses is non-conforming then the behavior is undefined, so the result you report -- or indeed any other result at all -- is entirely acceptable. In practice, the compiler must produce code that reorders the assignments with the printf() calls, or that uses a previously loaded value of *i from a register instead of re-reading it from memory, or some similar thing.

The aforementioned controversy arises because people will sometimes point to footnote 95:

> If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

Footnotes are informational, however, not normative, so there's really no question which text wins if they conflict. Personally, I take the footnote simply as an implementation guidance, clarifying the meaning of the fact that the storage for union members overlaps.

Solution 6 - C

Looks like this is a result of the optimizer doing its magic.

With -O0, both lines print 100 as expected (assuming little-endian). With -O2, there is some reordering going on.

gdb gives the following output:

(gdb) start
Temporary breakpoint 1 at 0x4004a0: file /tmp/x1.c, line 14.
Starting program: /tmp/x1
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000

Temporary breakpoint 1, main () at /tmp/x1.c:14
14      {
(gdb) step
15          *i  = 2;
(gdb)
18          printf(" *i = %d\n",  *i); // prints 2
(gdb)
15          *i  = 2;
(gdb)
16          *s  = 100;
(gdb)
18          printf(" *i = %d\n",  *i); // prints 2
(gdb)
 *i = 2
19          printf("u.i = %d\n", u.i); // prints 100
(gdb)
u.i = 100
22      }
(gdb)
0x0000003fa441d9f4 in __libc_start_main () from /lib64/libc.so.6
(gdb)

The reason this happens, as others have stated, is because it is undefined behavior to access a variable of one type through a pointer to another type even if the variable in question is part of a union. So the optimizer is free to do as it wishes in this case.

The variable of the other type can only be read directly via a union which guarantees well defined behavior.

What's curious is that even with -Wstrict-aliasing=2, gcc (as of 4.8.4) doesn't complain about this code.

Solution 7 - C

Whether by accident or by design, C89 includes language which has been interpreted in two different ways (along with various interpretations in-between). At issue is the question of when a compiler should be required to recognize that storage used for one type might be accessed via pointers of another. In the example given in the C89 rationale, aliasing is considered between a global variable which is clearly not part of any union and a pointer to a different type, and nothing in the code would suggest that aliasing could occur.

One interpretation horribly cripples the language, while the other would restrict the use of certain optimizations to "non-conforming" modes. If those who didn't to have their preferred optimizations given second-class status had written C89 to unambiguously match their interpretation, those parts of the Standard would have been widely denounced and there would have been some sort of clear recognition of a non-broken dialect of C which would honor the non-crippling interpretation of the given rules.

Unfortunately, what has happened instead is since the rules clearly don't require compiler writers apply a crippling interpretation, most compiler writers have for years simply interpreted the rules in a fashion which retains the semantics that made C useful for systems programming; programmers didn't have any reason to complain that the Standard didn't mandate that compilers behave sensibly because from their perspective it seemed obvious to everyone that they should do so despite the sloppiness of the Standard. Meanwhile, however, some people insist that since the Standard has always allowed compilers to process a semantically-weakened subset of Ritchie's systems-programming language, there's no reason why a standard-conforming compiler should be expected to process anything else.

The sensible resolution for this issue would be to recognize that C is used for sufficiently varied purposes that there should be multiple compilation modes--one required mode would treat all accesses of everything whose address was taken as though they read and write the underlying storage directly, and would be compatible with code which expects any level of pointer-based type punning support. Another mode could be more restrictive than C11 except when code explicitly uses directives to indicate when and where storage that has been used as one type would need to be reinterpreted or recycled for use as another. Other modes would allow some optimizations but support some code that would break under stricter dialects; compilers without specific support for a particular dialect could substitute one with more defined aliasing behaviors.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAaron McDaidView Question on Stackoverflow
Solution 1 - CGrzegorz SzpetkowskiView Answer on Stackoverflow
Solution 2 - Ctoo honest for this siteView Answer on Stackoverflow
Solution 3 - CM.MView Answer on Stackoverflow
Solution 4 - CSerge BallestaView Answer on Stackoverflow
Solution 5 - CJohn BollingerView Answer on Stackoverflow
Solution 6 - CdbushView Answer on Stackoverflow
Solution 7 - CsupercatView Answer on Stackoverflow