Does a[a[0]] = 1 produce undefined behavior?

CLanguage LawyerC99Undefined Behavior

C Problem Overview


Does this C99 code produce undefined behavior?

#include <stdio.h>

int main() {
  int a[3] = {0, 0, 0};
  a[a[0]] = 1;
  printf("a[0] = %d\n", a[0]);
  return 0;
}

In the statement a[a[0]] = 1; , a[0] is both read and modified.

I looked n1124 draft of ISO/IEC 9899. It says (in 6.5 Expressions):

> Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

It does not mention reading an object to determine the object itself to be modified. Thus this statement might produce undefined behavior.

However, I feel it strange. Does this actually produce undefined behavior?

(I also want to know about this problem in other ISO C versions.)

C Solutions


Solution 1 - C

> the prior value shall be read only to determine the value to be stored.

This is a bit vague and caused confusion, which is partly why C11 threw it out and introduced a new sequencing model.

What it is trying to say is that: if reading the old value is guaranteed to occur earlier in time than writing the new value, then that's fine. Otherwise it is UB. And of course it is a requirement that the new value be computed before it is written.

(Of course the description I have just written will be found by some to be more vague than the Standard text!)

For example x = x + 5 is correct because it is not possible to work out x + 5 without first knowing x. However a[i] = i++ is wrong because the read of i on the left hand side is not required in order to work out the new value to store in i. (The two reads of i are considered separately).


Back to your code now. I think it is well-defined behaviour because the read of a[0] in order to determine the array index is guaranteed to occur before the write.

We cannot write until we have determined where to write. And we do not know where to write until after we read a[0]. Therefore the read must come before the write, so there is no UB.

Someone commented about sequence points. In C99 there is no sequence point in this expression, so sequence points do not come into this discussion.

Solution 2 - C

>Does this C99 code produce undefined behavior?

No. It will not produce undefined behavior. a[0] is modified only once between two sequence points (first sequence point is at the end of initializer int a[3] = {0, 0, 0}; and second is after the full expression a[a[0]] = 1).

>It does not mention reading an object to determine the object itself to be modified. Thus this statement might produce undefined behavior.

An object can be read more than once to modify itself and its a perfectly defined behavior. Look at this example

int x = 10;
x = x*x + 2*x + x%5;   

Second statement of the quote says:

>Furthermore, the prior value shall be read only to determine the value to be stored.

All the x in the above expression is read to determine the value of object x itself.


NOTE: Note that there are two parts of the quote mentioned in the question. First part says: Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression., and
therefore the expression like

i = i++;

comes under UB (Two modifications between previous and next sequence points).

Second part says: Furthermore, the prior value shall be read only to determine the value to be stored., and therefore the expressions like

a[i++] = i;
j = (i = 2) + i;  

invoke UB. In both expressions i is modified only once between previous and next sequence points, but the reading of the rightmost i do not determine the value to be stored in i.


In C11 standard this has been changed to

6.5 Expressions:

>If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. [...]

In expression a[a[0]] = 1, there is only one side effect to a[0] and the value computation of index a[0] is sequenced before the value computation of a[a[0]].

Solution 3 - C

C99 presents an enumeration of all the sequence points in annex C. There is one at the end of

a[a[0]] = 1;

because it is a complete expression statement, but there are no sequence points inside. Although logic dictates that the subexpression a[0] must be evaluated first, and the result used to determine to which array element the value is assigned, the sequencing rules do not ensure it. When the initial value of a[0] is 0, a[0] is both read and written between two sequence points, and the read is not for the purpose of determining what value to write. Per C99 6.5/2, the behavior of evaluating the expression is therefore undefined, but in practice I don't think you need to worry about it.

C11 is better in this regard. Section 6.5, paragraph (1) says

> An expression is a sequence of operators and operands that specifies computation of a value, or that designates an object or a function, or that generates side effects, or that performs a combination thereof. The value computations of the operands of an operator are sequenced before the value computation of the result of the operator.

Note in particular the second sentence, which has no analogue in C99. You might think that would be sufficient, but it isn't. It applies to the value computations, but it says nothing about the sequencing of side effects relative to the value computations. Updating the value of the left operand is a side effect, so that extra sentence does not directly apply.

C11 nevertheless comes through for us on this one, as the specifications for the assignment operators provide the needed sequencing (C11 6.5.16(3)):

> [...] The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.

(In contrast, C99 just says that updating the stored value of the left operand happens between the previous and next sequence points.) With sections 6.5 and 6.5.16 together, then, C11 gives a well-defined sequence: the inner [] is evaluated before the outer [], which is evaluated before the stored value is updated. This satisfies C11's version of 6.5(2), so in C11, the behavior of evaluating the expression is defined.

Solution 4 - C

The value is well defined, unless a[0] contains a value that is not a valid array index (i.e. in your code is not negative and does not exceed 3). You could change the code to the more readable and equivalent

 index = a[0];
 a[index] = 1;    /* still UB if index < 0 || index >= 3 */

In the expression a[a[0]] = 1 it is necessary to evaluate a[0] first. If a[0] happens to be zero, then a[0] will be modified. But there is no way for a compiler (short of not complying with the standard) to change order of evaluations and modify a[0] before attempting to read its value.

Solution 5 - C

A side effect includes modification of an object1.

The C standard says that behavior is undefined if a side effect on object is unsequenced with a side effect on the same object or a value computation using the value of the same object2.

The object a[0] in this expression is modified (side effect) and it's value (value computation) is used to determine the index. It would seem this expression yields undefined behavior:

a[a[0]] = 1

However the text in assignment operators in the standard, explains that the value computation of both left and right operands of the operator =, is sequenced before the left operand is modified3.

The behavior is thus defined, as the first rule1 isn't violated, because the modification (side effect) is sequenced after the value computation of the same object.


1 (Quoted from ISO/IEC 9899:201x 5.1.2.3 Program Exectution 2):
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.

2 (Quoted from ISO/IEC 9899:201x 6.5 Expressions 2):
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

3 (Quoted from ISO/IEC 9899:201x 6.5.16 Assignment operators 3):
The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMasaki HaraView Question on Stackoverflow
Solution 1 - CM.MView Answer on Stackoverflow
Solution 2 - ChaccksView Answer on Stackoverflow
Solution 3 - CJohn BollingerView Answer on Stackoverflow
Solution 4 - CPeterView Answer on Stackoverflow
Solution 5 - C2501View Answer on Stackoverflow