Why is -(-2147483648) = - 2147483648 in a 32-bit machine?

C32 BitTwos Complement

C Problem Overview


I think the question is self explanatory, I guess it probably has something to do with overflow but still I do not quite get it. What is happening, bitwise, under the hood?

Why does -(-2147483648) = -2147483648 (at least while compiling in C)?

C Solutions


Solution 1 - C

Negating an (unsuffixed) integer constant:

The expression -(-2147483648) is perfectly defined in C, however it may be not obvious why it is this way.

When you write -2147483648, it is formed as unary minus operator applied to integer constant. If 2147483648 can't be expressed as int, then it s is represented as long or long long* (whichever fits first), where the latter type is guaranteed by the C Standard to cover that value.

To confirm that, you could examine it by:

printf("%zu\n", sizeof(-2147483648));

which yields 8 on my machine.

The next step is to apply second - operator, in which case the final value is 2147483648L (assuming that it was eventually represented as long). If you try to assign it to int object, as follows:

int n = -(-2147483648);

then the actual behavior is implementation-defined. Referring to the Standard:

> ### C11 §6.3.1.3/3 Signed and unsigned integers ### > > Otherwise, the new type is signed and the value cannot be represented > in it; either the result is implementation-defined or an > implementation-defined signal is raised.

The most common way is to simply cut-off the higher bits. For instance, GCC documents it as:

> For conversion to a type of width N, the value is reduced modulo 2^N > to be within range of the type; no signal is raised.

Conceptually, the conversion to type of width 32 can be illustrated by bitwise AND operation:

value & (2^32 - 1) // preserve 32 least significant bits

In accordance with two's complement arithmetic, the value of n is formed with all zeros and MSB (sign) bit set, which represents value of -2^31, that is -2147483648.

Negating an int object:

If you try to negate int object, that holds value of -2147483648, then assuming two's complement machine, the program will exhibit undefined behavior:

n = -n; // UB if n == INT_MIN and INT_MAX == 2147483647

> ### C11 §6.5/5 Expressions### > > If an exceptional condition occurs during the evaluation of an > expression (that is, if the result is not mathematically defined or > not in the range of representable values for its type), the behavior > is undefined.

Additional references:

*) In withdrawed C90 Standard, there was no long long type and the rules were different. Specifically, sequence for unsuffixed decimal was int, long int, unsigned long int (C90 §6.1.3.2 Integer constants).

†) This is due to LLONG_MAX, which must be at least +9223372036854775807 (C11 §5.2.4.2.1/1).

Solution 2 - C

Note: this answer does not apply as such on the obsolete ISO C90 standard that is still used by many compilers

First of all, on C99, C11, the expression -(-2147483648) == -2147483648 is in fact false:

int is_it_true = (-(-2147483648) == -2147483648);
printf("%d\n", is_it_true);

prints

0

So how it is possible that this evaluates to true? The machine is using 32-bit two's complement integers. The 2147483648 is an integer constant that quite doesn't fit in 32 bits, thus it will be either long int or long long int depending on whichever is the first where it fits. This negated will result in -2147483648 - and again, even though the number -2147483648 can fit in a 32-bit integer, the expression -2147483648 consists of a >32-bit positive integer preceded with unary -!

You can try the following program:

#include <stdio.h>

int main() {
    printf("%zu\n", sizeof(2147483647));
    printf("%zu\n", sizeof(2147483648));
    printf("%zu\n", sizeof(-2147483648));
}

The output on such machine most probably would be 4, 8 and 8.

Now, -2147483648 negated will again result in +214783648, which is still of type long int or long long int, and everything is fine.

In C99, C11, the integer constant expression -(-2147483648) is well-defined on all conforming implementations.


Now, when this value is assigned to a variable of type int, with 32 bits and two's complement representation, the value is not representable in it - the values on 32-bit 2's complement would range from -2147483648 to 2147483647.

The C11 standard 6.3.1.3p3 says the following of integer conversions:

> * [When] the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

That is, the C standard doesn't actually define what the value in this case would be, or doesn't preclude the possibility that the execution of the program stops due to a signal being raised, but leaves it to the implementations (i.e. compilers) to decide how to handle it (C11 3.4.1):

> implementation-defined behavior > > unspecified behavior where each implementation documents how the choice is made

and (3.19.1):

> implementation-defined value > > unspecified value where each implementation documents how the choice is made


In your case, the implementation-defined behaviour is that the value is the 32 lowest-order bits [*]. Due to the 2's complement, the (long) long int value 0x80000000 has the bit 31 set and all other bits cleared. In 32-bit two's complement integers the bit 31 is the sign bit - meaning that the number is negative; all value bits zeroed means that the value is the minimum representable number, i.e. INT_MIN.


[*] GCC documents its implementation-defined behaviour in this case as follows:

> The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3). > > For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

Solution 3 - C

This is not a C question, for on a C implementation featuring 32-bit two's complement representation for type int, the effect of applying the unary negation operator to an int having the value -2147483648 is undefined. That is, the C language specifically disavows designating the result of evaluating such an operation.

Consider more generally, however, how the unary - operator is defined in two's complement arithmetic: the inverse of a positive number x is formed by flipping all the bits of its binary representation and adding 1. This same definition serves as well for any negative number that has at least one bit other than its sign bit set.

Minor problems arise, however, for the two numbers that have no value bits set: 0, which has no bits set at all, and the number that has only its sign bit set (-2147483648 in 32-bit representation). When you flip all the bits of either of these, you end up with all value bits set. Therefore, when you subsequently add 1, the result overflows the value bits. If you imagine performing the addition as if the number were unsigned, treating the sign bit as a value bit, then you get

    -2147483648 (decimal representation)
-->  0x80000000 (convert to hex)
-->  0x7fffffff (flip bits)
-->  0x80000000 (add one)
--> -2147483648 (convert to decimal)

Similar applies to inverting zero, but in that case the overflow upon adding 1 overflows the erstwhile sign bit, too. If the overflow is ignored, the resulting 32 low-order bits are all zero, hence -0 == 0.

Solution 4 - C

I'm gonna use a 4-bit number, just to make maths simple, but the idea is the same.

In a 4-bit number, the possible values are between 0000 and 1111. That would be 0 to 15, but if you wanna represent negative numbers, the first bit is used to indicate the sign (0 for positive and 1 for negative).

So 1111 is not 15. As the first bit is 1, it's a negative number. To know its value, we use the two-complement method as already described in previous answers: "invert the bits and add 1":

  • inverting the bits: 0000
  • adding 1: 0001

0001 in binary is 1 in decimal, so 1111 is -1.

The two-complement method goes both ways, so if you use it with any number, it will give you the binary representation of that number with the inverted sign.

Now let's see 1000. The first bit is 1, so it's a negative number. Using the two-complement method:

  • invert the bits : 0111
  • add 1: 1000 (8 in decimal)

So 1000 is -8. If we do -(-8), in binary it means -(1000), which actually means using the two-complement method in 1000. As we saw above, the result is also 1000. So, in a 4-bit number, -(-8) is equals -8.

In a 32-bit number, -2147483648 in binary is 1000..(31 zeroes), but if you use the two-complement method, you'll end up with the same value (the result is the same number).

That's why in 32-bit number -(-2147483648) is equals -2147483648

Solution 5 - C

It depends on the version of C, the specifics of the implementation and whether we are talking about variables or literals values.

The first thing to understand is that there are no negative integer literals in C "-2147483648" is a unary minus operation followed by a positive integer literal.

Lets assume that we are running on a typical 32-bit platform where int and long are both 32 bits and long long is 64 bits and consider the expression.

(-(-2147483648) == -2147483648 )

The compiler needs to find a type that can hold 2147483648, on a comforming C99 compiler it will use type "long long" but a C90 compiler can use type "unsigned long".

If the compiler uses type long long then nothing overflows and the comparision is false. If the compiler uses unsigned long then the unsigned wraparound rules come into play and the comparision is true.

Solution 6 - C

For the same reason that winding a tape deck counter 500 steps forward from 000 (through 001 002 003 ...) will show 500, and winding it backward 500 steps backward from 000 (through 999 998 997 ...) will also show 500.

This is two's complement notation. Of course, since 2's complement sign convention is to consider the topmost bit the sign bit, the result overflows the representable range, just like 2000000000+2000000000 overflows the representable range.

As a result, the processor's "overflow" bit will be set (seeing this requires access to the machine's arithmetic flags, generally not the case in most programming languages outside of assembler). This is the only value which will set the "overflow" bit when negating a 2's complement number: any other value's negation lies in the range representable by 2's complement.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionLesscomfortableView Question on Stackoverflow
Solution 1 - CGrzegorz SzpetkowskiView Answer on Stackoverflow
Solution 2 - CAntti Haapala -- Слава УкраїніView Answer on Stackoverflow
Solution 3 - CJohn BollingerView Answer on Stackoverflow
Solution 4 - Cuser7605325View Answer on Stackoverflow
Solution 5 - CplugwashView Answer on Stackoverflow
Solution 6 - Cuser7624926View Answer on Stackoverflow