Difference between if (a - b < 0) and if (a < b)
JavaIf StatementArraylistJava Problem Overview
I was reading Java's ArrayList
source code and noticed some comparisons in if-statements.
In Java 7, the method grow(int)
uses
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
In Java 6, grow
didn't exist. The method ensureCapacity(int)
however uses
if (newCapacity < minCapacity)
newCapacity = minCapacity;
What was the reason behind the change? Was it a performance issue or just a style?
I could imagine that comparing against zero is faster, but performing a complete subtraction just to check whether it's negative seems a bit overkill to me. Also in terms of bytecode, this would involve two instructions (ISUB
and IF_ICMPGE
) instead of one (IFGE
).
Java Solutions
Solution 1 - Java
a < b
and a - b < 0
can mean two different things. Consider the following code:
int a = Integer.MAX_VALUE;
int b = Integer.MIN_VALUE;
if (a < b) {
System.out.println("a < b");
}
if (a - b < 0) {
System.out.println("a - b < 0");
}
When run, this will only print a - b < 0
. What happens is that a < b
is clearly false, but a - b
overflows and becomes -1
, which is negative.
Now, having said that, consider that the array has a length that is really close to Integer.MAX_VALUE
. The code in ArrayList
goes like this:
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
oldCapacity
is really close to Integer.MAX_VALUE
so newCapacity
(which is oldCapacity + 0.5 * oldCapacity
) might overflow and become Integer.MIN_VALUE
(i.e. negative). Then, subtracting minCapacity
underflows back into a positive number.
This check ensures that the if
is not executed. If the code were written as if (newCapacity < minCapacity)
, it would be true
in this case (since newCapacity
is negative) so the newCapacity
would be forced to minCapacity
regardless of the oldCapacity
.
This overflow case is handled by the next if. When newCapacity
has overflowed, this will be true
: MAX_ARRAY_SIZE
is defined as Integer.MAX_VALUE - 8
and Integer.MIN_VALUE - (Integer.MAX_VALUE - 8) > 0
is true
. The newCapacity
is therefore rightly handled: hugeCapacity
method returns MAX_ARRAY_SIZE
or Integer.MAX_VALUE
.
NB: this is what the // overflow-conscious code
comment in this method is saying.
Solution 2 - Java
I found this explanation:
> On Tue, Mar 9, 2010 at 03:02, Kevin L. Stern
> > I did a quick search and it appears that Java is indeed two's complement
> > based. Nonetheless, please allow me to point out that, in general, this
> > type of code worries me since I fully expect that at some point someone will
> > come along and do exactly what Dmytro suggested; that is, someone will
> > change:
> >
> > if (a - b > 0)
> >
> > to
> >
> > if (a > b)
> >
> > and the entire ship will sink. I, personally, like to avoid obscurities
> > such as making integer overflow an essential basis for my algorithm unless
> > there is a good reason to do so. I would, in general, prefer to avoid
> > overflow altogether and to make the overflow scenario more explicit:
> >
> > if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) {
> > // Do something
> > } else {
> > // Do something else
> > }
>
> It's a good point.
>
> In ArrayList
we cannot do this (or at least not compatibly), because
> ensureCapacity
is a public API and effectively already accepts
> negative numbers as requests for a positive capacity that cannot be
> satisfied.
>
> The current API is used like this:
>
> int newcount = count + len;
> ensureCapacity(newcount);
>
> If you want to avoid overflow, you would need to change to something
> less natural like
>
> ensureCapacity(count, len);
> int newcount = count + len;
>
> Anyway, I'm keeping the overflow-conscious code, but adding more
> warning comments, and "out-lining" huge array creation so that
> ArrayList
's code now looks like:
>
>
> /**
> * Increases the capacity of this ArrayList instance, if
> * necessary, to ensure that it can hold at least the number of elements
> * specified by the minimum capacity argument.
> *
> * @param minCapacity the desired minimum capacity
> /
> public void ensureCapacity(int minCapacity) {
> modCount++;
>
> // Overflow-conscious code
> if (minCapacity - elementData.length > 0)
> grow(minCapacity);
> }
>
> /*
> * The maximum size of array to allocate.
> * Some VMs reserve some header words in an array.
> * Attempts to allocate larger arrays may result in
> * OutOfMemoryError: Requested array size exceeds VM limit
> /
> private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
>
> /*
> * Increases the capacity to ensure that it can hold at least the
> * number of elements specified by the minimum capacity argument.
> *
> * @param minCapacity the desired minimum capacity
> */
> private void grow(int minCapacity) {
> // Overflow-conscious code
> int oldCapacity = elementData.length;
> int newCapacity = oldCapacity + (oldCapacity >> 1);
> if (newCapacity - minCapacity < 0)
> newCapacity = minCapacity;
> if (newCapacity - MAX_ARRAY_SIZE > 0)
> newCapacity = hugeCapacity(minCapacity);
>
> // minCapacity is usually close to size, so this is a win:
> elementData = Arrays.copyOf(elementData, newCapacity);
> }
>
> private int hugeCapacity(int minCapacity) {
> if (minCapacity < 0) // overflow
> throw new OutOfMemoryError();
> return (minCapacity > MAX_ARRAY_SIZE) ?
> Integer.MAX_VALUE :
> MAX_ARRAY_SIZE;
> }
>
> Webrev regenerated.
>
> Martin
In Java 6, if you use the API as:
int newcount = count + len;
ensureCapacity(newcount);
And newCount
overflows (this becomes negative), if (minCapacity > oldCapacity)
will return false and you may mistakenly assume that the ArrayList
was increased by len
.
Solution 3 - Java
Looking at the code:
int newCapacity = oldCapacity + (oldCapacity >> 1);
If oldCapacity
is quite large, this will overflow, and newCapacity
will be a negative number. A comparison like newCapacity < oldCapacity
will incorrectly evaluate true
and the ArrayList
will fail to grow.
Instead, the code as written (newCapacity - minCapacity < 0
returns false) will allow the negative value of newCapacity
to be further evaluated in the next line, resulting in recalculating newCapacity
by invoking hugeCapacity
(newCapacity = hugeCapacity(minCapacity);
) to allow for the ArrayList
to grow up to MAX_ARRAY_SIZE
.
This is what the // overflow-conscious code
comment is trying to communicate, though rather obliquely.
So, bottom line, the new comparison protects against allocating an ArrayList
larger than the predefined MAX_ARRAY_SIZE
while allowing it to grow right up to that limit if needed.
Solution 4 - Java
The two forms behave exactly the same unless the expression a - b
overflows, in which case they are opposite. If a
is a large negative, and b
is a large positive, then (a < b)
is clearly true, but a - b
will overflow to become positive, so (a - b < 0)
is false.
If you're familiar with x86 assembly code, consider that (a < b)
is implemented by a jge
, which branches around the body of the if statement when SF = OF. On the other hand, (a - b < 0)
will act like a jns
, which branches when SF = 0. Hence, these behave differently precisely when OF = 1.