Why is processing a sorted array *slower* than an unsorted array? (Java's ArrayList.indexOf)

JavaPerformanceArraylist

Java Problem Overview


The title is in reference to https://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array

Is this a branch prediction effect, too? Beware: here the processing for the sorted array is slower!!

Consider the following code:

private static final int LIST_LENGTH = 1000 * 1000;
private static final long SLOW_ITERATION_MILLIS = 1000L * 10L;

@Test
public void testBinarySearch() {
	Random r = new Random(0);
	List<Double> list = new ArrayList<>(LIST_LENGTH);
	for (int i = 0; i < LIST_LENGTH; i++) {
		list.add(r.nextDouble());
	}
	//Collections.sort(list);
    // remove possible artifacts due to the sorting call
    // and rebuild the list from scratch:
	list = new ArrayList<>(list);
	
	int nIterations = 0;
	long startTime = System.currentTimeMillis();
	do {
		int index = r.nextInt(LIST_LENGTH);
		assertEquals(index, list.indexOf(list.get(index)));
		nIterations++;
	} while (System.currentTimeMillis() < startTime + SLOW_ITERATION_MILLIS);
	long duration = System.currentTimeMillis() - startTime;
	double slowFindsPerSec = (double) nIterations / duration * 1000;
	System.out.println(slowFindsPerSec);

    ...
}

This prints out a value of around 720 on my machine.

Now if I activate the collections sort call, that value drops down to 142. Why?!?

The results are conclusive, they don't change if I increase the number of iterations/time.

Java version is 1.8.0_71 (Oracle VM, 64 bit), running under Windows 10, JUnit test in Eclipse Mars.

UPDATE

Seems to be related to contiguous memory access (Double objects accessed in sequential order vs in random order). The effect starts vanish for me for array lengths of around 10k and less.

Thanks to assylias for providing the results:

/**
 * Benchmark                     Mode  Cnt  Score   Error  Units
 * SO35018999.shuffled           avgt   10  8.895 ± 1.534  ms/op
 * SO35018999.sorted             avgt   10  8.093 ± 3.093  ms/op
 * SO35018999.sorted_contiguous  avgt   10  1.665 ± 0.397  ms/op
 * SO35018999.unsorted           avgt   10  2.700 ± 0.302  ms/op
 */

Java Solutions


Solution 1 - Java

It looks like caching / prefetching effect.

The clue is that you compare Doubles (objects), not doubles (primitives). When you allocate objects in one thread, they are typically allocated sequentially in memory. So when indexOf scans a list, it goes through sequential memory addresses. This is good for CPU cache prefetching heuristics.

But after you sort the list, you still have to do the same number of memory lookups in average, but this time memory access will be in random order.

UPDATE

Here is the benchmark to prove that the order of allocated objects matters.

Benchmark            (generator)  (length)  (postprocess)  Mode  Cnt  Score   Error  Units
ListIndexOf.indexOf       random   1000000           none  avgt   10  1,243 ± 0,031  ms/op
ListIndexOf.indexOf       random   1000000           sort  avgt   10  6,496 ± 0,456  ms/op
ListIndexOf.indexOf       random   1000000        shuffle  avgt   10  6,485 ± 0,412  ms/op
ListIndexOf.indexOf   sequential   1000000           none  avgt   10  1,249 ± 0,053  ms/op
ListIndexOf.indexOf   sequential   1000000           sort  avgt   10  1,247 ± 0,037  ms/op
ListIndexOf.indexOf   sequential   1000000        shuffle  avgt   10  6,579 ± 0,448  ms/op

Solution 2 - Java

I think we are seeing the effect of memory cache misses:

When you create the unsorted list

for (int i = 0; i < LIST_LENGTH; i++) {
    list.add(r.nextDouble());
}

all the double are most likely allocated in a contiguous memory area. Iterating through this will produce few cache misses.

On the other hand in the sorted list the references point to memory in a chaotic manner.

Now if you create a sorted list with contiguous memory:

Collection.sort(list);
List<Double> list2 = new ArrayList<>();
for (int i = 0; i < LIST_LENGTH; i++) {
  	list2.add(new Double(list.get(i).doubleValue()));
}

this sorted list has the same performance than the original one (my timing).

Solution 3 - Java

As a simple example that confirms the answer by wero and the answer by apangin (+1!): The following does a simple comparison of both options:

  • Creating random numbers, and sorting them optionally
  • Creating sequential numbers, and shuffling them optionally

It is also not implemented as a JMH benchmark, but similar to the original code, with only slight modifications to observe the effect:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SortedListTest
{
    private static final long SLOW_ITERATION_MILLIS = 1000L * 3L;

    public static void main(String[] args)
    {
        int size = 100000;
        testBinarySearchOriginal(size, true);
        testBinarySearchOriginal(size, false);
        testBinarySearchShuffled(size, true);
        testBinarySearchShuffled(size, false);
    }

    public static void testBinarySearchOriginal(int size, boolean sort)
    {
        Random r = new Random(0);
        List<Double> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++)
        {
            list.add(r.nextDouble());
        }
        if (sort)
        {
            Collections.sort(list);
        }
        list = new ArrayList<>(list);

        int count = 0;
        int nIterations = 0;
        long startTime = System.currentTimeMillis();
        do
        {
            int index = r.nextInt(size);
            if (index == list.indexOf(list.get(index)))
            {
                count++;
            }
            nIterations++;
        }
        while (System.currentTimeMillis() < startTime + SLOW_ITERATION_MILLIS);
        long duration = System.currentTimeMillis() - startTime;
        double slowFindsPerSec = (double) nIterations / duration * 1000;

        System.out.printf("Size %8d sort %5s iterations %10.3f count %10d\n",
            size, sort, slowFindsPerSec, count);
    }

    public static void testBinarySearchShuffled(int size, boolean sort)
    {
        Random r = new Random(0);
        List<Double> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++)
        {
            list.add((double) i / size);
        }
        if (!sort)
        {
            Collections.shuffle(list);
        }
        list = new ArrayList<>(list);

        int count = 0;
        int nIterations = 0;
        long startTime = System.currentTimeMillis();
        do
        {
            int index = r.nextInt(size);
            if (index == list.indexOf(list.get(index)))
            {
                count++;
            }
            nIterations++;
        }
        while (System.currentTimeMillis() < startTime + SLOW_ITERATION_MILLIS);
        long duration = System.currentTimeMillis() - startTime;
        double slowFindsPerSec = (double) nIterations / duration * 1000;

        System.out.printf("Size %8d sort %5s iterations %10.3f count %10d\n",
            size, sort, slowFindsPerSec, count);
    }

}

The output on my machine is

Size   100000 sort  true iterations   8560,333 count      25681
Size   100000 sort false iterations  19358,667 count      58076
Size   100000 sort  true iterations  18554,000 count      55662
Size   100000 sort false iterations   8845,333 count      26536

nicely showing that the timings are exactly the opposites of another: If random numbers are sorted, then the sorted version is slower. If sequential numbers are shuffled, then the shuffled version is slower.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser1050755View Question on Stackoverflow
Solution 1 - JavaapanginView Answer on Stackoverflow
Solution 2 - JavaweroView Answer on Stackoverflow
Solution 3 - JavaMarco13View Answer on Stackoverflow