ArrayList Vs LinkedList

JavaData StructuresCollectionsArraylistLinked List

Java Problem Overview


I was following a previous post on this that says:

> For LinkedList > > * get is O(n) > * add is O(1) > * remove is O(n) > * Iterator.remove is O(1) > > For ArrayList > > * get is O(1) > * add is O(1) amortized, but O(n) worst-case since the array must be resized and copied > * remove is O(n)

So by looking at this, I concluded that if I've to do just sequential insert in my collection for say 5000000 elements, LinkedList will outclass ArrayList.

And if I've to just fetch the elements from collection by iterating i.e. not grabbing the element in middle, still LinkedList will outclass `ArrayList.

Now to verify my above two statements, I wrote below sample program… But I'm surprised that my above statements were proven wrong.

ArrayList outclassed Linkedlist in both the cases. It took less time than LinkedList for adding as well as fetching them from Collection. Is there anything I'm doing wrong, or the initial statements about LinkedList and ArrayList does not hold true for collections of size 5000000?

I mentioned size, because if I reduce the number of elements to 50000, LinkedList performs better and initial statements hold true.

long nano1 = System.nanoTime();

List<Integer> arr = new ArrayList();
for(int i = 0; i < 5000000; ++i) {
    arr.add(i);
}
System.out.println( (System.nanoTime() - nano1) );

for(int j : arr) {
    ;
}
System.out.println( (System.nanoTime() - nano1) );

long nano2 = System.nanoTime();

List<Integer> arrL = new LinkedList();
for(int i = 0; i < 5000000; ++i) {
    arrL.add(i);
}
System.out.println( (System.nanoTime() - nano2) );

for(int j : arrL) {
    ;
}
System.out.println( (System.nanoTime() - nano2) );

Java Solutions


Solution 1 - Java

Remember that big-O complexity describes asymptotic behaviour and may not reflect actual implementation speed. It describes how the cost of each operation grows with the size of the list, not the speed of each operation. For example, the following implementation of add is O(1) but is not fast:

public class MyList extends LinkedList {
    public void add(Object o) {
        Thread.sleep(10000);
        super.add(o);
    }
}

I suspect in your case ArrayList is performing well because it increases it's internal buffer size fairly aggressively so there will not be a large number of reallocations. When the buffer does not need to be resized ArrayList will have faster adds.

You also need to be very careful when you do this kind of profiling. I'd suggest you change your profiling code to do a warm-up phase (so the JIT has the opportunity to do some optimization without affecting your results) and average the results over a number of runs.

private final static int WARMUP = 1000;
private final static int TEST = 1000;
private final static int SIZE = 500000;

public void perfTest() {
    // Warmup
    for (int i = 0; i < WARMUP; ++i) {
        buildArrayList();
    }
    // Test
    long sum = 0;
    for (int i = 0; i < TEST; ++i) {
        sum += buildArrayList();
    }
    System.out.println("Average time to build array list: " + (sum / TEST));
}

public long buildArrayList() {
    long start = System.nanoTime();
    ArrayList a = new ArrayList();
    for (int i = 0; i < SIZE; ++i) {
        a.add(i);
    }
    long end = System.nanoTime();
    return end - start;
}

... same for buildLinkedList

(Note that sum may overflow and you might be better to use System.currentTimeMillis()).

It's also possible that the compiler is optimizing away your empty get loops. Make sure the loop actually does something to ensure that the right code is getting called.

Solution 2 - Java

This is a bad benchmark IMO.

  • need to repeat in loop multiple times to warm up jvm
  • need to DO something in your iterative loop or it can be optimized array
  • ArrayList resizes, which is costly. If you had constructed ArrayList as new ArrayList(500000) you would construct in one blow, and then all allocations would be quite cheap (one preallocating backed array)
  • You don't specify your memory JVM - it should be run with -xMs == -Xmx (everything preallocated) and sufficiently high that no GC is likely to be triggered
  • This benchmark doesn't cover the most unpleasant aspect of LinkedList - random access. (an iterator isn't necessarily the same thing). If you feed say 10% of the size of a large collection as a random selection of list.get you will find linkedlists are awful for grabbing anything other than the first or last element.

For an arraylist: the jdk get is what you'd expect:

public E get(int index) {
	RangeCheck(index);

	return elementData[index];
}

(basically just return the indexed array element.,

For a linkedlist:

public E get(int index) {
    return entry(index).element;
}

looks similar? Not quite. entry is a method not an primitive array, and look what it has to do:

private Entry<E> entry(int index) {
    if (index < 0 || index >= size)
        throw new IndexOutOfBoundsException("Index: "+index+
                                            ", Size: "+size);
    Entry<E> e = header;
    if (index < (size >> 1)) {
        for (int i = 0; i <= index; i++)
            e = e.next;
    } else {
        for (int i = size; i > index; i--)
            e = e.previous;
    }
    return e;
}

That's right, if you ask for say list.get(250000), it's gotta start at the head and repeatedly iterate through the next element. 250000 accesses or so (there's an optimization in the code where it starts at head or tail depending on which would be less accesses.)

Solution 3 - Java

An ArrayList is a simpler data structure than a LinkedList. An ArrayList has a single array of pointers in contiguous memory locations. It only has to be recreated if the array is expanded beyond its allocated size.

A LinkedList consists of a chain of nodes; each node is separated allocated and has front and back pointers to other nodes.

So what does this mean? Unless you need to insert in the middle, splice, delete in the middle etc. an ArrayList will usually be faster. It needs less memory allocations, has much better locality of reference (which is important for processor caching) etc.

Solution 4 - Java

To understand why the results you got do not contradict the "big O" characterization. We need to go back to first principles; i.e. the definition.

> Let f(x) and g(x) be two functions defined on some subset of the real numbers. One writes > > f(x) = O(g(x)) as x -> infinity > > if and only if, for sufficiently large values of x, f(x) is at most a constant multiplied by g(x) in absolute value. That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that > > |f(x)| <= M |g(x)| for all x > x_0. > > In many contexts, the assumption that we are interested in the growth rate as the variable x goes to infinity is left unstated, and one writes more simply that f(x) = O(g(x)).

So, the statement add1 is O(1), means is that the time cost of an add1 operation on a list of size N tends towards a constant Cadd1 as N tends to infinity.

And the statement add2 is O(1) amortized over N operations, means is that the average time cost of one of a sequence of N add2 operations tends towards a constant Cadd2 as N tends to infinity.

What is does not say is what those constants Cadd1 and Cadd2 are. In fact the reason that LinkedList is slower than ArrayList in your benchmark is that Cadd1 is larger than Cadd2.

The lesson is that big O notation does not predict absolute or even relative performance. All it predicts is the shape of the performance function as the controlling variable gets very large. This is useful to know, but it doesn't tell you everything you need to know.

Solution 5 - Java

1) Underlying Data Structure The first difference between ArrayList and LinkedList comes with the fact that ArrayList is backed by Array while LinkedList is backed by LinkedList. This will lead further differences in performance.

2) LinkedList implements Deque Another difference between ArrayList and LinkedList is that apart from the List interface, LinkedList also implements Deque interface, which provides first in first out operations for add() and poll() and several other Deque functions. 3) Adding elements in ArrayList Adding element in ArrayList is O(1) operation if it doesn't trigger re-size of Array, in which case it becomes O(log(n)), On the other hand appending an element in LinkedList is O(1) operation, as it doesn't require any navigation.

4) Removing element from a position In order to remove an element from a particular index e.g. by calling remove(index), ArrayList performs a copy operation which makes it close to O(n) while LinkedList needs to traverse to that point which also makes it O(n/2), as it can traverse from either direction based upon proximity.

5) Iterating over ArrayList or LinkedList Iteration is the O(n) operation for both LinkedList and ArrayList where n is a number of an element.

6) Retrieving element from a position The get(index) operation is O(1) in ArrayList while its O(n/2) in LinkedList, as it needs to traverse till that entry. Though, in Big O notation O(n/2) is just O(n) because we ignore constants there.

7) Memory LinkedList uses a wrapper object, Entry, which is a static nested class for storing data and two nodes next and previous while ArrayList just stores data in Array.

So memory requirement seems less in the case of ArrayList than LinkedList except for the case where Array performs the re-size operation when it copies content from one Array to another.

If Array is large enough it may take a lot of memory at that point and trigger Garbage collection, which can slow response time.

From all the above differences between ArrayList vs LinkedList, It looks ArrayList is the better choice than LinkedList in almost all cases, except when you do a frequent add() operation than remove(), or get().

It's easier to modify a linked list than ArrayList, especially if you are adding or removing elements from start or end because linked list internally keeps references of those positions and they are accessible in O(1) time.

In other words, you don't need to traverse through the linked list to reach the position where you want to add elements, in that case, addition becomes O(n) operation. For example, inserting or deleting an element in the middle of a linked list.

In my opinion, use ArrayList over LinkedList for most of the practical purpose in Java.

Solution 6 - Java

The big-O-notation is not about absolut timings, but about relative timings, and you can't compare the numbers of one algorithm to another.

You only get information how the same algorithm reacts to increasing or decreasing numbers of tuples.

One algorithm might take an hour for one operation, and 2h for two operations, and is O(n), and another one is O(n) too, and takes one millisecond for one operation, and two milliseconds for two operations.

Another issue if measuring with the JVM is the optimization of the hotspot-compiler. A do-nothing-loop might be eliminated by the JIT-compiler.

A third thing to consider is the OS and JVM, using caches and running the garbage collection meanwhile.

Solution 7 - Java

It's hard to find a good use case for LinkedList. If you only need to make use of the Dequeu interface, you should probably use ArrayDeque. If you really need to use the List interface, you will often hear the suggestion to use always ArrayList because LinkedList behaves really poorly in accessing a random element.

Unfortunately also ArrayList has its performance problems if elements at the beginning or in the middle of the list must be removed or inserted.

There is however a new list implementation called GapList which combines the strengths of both ArrayList and LinkedList. It has been designed as drop-in replacement for both ArrayList and LinkedList and therefore implements both the interfaces List and Deque. Also all public methods provided by ArrayList are implemented (ensureCapacty, trimToSize).

GapList's implementation guarantees efficient random access to elements by index (as ArrayList does) and at the same time efficient adding and removing elements to and from head and tail of the list (as LinkedList does).

You find more information about GapList at http://java.dzone.com/articles/gaplist-%E2%80%93-lightning-fast-list.

Solution 8 - Java

O notation analysis provides important information, but it has it's limitations. By definition O notation analysis considers that every operation takes approximately the same time to execute, which is not true. As @seand pointed out, linked lists internally uses more complex logic to insert and fetch elements (take a look at the source code, you can ctrl+click in your IDE). ArrayList internally only needs to insert elements into an array and increase its size once in a while (which even being an o(n) operation, in practice can be accomplished pretty fast).

Cheers

Solution 9 - Java

You can separate add or remove as a two step operation.

> LinkedList: If you add a element to index n, you can move the pointer from 0 to n-1, then you can perform your so called O(1) add operation. Remove operation is the same.


> ArraryList: ArrayList implements the RandomAccess interface, which means it can access a element in O(1).
If you add a element in index n, it can go to the the n-1 index in O(1), move the elements after n-1, add set the element in the n slot.
The moving operation is performed by a native method called System.arraycopy, it's pretty fast.

public static void main(String[] args) {

    List<Integer> arrayList = new ArrayList<Integer>();
    for (int i = 0; i < 100000; i++) {
        arrayList.add(i);
    }

    List<Integer> linkList = new LinkedList<Integer>();

    long start = 0;
    long end = 0;
    Random random = new Random();

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        linkList.add(random.nextInt(100000), 7);
    }
    end = System.currentTimeMillis();
    System.out.println("LinkedList add ,random index" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        arrayList.add(random.nextInt(100000), 7);
    }
    end = System.currentTimeMillis();
    System.out.println("ArrayList add ,random index" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        linkList.add(0, 7);
    }
    end = System.currentTimeMillis();
    System.out.println("LinkedList add ,index == 0" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        arrayList.add(0, 7);
    }
    end = System.currentTimeMillis();
    System.out.println("ArrayList add ,index == 0" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        linkList.add(i);
    }
    end = System.currentTimeMillis();
    System.out.println("LinkedList add ,index == size-1" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        arrayList.add(i);
    }
    end = System.currentTimeMillis();
    System.out.println("ArrayList add ,index == size-1" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        linkList.remove(Integer.valueOf(random.nextInt(100000)));
    }
    end = System.currentTimeMillis();
    System.out.println("LinkedList remove ,random index" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        arrayList.remove(Integer.valueOf(random.nextInt(100000)));
    }
    end = System.currentTimeMillis();
    System.out.println("ArrayList remove ,random index" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        linkList.remove(0);
    }
    end = System.currentTimeMillis();
    System.out.println("LinkedList remove ,index == 0" + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 10000; i++) {
        arrayList.remove(0);
    }
    end = System.currentTimeMillis();
    System.out.println("ArrayList remove ,index == 0" + (end - start));
    
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionVickyView Question on Stackoverflow
Solution 1 - JavaCameron SkinnerView Answer on Stackoverflow
Solution 2 - JavaMJBView Answer on Stackoverflow
Solution 3 - JavaseandView Answer on Stackoverflow
Solution 4 - JavaStephen CView Answer on Stackoverflow
Solution 5 - JavaAnjali SumanView Answer on Stackoverflow
Solution 6 - Javauser unknownView Answer on Stackoverflow
Solution 7 - JavaThomas MauchView Answer on Stackoverflow
Solution 8 - JavaAnthony AcciolyView Answer on Stackoverflow
Solution 9 - JavadingjshView Answer on Stackoverflow