Why is AddRange faster than using a foreach loop?
C#.NetC# 4.0C# Problem Overview
var fillData = new List<int>();
for (var i = 0; i < 100000; i++)
fillData.Add(i);
var stopwatch1 = new Stopwatch();
stopwatch1.Start();
var autoFill = new List<int>();
autoFill.AddRange(fillData);
stopwatch1.Stop();
var stopwatch2 = new Stopwatch();
stopwatch2.Start();
var manualFill = new List<int>();
foreach (var i in fillData)
manualFill.Add(i);
stopwatch2.Stop();
When I take 4 results from stopwach1
and stopwach2
, stopwatch1
has always lower value than stopwatch2
. That means addrange
is always faster than foreach
.
Does anyone know why?
C# Solutions
Solution 1 - C#
Potentially, AddRange
can check where the value passed to it implements IList
or IList<T>
. If it does, it can find out how many values are in the range, and thus how much space it needs to allocate... whereas the foreach
loop may need to reallocate several times.
Additionally, even after allocation, List<T>
can use IList<T>.CopyTo
to perform a bulk copy into the underlying array (for ranges which implement IList<T>
, of course.)
I suspect you'll find that if you try your test again but using Enumerable.Range(0, 100000)
for fillData
instead of a List<T>
, the two will take about the same time.
Solution 2 - C#
If you are using Add
, it is resizing the inner array gradually as needed (doubling), from the default starting size of 10 (IIRC). If you use:
var manualFill = new List<int>(fillData.Count);
I expect it'll change radically (no more resizes / data copy).
From reflector, AddRange
does this internally, rather than growing in doubling:
ICollection<T> is2 = collection as ICollection<T>;
if (is2 != null)
{
int count = is2.Count;
if (count > 0)
{
this.EnsureCapacity(this._size + count);
// ^^^ this the key bit, and prevents slow growth when possible ^^^
Solution 3 - C#
Because AddRange
checks size of added items and increases size of internal array only once.
Solution 4 - C#
The dissassembly from reflector for the List
ICollection<T> is2 = collection as ICollection<T>;
if (is2 != null)
{
int count = is2.Count;
if (count > 0)
{
this.EnsureCapacity(this._size + count);
if (index < this._size)
{
Array.Copy(this._items, index, this._items, index + count, this._size - index);
}
if (this == is2)
{
Array.Copy(this._items, 0, this._items, index, index);
Array.Copy(this._items, (int) (index + count), this._items, (int) (index * 2), (int) (this._size - index));
}
else
{
T[] array = new T[count];
is2.CopyTo(array, 0);
array.CopyTo(this._items, index);
}
this._size += count;
}
}
As you can see there are some optimizations like EnsureCapacity() call and using Array.Copy().
Solution 5 - C#
When using AddRange
the Collection can increase the size of the array once and then copy the values into it.
Using a foreach
statement the collection needs to increase size of the collection more than once.
Increasing thr size means copying the complete array which takes time.
Solution 6 - C#
This is like asking the waiter to bring you one beer ten times and asking him to bring you 10 beers at once.
What do you think is faster :)
Solution 7 - C#
i suppose this is the result of optimisation of memory allocation. for AddRange memory allocates only once, and while foreach on each iteration reallocation is done.
also may be there are some optimisations in AddRange implementation (memcpy for example)
Solution 8 - C#
Try out initialize intiial list capacity before manually adding items:
var manualFill = new List<int>(fillData.Count);
Solution 9 - C#
It is because the Foreach loop will add all the values that the loop is getting one a time and
the AddRange() method will gather all the values it is getting as a "chunk" and add that chunk at once to the specified location.
Simply understanding, it is just like you have a list of 10 items to bring from the market, which would be faster bringing all that one by one or all at single time.