Why are multi-dimensional arrays in .NET slower than normal arrays?

.Net, Performance, Arrays

.Net Problem Overview


Edit: I apologize everybody. I used the term "jagged array" when I actually meant to say "multi-dimensional array" (as can be seen in my example below). I apologize for using the incorrect name. I actually found jagged arrays to be faster than multi-dimensional ones! I have added my measurements for jagged arrays.

I was trying to use a multi-dimensional array today when I noticed that its performance is not what I would have expected. Using a single-dimensional array and manually calculating indices was much faster (almost two times) than using a 2D array. I wrote a test using 1024*1024 arrays (initialized to random values), run for 1000 iterations, and I got the following results on my machine:

sum(double[], int): 2738 ms (100%)
sum(double[,]):     5019 ms (183%)
sum(double[][]):    2540 ms ( 93%)

This is my test code:

public static double sum(double[] d, int l1) {
    // assuming the array is rectangular
    double sum = 0;
    int l2 = d.Length / l1;
    for (int i = 0; i < l1; ++i)
        for (int j = 0; j < l2; ++j)
            sum += d[i * l2 + j];
    return sum;
}

public static double sum(double[,] d) {
    double sum = 0;
    int l1 = d.GetLength(0);
    int l2 = d.GetLength(1);
    for (int i = 0; i < l1; ++i)
        for (int j = 0; j < l2; ++j)
            sum += d[i, j];
    return sum;
}

public static double sum(double[][] d) {
    double sum = 0;
    for (int i = 0; i < d.Length; ++i)
        for (int j = 0; j < d[i].Length; ++j)
            sum += d[i][j];
    return sum;
}

public static void Main() {
    Random random = new Random();
    const int l1  = 1024, l2 = 1024;
    double[ ] d1  = new double[l1 * l2];
    double[,] d2  = new double[l1 , l2];
    double[][] d3 = new double[l1][];

    for (int i = 0; i < l1; ++i) {
        d3[i] = new double[l2];
        for (int j = 0; j < l2; ++j)
            d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble();
    }
    //
    const int iterations = 1000;
    TestTime(sum, d1, l1, iterations);
    TestTime(sum, d2, iterations);
    TestTime(sum, d3, iterations);
}

Further investigation showed that the IL for the second method is 23% larger than that of the first method (code size 68 vs. 52), mostly due to calls to System.Array::GetLength(int). The compiler also emits calls to Array::Get for the multi-dimensional array, whereas it simply emits ldelem for the simple array.

So I am wondering: why is access through multi-dimensional arrays slower than through normal arrays? I would have assumed the compiler (or JIT) would do something similar to what I did in my first method, but this was not actually the case.

Could you please help me understand why this is happening?


Update: Following Henk Holterman's suggestion, here is the implementation of TestTime:

public static void TestTime<T, TR>(Func<T, TR> action, T obj,
                                   int iterations)
{
    Stopwatch stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; ++i)
        action(obj);
    Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}

public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1,
                                        T2 obj2, int iterations)
{
    Stopwatch stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; ++i)
        action(obj1, obj2);
    Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}

.Net Solutions


Solution 1 - .Net

Single-dimensional arrays with a lower bound of 0 are a different type from multi-dimensional or non-zero-lower-bound arrays within IL (vector vs. array, IIRC). A vector is simpler to work with: to get to element x, you just do pointer + size * x. For an array, you have to do pointer + size * (x - lower bound) for a single-dimensional array, and yet more arithmetic for each dimension you add.

Basically the CLR is optimised for the vastly more common case.
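
To make that arithmetic concrete, here is a rough sketch of the offsets involved. These helper methods are purely illustrative (they are not the CLR's actual code), and real array access also performs a bounds check per index:

// Vector (double[]): element i sits at dataStart + i * sizeof(double).
static long VectorOffset(int i)
    => (long)i * sizeof(double);

// General 2-D array (double[,]) with lengths l1 x l2 and lower bounds lb0, lb1:
// dataStart + ((i - lb0) * l2 + (j - lb1)) * sizeof(double).
static long Array2DOffset(int i, int j, int l2, int lb0 = 0, int lb1 = 0)
    => ((long)(i - lb0) * l2 + (j - lb1)) * sizeof(double);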

Solution 2 - .Net

Array bounds checking?

The single-dimension array has a length member that you access directly - when compiled this is just a memory read.

The multidimensional array requires a GetLength(int dimension) method call that processes the argument to get the relevant length for that dimension. That doesn't compile down to a memory read, so you get a method call, etc.

In addition, that GetLength(int dimension) call will do a bounds check on its parameter.
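
A small sketch of that difference, under the usual codegen assumptions (the method and variable names are only for illustration):

static void LengthDemo()
{
    double[]  v = new double[16];
    double[,] m = new double[4, 4];

    int a = v.Length;        // single-dimensional: a direct length read (the ldlen instruction)
    int b = m.GetLength(0);  // multi-dimensional: a call into System.Array that also validates the dimension argument
    int c = m.Length;        // note: for double[,] this is the total element count (16 here), not a per-dimension length
}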

Solution 3 - .Net

Interestingly, I ran the code from above using VS2008, .NET 3.5 SP1, Win32, on a Vista box, and in release/optimized builds the difference was barely measurable, while in debug/unoptimized builds the multi-dimensional arrays were much slower. (I ran the three tests twice to reduce JIT effects on the second set.)

Here are my numbers:

    sum took 00:00:04.3356535
    sum took 00:00:04.1957663
    sum took 00:00:04.5523050
    sum took 00:00:04.0183060
    sum took 00:00:04.1785843
    sum took 00:00:04.4933085

Look at the second set of three numbers: the difference is not enough for me to code everything in single-dimension arrays.

Although I haven't posted them, in debug/unoptimized builds the difference between multi-dimensional and single/jagged arrays is huge.

Full program:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

namespace single_dimension_vs_multidimension
{
    class Program
    {
       

        public static double sum(double[] d, int l1) {
            // assuming the array is rectangular
            double sum = 0;
            int l2 = d.Length / l1;
            for (int i = 0; i < l1; ++i)
                for (int j = 0; j < l2; ++j)
                    sum += d[i * l2 + j];
            return sum;
        }

        public static double sum(double[,] d) {
            double sum = 0;
            int l1 = d.GetLength(0);
            int l2 = d.GetLength(1);
            for (int i = 0; i < l1; ++i)
                for (int j = 0; j < l2; ++j)
                    sum += d[i, j];
            return sum;
        }

        public static double sum(double[][] d) {
            double sum = 0;
            for (int i = 0; i < d.Length; ++i)
                for (int j = 0; j < d[i].Length; ++j)
                    sum += d[i][j];
            return sum;
        }

        public static void TestTime<T, TR>(Func<T, TR> action, T obj, int iterations) {
            Stopwatch stopwatch = Stopwatch.StartNew();
            for (int i = 0; i < iterations; ++i)
                action(obj);
            Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
        }

        public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1, T2 obj2, int iterations) {
            Stopwatch stopwatch = Stopwatch.StartNew();
            for (int i = 0; i < iterations; ++i)
                action(obj1, obj2);
            Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
        }

        public static void Main() {
            Random random = new Random();
            const int l1 = 1024, l2 = 1024;
            double[]   d1 = new double[l1 * l2];
            double[,]  d2 = new double[l1, l2];
            double[][] d3 = new double[l1][];

            for (int i = 0; i < l1; ++i) {
                d3[i] = new double[l2];
                for (int j = 0; j < l2; ++j)
                    d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble();
            }

            const int iterations = 1000;

            TestTime<double[], int, double>(sum, d1, l1, iterations);
            TestTime<double[,], double>(sum, d2, iterations);
            TestTime<double[][], double>(sum, d3, iterations);

            TestTime<double[], int, double>(sum, d1, l1, iterations);
            TestTime<double[,], double>(sum, d2, iterations);
            TestTime<double[][], double>(sum, d3, iterations);
        }
        
    }
}

Solution 4 - .Net

Because a multidimensional array is really just syntactic sugar: under the hood it is a flat array with some index-calculation magic. A jagged array, on the other hand, is literally an array of arrays. With a two-dimensional array, accessing an element requires reading memory just once, while with a two-level jagged array you need to read memory twice.

EDIT: Apparently the original poster mixed up "jagged arrays" with "multi-dimensional arrays", so my reasoning doesn't exactly apply. For the real reason, see Jon Skeet's heavy-artillery answer above (Solution 1).
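
A minimal sketch of the two access patterns (the method and variable names here are only illustrative):

// Jagged: two dependent loads per element, unless the inner row is hoisted into a local.
static double GetJagged(double[][] d, int i, int j)
{
    double[] row = d[i];   // first load: the reference to the inner array
    return row[j];         // second load: the element itself
}

// Rectangular: one object and one computed offset (roughly i * width + j), plus bounds checks.
static double GetRectangular(double[,] d, int i, int j)
{
    return d[i, j];
}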

Solution 5 - .Net

Which is fastest depends on your array sizes.

(The original answer includes a chart of these results; the same numbers appear in the table below.)

Console result:
// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.997 (1909/November2018Update/19H2)
Intel Core i7-6700HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.302
  [Host]        : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), X64 RyuJIT
  .NET Core 3.1 : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), X64 RyuJIT

Job=.NET Core 3.1  Runtime=.NET Core 3.1

|           Method |    D |            Mean |         Error |        StdDev |      Gen 0 |     Gen 1 |     Gen 2 |  Allocated |
|----------------- |----- |----------------:|--------------:|--------------:|-----------:|----------:|----------:|-----------:|
| 'double[D1][D2]' |   10 |        376.2 ns |       7.57 ns |      12.00 ns |     0.3643 |         - |         - |     1144 B |
| 'double[D1, D2]' |   10 |        325.5 ns |       3.71 ns |       3.47 ns |     0.2675 |         - |         - |      840 B |
| 'double[D1][D2]' |   50 |      4,821.4 ns |      44.71 ns |      37.34 ns |     6.8893 |         - |         - |    21624 B |
| 'double[D1, D2]' |   50 |      5,834.1 ns |      64.35 ns |      60.20 ns |     6.3629 |         - |         - |    20040 B |
| 'double[D1][D2]' |  100 |     19,124.4 ns |     230.39 ns |     454.77 ns |    26.2756 |    0.7019 |         - |    83224 B |
| 'double[D1, D2]' |  100 |     23,561.4 ns |     299.18 ns |     279.85 ns |    24.9939 |         - |         - |    80040 B |
| 'double[D1][D2]' |  500 |  1,248,458.7 ns |  11,241.19 ns |  10,515.01 ns |   322.2656 |  160.1563 |         - |  2016025 B |
| 'double[D1, D2]' |  500 |    966,940.8 ns |   5,694.46 ns |   5,326.60 ns |   303.7109 |  303.7109 |  303.7109 |  2000034 B |
| 'double[D1][D2]' | 1000 |  8,987,202.8 ns |  97,133.16 ns |  90,858.41 ns |  1421.8750 |  578.1250 |  265.6250 |  8032582 B |
| 'double[D1, D2]' | 1000 |  3,628,421.3 ns |  72,240.02 ns | 177,206.01 ns |   179.6875 |  179.6875 |  179.6875 |  8000036 B |
| 'double[D1][D2]' | 1500 | 26,496,994.4 ns | 380,625.25 ns | 356,037.09 ns |  3406.2500 | 1500.0000 |  531.2500 | 18048064 B |
| 'double[D1, D2]' | 1500 | 12,417,733.7 ns | 243,802.76 ns | 260,866.22 ns |   156.2500 |  156.2500 |  156.2500 | 18000038 B |
| 'double[D1][D2]' | 3000 | 86,943,097.4 ns | 485,339.32 ns | 405,280.31 ns | 12833.3333 | 7000.0000 | 1333.3333 | 72096325 B |
| 'double[D1, D2]' | 3000 | 57,969,405.9 ns | 393,463.61 ns | 368,046.11 ns |   222.2222 |  222.2222 |  222.2222 | 72000100 B |

// * Hints *
Outliers
  MultidimensionalArrayBenchmark.'double[D1][D2]': .NET Core 3.1 -> 1 outlier  was  removed (449.71 ns)
  MultidimensionalArrayBenchmark.'double[D1][D2]': .NET Core 3.1 -> 2 outliers were removed, 3 outliers were detected (4.75 us, 5.10 us, 5.28 us)
  MultidimensionalArrayBenchmark.'double[D1][D2]': .NET Core 3.1 -> 13 outliers were removed (21.27 us..30.62 us)
  MultidimensionalArrayBenchmark.'double[D1, D2]': .NET Core 3.1 -> 1 outlier  was  removed (4.19 ms)
  MultidimensionalArrayBenchmark.'double[D1, D2]': .NET Core 3.1 -> 3 outliers were removed, 4 outliers were detected (11.41 ms, 12.94 ms..13.61 ms)
  MultidimensionalArrayBenchmark.'double[D1][D2]': .NET Core 3.1 -> 2 outliers were removed (88.68 ms, 89.27 ms)

// * Legends *
  D         : Value of the 'D' parameter
  Mean      : Arithmetic mean of all measurements
  Error     : Half of 99.9% confidence interval
  StdDev    : Standard deviation of all measurements
  Gen 0     : GC Generation 0 collects per 1000 operations
  Gen 1     : GC Generation 1 collects per 1000 operations
  Gen 2     : GC Generation 2 collects per 1000 operations
  Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  1 ns      : 1 Nanosecond (0.000000001 sec)

Benchmark code:

[SimpleJob(BenchmarkDotNet.Jobs.RuntimeMoniker.NetCoreApp31)]
[MemoryDiagnoser]
public class MultidimensionalArrayBenchmark {
    [Params(10, 50, 100, 500, 1000, 1500, 3000)]
    public int D { get; set; }

    [Benchmark(Description = "double[D1][D2]")]
    public double[][] JaggedArray() {
        var array = new double[D][];
        for (int i = 0; i < array.Length; i++) {
            var subArray = new double[D];
            array[i] = subArray;

            for (int j = 0; j < subArray.Length; j++) {
                subArray[j] = j + i * 10;
            }
        }

        return array;
    }

    [Benchmark(Description = "double[D1, D2]")]
    public double[,] MultidimensionalArray() {
        var array = new double[D, D];
        for (int i = 0; i < D; i++) {
            for (int j = 0; j < D; j++) {
                array[i, j] = j + i * 10;
            }
        }

        return array;
    }
}
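
For completeness, a benchmark class like this is typically driven from a console entry point such as the one below. This runner code is not part of the original answer; it assumes the BenchmarkDotNet package is referenced and the project is built in Release mode:

using BenchmarkDotNet.Running;

public class Program
{
    public static void Main()
    {
        // Discovers the [Benchmark] methods above and runs them for each [Params] value of D.
        BenchmarkRunner.Run<MultidimensionalArrayBenchmark>();
    }
}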

Solution 6 - .Net

Jagged arrays are arrays of class references (to other arrays), up until the leaf arrays, which may be arrays of a primitive type. Hence the memory allocated for each of those inner arrays can be scattered all over the place.

A multi-dimensional array, on the other hand, has its memory allocated in one contiguous lump.
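
As a sketch of that layout (not from the original answer; it requires compiling with unsafe enabled and is only meant to show that double[,] is a single contiguous block):

static unsafe double SumContiguous(double[,] d)
{
    double sum = 0;
    fixed (double* p = &d[0, 0])   // pin the block; valid because the elements are stored contiguously, row-major
    {
        int n = d.Length;          // total element count across both dimensions
        for (int k = 0; k < n; ++k)
            sum += p[k];           // walk the whole rectangle with a single flat pointer
    }
    return sum;
}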

Solution 7 - .Net

I think it has something to do with the fact that jagged arrays are actually arrays of arrays, so there are two levels of indirection to get to the actual data.

Solution 8 - .Net

I'm with everyone else here.

I had a program with a three-dimensional array; let me tell you, when I converted it to a two-dimensional array I saw a huge boost, and then I converted it to a one-dimensional array.

In the end, I think I saw over a 500% improvement in execution time.

The only drawback was the added complexity of working out where everything was in the one-dimensional array, versus the three-dimensional one.

Solution 9 - .Net

I think multi-dimensional arrays are slower because the runtime has to perform two or more bounds checks per access (two for two dimensions, three for three dimensions, and so on).

Solution 10 - .Net

Bounds checking. In your first example, "j" could exceed l2 (as long as the flat index i * l2 + j stayed within the array) provided "i" was less than l1. That would not be legal in the second example, where each index is checked against its own dimension.
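
Relatedly, the flat layout also gives the JIT a chance to drop the bounds check altogether when the loop bound is the array's own Length. A minimal sketch of that pattern (not the original poster's code):

static double SumFlat(double[] d)
{
    double sum = 0;
    // Because the upper bound is d.Length itself, the JIT can typically prove k is in
    // range and elide the per-element bounds check; neither d[i * l2 + j] with separate
    // l1/l2 locals nor d[i, j] fits that pattern.
    for (int k = 0; k < d.Length; ++k)
        sum += d[k];
    return sum;
}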

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Question: Hosam Aly (on Stack Overflow)
Solution 1 - .Net: Jon Skeet
Solution 2 - .Net: JeeBee
Solution 3 - .Net: Cameron
Solution 4 - .Net: Tamas Czinege
Solution 5 - .Net: Joe Huang
Solution 6 - .Net: AnthonyWJones
Solution 7 - .Net: Autodidact
Solution 8 - .Net: Fredou
Solution 9 - .Net: Michael Buen
Solution 10 - .Net: Damien_The_Unbeliever