Performance: type derived from generic

C#.NetPerformanceGenericsClr

C# Problem Overview


I've encountered with one performance problem that I can't quite understand. I know how to fix it but I don't understand Why that happens. It's just for fun!
Let's talk code. I simplified the code as much as I could to reproduce the issue.
Suppose we have a generic class. It has an empty list inside and does something with T in constructor. It has Run method that calls an IEnumerable<T> method on the list, e.g. Any().

public class BaseClass<T>
{
    private List<T> _list = new List<T>();

    public BaseClass()
    {
        Enumerable.Empty<T>();
        // or Enumerable.Repeat(new T(), 10);
        // or even new T();
        // or foreach (var item in _list) {}
    }

    public void Run()
    {
        for (var i = 0; i < 8000000; i++)
        {
            if (_list.Any())
            // or if (_list.Count() > 0)
            // or if (_list.FirstOrDefault() != null)
            // or if (_list.SingleOrDefault() != null)
            // or other IEnumerable<T> method
            {
                return;
            }
        }
    }
}

Then we have a derived class which is empty:

public class DerivedClass : BaseClass<object>
{
}

Let's measure the performance of running ClassBase<T>.Run method from both classes. Accessing from derived type is 4 times slower that from base class. And I can't understand why that happens. Compiled in Release mode, result is the same with warm up. It happens on .NET 4.5 only.

public class Program
{
    public static void Main()
    {
        Measure(new DerivedClass());
        Measure(new BaseClass<object>());
    }

    private static void Measure(BaseClass<object> baseClass)
    {
        var sw = Stopwatch.StartNew();
        baseClass.Run();
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}

Full listing on gist

C# Solutions


Solution 1 - C#

Update:
There's an answer from the CLR team on Microsoft Connect

> It is related to dictionary lookups in shared generics code. The heuristic in runtime and JIT do not work well for this particular test. We will take a look what can be done about it. > >In the meantime, you can workaround it by adding two dummy methods to the BaseClass (do not even need to be called). It will cause the heuristic to work as one would expect.

Original:
That's JIT fail.

Can be fixed by this crazy thing:

    public class BaseClass<T>
    {
        private List<T> _list = new List<T>();

        public BaseClass()
        {
            Enumerable.Empty<T>();
            // or Enumerable.Repeat(new T(), 10);
            // or even new T();
            // or foreach (var item in _list) {}
        }

        public void Run()
        {
            for (var i = 0; i < 8000000; i++)
            {
                if (_list.Any())
                {
                    return;
                }
            }
        }

        public void Run2()
        {
            for (var i = 0; i < 8000000; i++)
            {
                if (_list.Any())
                {
                    return;
                }
            }
        }

        public void Run3()
        {
            for (var i = 0; i < 8000000; i++)
            {
                if (_list.Any())
                {
                    return;
                }
            }
        }
    }

Note that Run2()/Run3() are not called from anywhere. But if you comment out Run2 or Run3 methods - you'll get performance penalty as before.

There's something related to stack alignment or to the size of method table, I guess.

P.S. You can replace

 Enumerable.Empty<T>();
 // with
 var x = new Func<IEnumerable<T>>(Enumerable.Empty<T>);

still the same bug.

Solution 2 - C#

After some experimentation, I found that Enumerable.Empty<T> is always slow when T is a class type; if it is a value type it is faster, but dependent on the struct size. I tested object, string, int, PointF, RectangleF, DateTime, Guid.

Looking at how it is implemented, I tried different alternatives, and found some that works fast.

Enumerable.Empty<T> relies on the internal class EmptyEnumerable<TElement>'s Instance static property. That property does little things:

  • Checks if a private static volatile field is null.

  • Assigns an empty array to the field once (only if empty).

  • Returns the value of the field.

Then, what Enumerable.Empty<T> is really doing is only return an empty array of T.

Trying different approaches, I found that the slowness is caused by both the property and the volatile modifier.

Adopting a static field initialized to T[0] instead of Enumerable.Empty<T> like

public static readonly T[] EmptyArray = new T[0];

the problem disappeared. Note that the readonly modifier is not determinant. Having the same static field declared with volatile or accessed through a property causes the problem.

Regards, Daniele.

Solution 3 - C#

It seems there is an CLR optimizator issue. Turn off the "Optimize Code" on the Build tab and try to run your test again.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAlexandr NikitinView Question on Stackoverflow
Solution 1 - C#SinixView Answer on Stackoverflow
Solution 2 - C#Rubidium 37View Answer on Stackoverflow
Solution 3 - C#Eugene BushtyrevView Answer on Stackoverflow