How, when and where are generic methods made concrete?

C#, .NET, Generics

C# Problem Overview


This question got me wondering about where the concrete implementation of a generic method actually comes into existence. I've tried Google but haven't come up with the right search terms.

If we take this simple example:

class Program
{
    public static T GetDefault<T>()
    {
        return default(T);
    }

    static void Main(string[] args)
    {
        int i = GetDefault<int>();
        double d = GetDefault<double>();
        string s = GetDefault<string>();
    }
}

In my head I've always assumed that at some point this results in the three necessary concrete implementations, one per type argument. In naive pseudo-mangling, the logical result would look like this, with each specific type producing the correct stack allocations and so on:

class Program
{
    static void Main(string[] args)
    {
        int i = GetDefaultSystemInt32();
        double d = GetDefaultSystemFloat64();
        string s = GetDefaultSystemString();
    }

    static int GetDefaultSystemInt32()
    {
        int i = 0;
        return i;
    }
    static double GetDefaultSystemFloat64()
    {
        double d = 0.0;
        return d;
    }
    static string GetDefaultSystemString()
    {
        string s = null;
        return s;
    }
}

Looking at the IL for the generic program, I see that it is still expressed in terms of generic types:

.method public hidebysig static !!T  GetDefault<T>() cil managed
{
  // Code size       15 (0xf)
  .maxstack  1
  .locals init ([0] !!T CS$1$0000,
           [1] !!T CS$0$0001)
  IL_0000:  nop
  IL_0001:  ldloca.s   CS$0$0001
  IL_0003:  initobj    !!T
  IL_0009:  ldloc.1
  IL_000a:  stloc.0
  IL_000b:  br.s       IL_000d
  IL_000d:  ldloc.0
  IL_000e:  ret
} // end of method Program::GetDefault

So how and at what point is it decided that an int, and then a double and then a string have to be allocated on the stack and returned to the caller? Is this an operation of the JIT process? Am I looking at this in the completely wrong light?

C# Solutions


Solution 1 - C#

In C#, generic types and methods are supported by the runtime itself. The C# compiler does not need to create a concrete version of a generic method.

The actual "concrete" generic method is created at runtime by the JIT, and does not exist in the IL. The first time a generic method is used with a type, the JIT will see if it's been created, and if not, construct the appropriate method for that generic type.
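One way to see that the closed method is built by the runtime rather than stored in the assembly is to choose the type argument at run time through reflection. The following is a minimal sketch, assuming a JIT-based runtime (ahead-of-time compiled deployments may not support this). It reuses GetDefault<T> from the question; the class name RuntimeInstantiationDemo and the Instantiate helper are illustrative names, not part of the original code.

```csharp
using System;
using System.Reflection;

static class RuntimeInstantiationDemo
{
    // Same generic method as in the question
    public static T GetDefault<T>()
    {
        return default(T);
    }

    // Hypothetical helper: closes GetDefault<T> over a type that is only known at run time
    public static object Instantiate(Type typeArgument)
    {
        MethodInfo open = typeof(RuntimeInstantiationDemo).GetMethod("GetDefault");
        MethodInfo closed = open.MakeGenericMethod(typeArgument);
        // Static method with no parameters: no target object, no arguments
        return closed.Invoke(null, null);
    }

    static void Main()
    {
        Console.WriteLine(Instantiate(typeof(int)));            // 0
        Console.WriteLine(Instantiate(typeof(double)));         // 0
        Console.WriteLine(Instantiate(typeof(string)) == null); // True
        // The compiler never saw GetDefault<DateTime>; the runtime builds it on demand
        Console.WriteLine(Instantiate(typeof(DateTime)));
    }
}
```

Nothing in the compiled assembly mentions GetDefault<DateTime>, yet the call succeeds, which would not be possible if concrete versions had to be emitted at compile time.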

This is one of the fundamental differences between generics and things like templates in C++. It's also the main reason for many of the limitations of generics: since the compiler isn't actually creating the runtime implementation for each type, the operations allowed on T are restricted by compile-time constraints, which makes generics a bit more limiting than C++ templates in terms of potential use cases. However, because generics are supported in the runtime itself, generic types can be created in libraries and consumed from them in ways that aren't possible in C++ and other template implementations that are instantiated at compile time.
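To illustrate the point about constraints, here is a small sketch. The class name ConstraintDemo and the Max method are made up for this example; IComparable<T> is the standard interface from the System namespace.

```csharp
using System;

static class ConstraintDemo
{
    // The constraint tells the compiler that every T has CompareTo(T),
    // so the method body is checked once, for every T it may later be used with
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }

    static void Main()
    {
        Console.WriteLine(Max(3, 7));            // 7
        Console.WriteLine(Max("apple", "pear")); // pear
        // Without the "where" clause, a.CompareTo(b) would not compile,
        // even if every type actually passed in happened to have such a method.
    }
}
```

A C++ template would accept any type for which the expression happens to compile at instantiation time; the C# version has to state the requirement up front because the body is compiled before any concrete T is known.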

Solution 2 - C#

The actual machine code for a generic method is created, as always, when the method is jitted. At that point, the jitter first checks whether a suitable candidate was jitted before. That is very commonly the case: the code for a method whose concrete runtime type T is a reference type needs to be generated only once and is suitable for every possible reference type T. The constraints on T, previously checked by the C# compiler, ensure that this machine code is always valid.

Additional copies may be generated for T's that are value types. Their machine code is different because T values are not simple pointers anymore.

So yes, in your case you'll end up with three distinct methods. The <string> version would be useable for any reference type but you don't have others. And the <int> and <double> versions fit the "T's that are value types" category.
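The split described in this answer can be sketched as follows. This is illustrative only: the class name SharingDemo and the CodeCategory method are hypothetical, and the code simply applies the rule stated above using Type.IsValueType; it does not inspect what the jitter actually generated.

```csharp
using System;

static class SharingDemo
{
    // Applies the rule from the answer: value types may get their own machine code,
    // reference types can reuse a single shared body
    public static string CodeCategory<T>()
    {
        return typeof(T).IsValueType
            ? typeof(T).Name + ": value type, may get its own machine code"
            : typeof(T).Name + ": reference type, can share one machine code body";
    }

    static void Main()
    {
        Console.WriteLine(CodeCategory<int>());
        Console.WriteLine(CodeCategory<double>());
        Console.WriteLine(CodeCategory<string>());
        // A second reference type would fall into the same category as string
        Console.WriteLine(CodeCategory<object>());
    }
}
```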

It is otherwise an excellent example, because the return values of these methods are passed back to the caller differently. With the x64 jitter, the string version returns its value in the RAX register, like any returned pointer value; the int version returns in the EAX register; and the double version returns in the XMM0 register.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | dkackman | View Question on Stackoverflow
Solution 1 - C# | Reed Copsey | View Answer on Stackoverflow
Solution 2 - C# | Hans Passant | View Answer on Stackoverflow