Why isn't Array a generic type?
C# Problem Overview
Array is declared:
public abstract class Array
    : ICloneable, IList, ICollection, IEnumerable {
I'm wondering why it isn't:
public partial class Array<T>
: ICloneable, IList<T>, ICollection<T>, IEnumerable<T> {
- What would be the issue if it was declared as a generic type?
- If it was a generic type, would we still need the non-generic one, or could it derive from Array<T>? Such as:
public partial class Array : Array<object> {
C# Solutions
Solution 1 - C#
History
> What problems would arise if arrays became a generic type?
Back in C# 1.0 they copied the concept of arrays mainly from Java. Generics did not exist back then, but the creators thought they were smart and copied the broken covariant array semantics that Java arrays have. This means that you can pull off things like this without a compile-time error (you get a run-time error instead):
Mammoth[] mammoths = new Mammoth[10];
Animal[] animals = mammoths; // Covariant conversion
animals[1] = new Giraffe(); // Run-time exception
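The snippet above can be turned into a small runnable sketch. The class names Animal, Mammoth and Giraffe are the placeholder types from the answer, and CovarianceDemo is a name made up for this illustration; the exception type is the one the runtime raises when an array store fails its element type check.

```csharp
using System;

class Animal { }
class Mammoth : Animal { }
class Giraffe : Animal { }

static class CovarianceDemo
{
    // Returns true if storing a Giraffe through the Animal[] view fails at run time.
    public static bool StoreThrows()
    {
        Mammoth[] mammoths = new Mammoth[10];
        Animal[] animals = mammoths;        // Compiles: array covariance
        try
        {
            animals[1] = new Giraffe();     // The runtime checks the array's real element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;                    // The store was rejected at run time
        }
    }

    static void Main()
    {
        Console.WriteLine(StoreThrows());
    }
}
```

The compiler accepts every line; only the element store is rejected, and only when the program runs.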
In C# 2.0 generics were introduced, but no covariant/contravariant generic types. If arrays were made generic, then you couldn't cast Mammoth[] to Animal[], something you could do before (even though it was broken). So making arrays generic would've broken a lot of code.
Only in C# 4.0 were covariant/contravariant generic types for interfaces introduced. This made it possible to fix the broken array covariance once and for all. But again, this would've broken a lot of existing code.
Array<Mammoth> mammoths = new Array<Mammoth>(10);
Array<Animal> animals = mammoths; // Not allowed.
IEnumerable<Animal> animalSequence = mammoths; // Covariant conversion
Arrays implement generic interfaces
> Why don't arrays implement the generic IList<T>, ICollection<T> and IEnumerable<T> interfaces?
Thanks to a runtime trick every array T[] does implement IEnumerable<T>, ICollection<T> and IList<T> automatically (see footnote 1). From the Array class documentation:
> Single-dimensional arrays implement the IList<T>, ICollection<T>, IEnumerable<T>, IReadOnlyList<T> and IReadOnlyCollection<T> generic interfaces. The implementations are provided to arrays at run time, and as a result, the generic interfaces do not appear in the declaration syntax for the Array class.
> Can you use all members of the interfaces implemented by arrays?
No. The documentation continues with this remark:
> The key thing to be aware of when you cast an array to one of these interfaces is that members which add, insert, or remove elements throw NotSupportedException.
That's because (for example) ICollection<T> has an Add method, but you cannot add anything to an array, so the call throws an exception. This is another example of an early design error in the .NET Framework that gets exceptions thrown at you at run time:
ICollection<Mammoth> collection = new Mammoth[10]; // Cast to interface type
collection.Add(new Mammoth()); // Run-time exception
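A minimal runnable sketch of this behaviour, using int[] to keep it short. ArrayInterfaceDemo and its method names are made up for this illustration; the interfaces and the exception type are the ones named in the documentation quoted above.

```csharp
using System;
using System.Collections.Generic;

static class ArrayInterfaceDemo
{
    // Members that do not change the array's size work through the interface.
    public static bool WritesThrough()
    {
        int[] numbers = { 1, 2, 3 };
        IList<int> list = numbers;          // No cast needed: int[] implements IList<int>
        list[0] = 10;                       // Indexer set writes to the same array
        return numbers[0] == 10 && list.Contains(3) && list.IndexOf(2) == 1;
    }

    // Members that would change the size throw NotSupportedException.
    public static bool AddThrows()
    {
        ICollection<int> collection = new int[3];
        try
        {
            collection.Add(42);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(WritesThrough());
        Console.WriteLine(AddThrows());
    }
}
```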
And since ICollection<T> is not covariant (for obvious reasons), you can't do this:
ICollection<Mammoth> mammoths = new Array<Mammoth>(10);
ICollection<Animal> animals = mammoths; // Not allowed
Of course there is now the covariant IReadOnlyCollection<T> interface that is also implemented by arrays under the hood (see footnote 1), but it contains only Count so it has limited uses.
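To illustrate the covariant read-only interface, here is a short sketch. ReadOnlyCovarianceDemo and CountAnimals are hypothetical names for this example; Animal and Mammoth are the placeholder types used throughout the answer. A List<Mammoth> is used alongside an array to show that the conversion comes from the interface's variance, not from array covariance.

```csharp
using System;
using System.Collections.Generic;

class Animal { }
class Mammoth : Animal { }

static class ReadOnlyCovarianceDemo
{
    // Accepts any read-only collection of Animal or of a type derived from Animal.
    public static int CountAnimals(IReadOnlyCollection<Animal> animals)
    {
        return animals.Count;
    }

    static void Main()
    {
        var list = new List<Mammoth> { new Mammoth(), new Mammoth() };
        Mammoth[] array = new Mammoth[3];

        Console.WriteLine(CountAnimals(list));   // List<Mammoth> converts via the covariant interface
        Console.WriteLine(CountAnimals(array));  // So does Mammoth[]
    }
}
```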
The base class Array
> If arrays were generic, would we still need the non-generic Array class?
In the early days we did. All arrays implement the non-generic IList, ICollection and IEnumerable interfaces through their base class Array. This was the only reasonable way to give all arrays specific methods and interfaces, and is the primary use of the Array base class. You see the same choice for enums: they are value types but inherit members from Enum; and for delegates, which inherit from MulticastDelegate.
> Could the non-generic base class Array be removed now that generics are supported?
Yes, the methods and interfaces shared by all arrays could be defined on the generic Array<T> class if it ever came into existence. And then you could write, for example, Copy<T>(T[] source, T[] destination) instead of Copy(Array source, Array destination) with the added benefit of some type safety.
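As a rough sketch of what such a type-safe Copy<T> could look like today, the helper below wraps the existing Array.Copy. ArrayHelpers is a hypothetical class for this illustration and not part of .NET, and copying only as many elements as fit in the shorter array is an assumption made here.

```csharp
using System;

static class ArrayHelpers
{
    // Hypothetical generic wrapper: both arrays must share the element type T,
    // so a mismatch is reported by the compiler instead of at run time.
    public static void Copy<T>(T[] source, T[] destination)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (destination == null) throw new ArgumentNullException(nameof(destination));

        // Assumption for this sketch: copy as many elements as fit.
        int length = Math.Min(source.Length, destination.Length);
        Array.Copy(source, destination, length);
    }

    static void Main()
    {
        int[] source = { 1, 2, 3 };
        int[] destination = new int[3];
        Copy(source, destination);
        Console.WriteLine(string.Join(", ", destination));

        // Copy(source, new string[3]);  // Would not compile: no single T fits both arrays
    }
}
```

With the non-generic Array.Copy(Array, Array, int), the commented-out call would compile and fail only when executed.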
However, from an Object-Oriented Programming point of view it is nice to have a common non-generic base class Array that can be used to refer to any array regardless of the type of its elements. Just like how IEnumerable<T> inherits from IEnumerable (which is still used in some LINQ methods).
> Could the Array base class derive from Array<object>?
No, that would create a circular dependency: Array<T> : Array : Array<object> : Array : ... Also, that would imply you could store any object in an array (after all, all arrays would ultimately inherit from type Array<object>).
The future
> Could the new generic array type Array<T> be added without impacting existing code too much?
No. While the syntax could be made to fit, the existing array covariance could not be used.
An array is a special type in .NET. It even has its own instructions in the Common Intermediate Language. If the .NET and C# designers ever decide to go down this road, they could make the T[] syntax syntactic sugar for Array<T> (just like how T? is syntactic sugar for Nullable<T>), and still use the special instructions and support that allocate arrays contiguously in memory.
However, you would lose the ability to cast arrays of Mammoth[] to one of their base types Animal[], similar to how you can't cast List<Mammoth> to List<Animal>. But array covariance is broken anyway, and there are better alternatives.
> Alternatives to array covariance?
All arrays implement IList<T>. If the IList<T> interface were made into a proper covariant interface then you could cast any array Array<Mammoth> (or any list for that matter) to an IList<Animal>. However, this requires the IList<T> interface to be rewritten to remove all methods that might change the underlying array:
interface IList<out T> : ICollection<T>
{
T this[int index] { get; }
int IndexOf(object value);
}
interface ICollection<out T> : IEnumerable<T>
{
int Count { get; }
bool Contains(object value);
}
(Note that the types of parameters in input positions cannot be T as this would break covariance. However, object is good enough for Contains and IndexOf, which would just report "not found" (false for Contains, -1 for IndexOf) when passed an object of an incorrect type. And collections implementing these interfaces can provide their own generic IndexOf(T value) and Contains(T value).)
Then you could do this:
Array<Mammoth> mammoths = new Array<Mammoth>(10);
IList<Animal> animals = mammoths; // Covariant conversion
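The idea can be tried out with today's compiler by giving the interfaces different names so they do not clash with the built-in ones. Everything below (ICovariantCollection<T>, ICovariantList<T>, FixedArray<T>) is a hypothetical sketch for illustration, not the author's library and not part of .NET.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class Animal { public virtual string Name => "animal"; }
class Mammoth : Animal { public override string Name => "mammoth"; }

// Hypothetical read-only interfaces: T appears only in output positions.
interface ICovariantCollection<out T> : IEnumerable<T>
{
    int Count { get; }
    bool Contains(object value);
}

interface ICovariantList<out T> : ICovariantCollection<T>
{
    T this[int index] { get; }
    int IndexOf(object value);
}

// Hypothetical fixed-size wrapper standing in for Array<T>.
sealed class FixedArray<T> : ICovariantList<T>
{
    private readonly T[] items;

    public FixedArray(int length) { items = new T[length]; }

    public int Count => items.Length;

    // The setter exists only on the class, not on the covariant interface.
    public T this[int index]
    {
        get => items[index];
        set => items[index] = value;
    }

    // An object of the wrong type is simply "not found".
    public int IndexOf(object value) => value is T typed ? Array.IndexOf(items, typed) : -1;

    public bool Contains(object value) => IndexOf(value) >= 0;

    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)items).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

static class Program
{
    static void Main()
    {
        var mammoths = new FixedArray<Mammoth>(2);
        mammoths[0] = new Mammoth();

        ICovariantList<Animal> animals = mammoths;          // Covariant conversion
        Console.WriteLine(animals[0].Name);
        Console.WriteLine(animals.Contains(new Animal()));  // Wrong element type: not found
    }
}
```

Because the interface view has no setter, nothing can store a Giraffe through it, so no run-time element type check is needed on this path.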
There is even a small performance improvement because the runtime would not have to check whether an assigned value is type compatible with the real type of the array's elements when setting the value of an element of an array.
My stab at it
I took a stab at how such an Array<T> type would work if it were implemented in C# and .NET, combined with the real covariant IList<T> and ICollection<T> interfaces described above, and it works quite nicely. I also added the invariant IMutableList<T> and IMutableCollection<T> interfaces to provide the mutation methods that my new IList<T> and ICollection<T> interfaces lack.
I built a simple collection library around it, and you can download the source code and compiled binaries from BitBucket, or install the NuGet package:
> M42.Collections – Specialized collections with more functionality, features and ease-of-use than the built-in .NET collection classes.
1) An array T[] in .NET 4.5 implements through its base class Array: ICloneable, IList, ICollection, IEnumerable, IStructuralComparable, IStructuralEquatable; and silently through the runtime: IList<T>, ICollection<T>, IEnumerable<T>, IReadOnlyList<T>, and IReadOnlyCollection<T>.
Solution 2 - C#
[Update, new insights, it felt something was missing until now]
Regarding the earlier answer:
- Arrays are covariant like other types can be. You can implement things like 'object[] foo = new string[5];' with covariance, so that is not the reason.
- Compatibility is probably the reason for not reconsidering the design, but I argue this is also not the correct answer.
However, the other reason I can think of is because an array is the 'basic type' for a linear set of elements in memory. I've been thinking about using Array<T>, which is where you might also wonder why T is an Object and why this 'Object' even exists? In this scenario T[] is just what I consider another syntax for Array<T> which is covariant with Array. Since the types actually differ, I consider the two cases similar.
Note that both a basic Object and a basic Array are not requirements for an OO language. C++ is the perfect example for this. The caveat of not having a basic type for these basic constructs is not being able to work with arrays or objects using reflection. For objects you're used to making Foo
Solution 3 - C#
Compatibility. Array is a historic type that goes back to the time when there were no generics.
Today it would make sense to have Array, then Array<T>, then the specific class ;)
Solution 4 - C#
>Thus I'd like to know why it is not:
The reason is that generics were not present in the first version of C#.
> But I cannot figure out what would be the problem myself.
The problem is that it would break a huge amount of code that uses the Array class. C# doesn't support multiple inheritance, so lines like this
Array ary = Array.CreateInstance(typeof(int), 10); // Array.Copy returns void, so a factory method is used here
int[] values = (int[])ary;
would be broken.
If MS were making C# and .NET all over again from scratch, then there probably would be no problem in making Array a generic class, but that is not the reality.
Solution 5 - C#
In addition to the other issues people have mentioned, trying to add a generic Array<T> would pose a few other difficulties:
- Even if today's covariance features had existed from the moment generics were introduced, they wouldn't have been sufficient for arrays. A routine which is designed to sort a Car[] will be able to sort a Buick[], even if it has to copy elements from the array into elements of type Car and then copy them back. The copying of the element from type Car back to a Buick[] isn't really type-safe, but it's useful. One could define a covariant single-dimensional-array interface in such a way as to make sorting possible [e.g. by including a Swap(int firstIndex, int secondIndex) method], but it would be difficult to make something that's as flexible as arrays are.
- While an Array<T> type might work well for a T[], there would be no means within the generic type system to define a family that would include T[], T[,], T[,,], T[,,,], etc. for an arbitrary number of subscripts.
- There is no means in .NET to express the notion that two types should be considered identical, such that a variable of type T1 can be copied to one of type T2, and vice versa, with both variables holding references to the same object. Someone using an Array<T> type would probably want to be able to pass instances to code which expects T[], and accept instances from code which uses T[]. If old-style arrays couldn't be passed to and from code that uses the new style, then the new-style arrays would be more of an obstacle than a feature.
There might be ways of jinxing the type system to allow for a type Array<T> that behaved as it should, but such a type would behave in many ways that were totally different from other generic types, and since there is already a type which implements the desired behavior (i.e. T[]), it's not clear what benefits would accrue from defining another.
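The sorting point from the first bullet can be sketched as follows. Car and Buick are the placeholder types from this answer; the Year field, SortDemo and the insertion sort are assumptions added for this illustration.

```csharp
using System;

class Car { public int Year; }
class Buick : Car { }

static class SortDemo
{
    // Insertion sort written against Car[]; it knows nothing about Buick.
    public static void SortByYear(Car[] cars)
    {
        for (int i = 1; i < cars.Length; i++)
        {
            Car current = cars[i];          // Copy the element out into a Car variable
            int j = i - 1;
            while (j >= 0 && cars[j].Year > current.Year)
            {
                cars[j + 1] = cars[j];
                j--;
            }
            // Copy back: the compiler cannot prove this is safe, but the run-time
            // check passes because the object came from this very array.
            cars[j + 1] = current;
        }
    }

    static void Main()
    {
        Buick[] buicks =
        {
            new Buick { Year = 1998 },
            new Buick { Year = 1975 },
            new Buick { Year = 1986 },
        };

        SortByYear(buicks);                 // Buick[] passed as Car[] via array covariance
        foreach (Buick buick in buicks)
        {
            Console.WriteLine(buick.Year);
        }
    }
}
```

A read-only covariant interface could not support this routine, since it has no way to write elements back.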
Solution 6 - C#
As everyone says, the original Array is non-generic because there were no generics when it came into existence in v1. Speculation below...
To make Array generic (which would make sense now) you could either:
- Keep the existing Array and add a generic version. This is nice, but most usages of arrays involve growing them over time, which is most likely the reason that a better implementation of the same concept, List<T>, was implemented instead. At this point, adding a generic version of a "sequential list of elements that does not grow" does not look very appealing.
- Remove the non-generic Array and replace it with a generic Array<T> implementation with the same interface. Now you have to make code compiled for older versions work with the new type instead of the existing Array type. While it would be possible (though most likely hard) for framework code to support such a migration, there is always a lot of code written by other people.
As Array is a very basic type, pretty much every piece of existing code (including custom code that uses reflection and marshalling with native code and COM) uses it. As a result, the price of even a tiny incompatibility between versions (1.x -> 2.x of the .NET Framework) would be very high.
So as a result, the Array type is there to stay forever. We now have List<T> as a generic equivalent to be used.
Solution 7 - C#
Maybe I'm missing something, but unless the array instance is cast to or used as an ICollection, IEnumerable, etc., you don't gain anything with an array of T.
Arrays are fast and are already type safe and don't incur any boxing/unboxing overhead.