Union vs. Concat in LINQ
C# Problem Overview
I have a question about Union and Concat.
var a1 = (new[] { 1, 2 }).Union(new[] { 1, 2 }); // O/P : 1 2
var a2 = (new[] { 1, 2 }).Concat(new[] { 1, 2 }); // O/P : 1 2 1 2
var a3 = (new[] { "1", "2" }).Union(new[] { "1", "2" }); // O/P : "1" "2"
var a4 = (new[] { "1", "2" }).Concat(new[] { "1", "2" }); // O/P : "1" "2" "1" "2"
The above results are expected, but in the case of List<T> I am getting the same result from both Union and Concat.
class X
{
    public int ID { get; set; }
}
class X1 : X
{
    public int ID1 { get; set; }
}
class X2 : X
{
    public int ID2 { get; set; }
}
var lstX1 = new List<X1> { new X1 { ID = 10, ID1 = 10 }, new X1 { ID = 10, ID1 = 10 } };
var lstX2 = new List<X2> { new X2 { ID = 10, ID2 = 10 }, new X2 { ID = 10, ID2 = 10 } };
var a5 = lstX1.Cast<X>().Union(lstX2.Cast<X>()); // O/P : a5.Count() = 4
var a6 = lstX1.Cast<X>().Concat(lstX2.Cast<X>()); // O/P : a6.Count() = 4
Both behave the same in the case of List<T>.
Any suggestions, please?
C# Solutions
Solution 1 - C#
Union returns distinct values. By default it compares items using the type's default equality, which for a class that does not override Equals means comparing references. Your items are all separate objects, so they are all considered different. Casting to the base type X does not change the reference.
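To make the default behavior concrete, here is a minimal sketch. The `Item` class and the `ReferenceEqualityDemo` wrapper are made-up names for illustration; `Item` deliberately does not override Equals or GetHashCode.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration; it relies on the default (reference) equality.
public class Item
{
    public int ID { get; set; }
}

public static class ReferenceEqualityDemo
{
    // Two separate objects with equal property values: Union keeps both.
    public static int CountSeparateInstances()
    {
        var first = new List<Item> { new Item { ID = 10 } };
        var second = new List<Item> { new Item { ID = 10 } };
        return first.Union(second).Count();
    }

    // The very same object in both lists: Union keeps it once.
    public static int CountSharedInstance()
    {
        var shared = new Item { ID = 10 };
        var first = new List<Item> { shared };
        var second = new List<Item> { shared };
        return first.Union(second).Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountSeparateInstances()); // 2
        Console.WriteLine(CountSharedInstance());    // 1
    }
}
```

The only thing that differs between the two methods is whether the lists hold one object or two, which is exactly what reference comparison looks at.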
If you override Equals and GetHashCode (both are used to select distinct items), items will no longer be compared by reference:
class X
{
    public int ID { get; set; }
    public override bool Equals(object obj)
    {
        X x = obj as X;
        if (x == null)
            return false;
        return x.ID == ID;
    }
    public override int GetHashCode()
    {
        return ID.GetHashCode();
    }
}
With this override, items are compared by ID alone. If several items share the same ID, you will see the difference between Union and Concat:
var lstX1 = new List<X1> { new X1 { ID = 1, ID1 = 10 },
new X1 { ID = 10, ID1 = 100 } };
var lstX2 = new List<X2> { new X2 { ID = 1, ID2 = 20 }, // ID changed here
new X2 { ID = 20, ID2 = 200 } };
var a5 = lstX1.Cast<X>().Union(lstX2.Cast<X>()); // 3 distinct items
var a6 = lstX1.Cast<X>().Concat(lstX2.Cast<X>()); // 4
Your initial samples work because integers are value types and are compared by value, and strings, although reference types, override Equals and GetHashCode to compare by content.
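The string samples in the question de-duplicate even though string is a reference type, because string compares by content. A small sketch of that, assuming a made-up `StringUnionDemo` wrapper; the strings are built at runtime so that they are two distinct objects rather than one shared literal:

```csharp
using System;
using System.Linq;

public static class StringUnionDemo
{
    // Build two equal strings at runtime so they are separate objects.
    private static string MakeOne() => new string(new[] { '1' });

    public static bool AreSameReference()
    {
        return object.ReferenceEquals(MakeOne(), MakeOne());
    }

    public static int UnionCount()
    {
        var left = new[] { MakeOne() };
        var right = new[] { MakeOne() };
        // string overrides Equals/GetHashCode, so Union compares contents.
        return left.Union(right).Count();
    }

    public static void Main()
    {
        Console.WriteLine(AreSameReference()); // False
        Console.WriteLine(UnionCount());       // 1
    }
}
```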
Solution 2 - C#
Concat literally returns the items from the first sequence followed by the items from the second sequence. If you use Concat on two 2-item sequences, you will always get a 4-item sequence.
Union is essentially Concat followed by Distinct.
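A short sketch of that relationship, using made-up helper names (`ViaUnion`, `ViaConcatDistinct`) and sample arrays that are not from the question. It only illustrates that both routes produce the same set of elements; it is not a claim about how Union is implemented internally.

```csharp
using System;
using System.Linq;

public static class UnionVsConcatDemo
{
    public static int[] ViaUnion(int[] a, int[] b)
    {
        return a.Union(b).ToArray();
    }

    public static int[] ViaConcatDistinct(int[] a, int[] b)
    {
        return a.Concat(b).Distinct().ToArray();
    }

    public static void Main()
    {
        var a = new[] { 1, 2, 2 };
        var b = new[] { 2, 3, 1 };

        Console.WriteLine(string.Join(" ", a.Concat(b)));   // 1 2 2 2 3 1
        Console.WriteLine(string.Join(" ", ViaUnion(a, b))); // 1 2 3

        // Both routes contain exactly the same elements.
        var same = ViaUnion(a, b).OrderBy(n => n)
            .SequenceEqual(ViaConcatDistinct(a, b).OrderBy(n => n));
        Console.WriteLine(same); // True
    }
}
```

Note that Union also removes duplicates that occur within a single input sequence, as the repeated 2 in the first array shows.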
In your first two cases (the int and string arrays), Union gives you 2-item sequences because, between them, each pair of input sequences has exactly two distinct items.
In your third case, you end up with a 4-item sequence because all four items in your two input sequences are distinct objects.
Solution 3 - C#
Union and Concat behave the same here because Union cannot detect duplicates of X unless X overrides Equals and GetHashCode or you pass a custom IEqualityComparer<X>. By default it only checks whether both items are the same reference.
public class XComparer : IEqualityComparer<X>
{
    public bool Equals(X x1, X x2)
    {
        if (object.ReferenceEquals(x1, x2))
            return true;
        if (x1 == null || x2 == null)
            return false;
        return x1.ID.Equals(x2.ID);
    }
    public int GetHashCode(X x)
    {
        return x.ID.GetHashCode();
    }
}
Now you can pass it to the overload of Union that takes a comparer:
var comparer = new XComparer();
a5 = lstX1.Cast<X>().Union(lstX2.Cast<X>(), comparer);
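Putting the pieces together, here is a self-contained sketch that runs the question's data through Union with and without the comparer. The `ComparerDemo` wrapper and the tuple field names are assumptions added for illustration; the classes and comparer follow the ones shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class X { public int ID { get; set; } }
public class X1 : X { public int ID1 { get; set; } }
public class X2 : X { public int ID2 { get; set; } }

// Two X instances are treated as equal when their ID matches.
public class XComparer : IEqualityComparer<X>
{
    public bool Equals(X x1, X x2)
    {
        if (object.ReferenceEquals(x1, x2)) return true;
        if (x1 == null || x2 == null) return false;
        return x1.ID == x2.ID;
    }

    public int GetHashCode(X x)
    {
        return x.ID.GetHashCode();
    }
}

public static class ComparerDemo
{
    public static (int UnionDefault, int UnionById, int Concat) Counts()
    {
        // Same data as in the question: every item has ID = 10.
        var lstX1 = new List<X1> { new X1 { ID = 10, ID1 = 10 }, new X1 { ID = 10, ID1 = 10 } };
        var lstX2 = new List<X2> { new X2 { ID = 10, ID2 = 10 }, new X2 { ID = 10, ID2 = 10 } };
        var left = lstX1.Cast<X>();
        var right = lstX2.Cast<X>();

        return (left.Union(right).Count(),                  // reference comparison
                left.Union(right, new XComparer()).Count(), // comparison by ID
                left.Concat(right).Count());                // no de-duplication
    }

    public static void Main()
    {
        var counts = Counts();
        Console.WriteLine(counts.UnionDefault); // 4
        Console.WriteLine(counts.UnionById);    // 1
        Console.WriteLine(counts.Concat);       // 4
    }
}
```

Passing a comparer leaves X itself unchanged, which can be preferable when equality by ID is only wanted for this one query.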