Union vs Concat in LINQ

C#, LINQ

C# Problem Overview


I have a question on Union and Concat.

var a1 = (new[] { 1, 2 }).Union(new[] { 1, 2 });             // O/P : 1 2
var a2 = (new[] { 1, 2 }).Concat(new[] { 1, 2 });            // O/P : 1 2 1 2

var a3 = (new[] { "1", "2" }).Union(new[] { "1", "2" });     // O/P : "1" "2"
var a4 = (new[] { "1", "2" }).Concat(new[] { "1", "2" });    // O/P : "1" "2" "1" "2"

The above results are expected, but in the case of List<T> I get the same result from both Union and Concat.

class X
{
    public int ID { get; set; }
}

class X1 : X
{
    public int ID1 { get; set; }
}

class X2 : X
{
    public int ID2 { get; set; }
}

var lstX1 = new List<X1> { new X1 { ID = 10, ID1 = 10 }, new X1 { ID = 10, ID1 = 10 } };
var lstX2 = new List<X2> { new X2 { ID = 10, ID2 = 10 }, new X2 { ID = 10, ID2 = 10 } };
        
var a5 = lstX1.Cast<X>().Union(lstX2.Cast<X>());     // O/P : a5.Count() = 4
var a6 = lstX1.Cast<X>().Concat(lstX2.Cast<X>());    // O/P : a6.Count() = 4

Both behave the same in the case of List<T>.

Any suggestions please?

C# Solutions


Solution 1 - C#

Union returns distinct values. By default it uses the default equality comparer, which for a class that does not override Equals and GetHashCode compares references. Your items are all separate instances, so they are all considered different. Casting to the base type X does not change the reference.

If you override Equals and GetHashCode (both are used to select distinct items), the items are no longer compared by reference:

class X
{
    public int ID { get; set; }

    public override bool Equals(object obj)
    {
        X x = obj as X;
        if (x == null)
            return false;
        return x.ID == ID;
    }

    public override int GetHashCode()
    {
        return ID.GetHashCode();
    }
}

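With this override, items are compared by ID instead of by reference, so Union and Concat differ whenever two items share an ID. (In the question as posted every item has ID = 10, so Union would then return a single item.) In the following sample only one ID appears in both lists, which makes the difference easy to see: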

var lstX1 = new List<X1> { new X1 { ID = 1, ID1 = 10 }, 
                           new X1 { ID = 10, ID1 = 100 } };
var lstX2 = new List<X2> { new X2 { ID = 1, ID2 = 20 }, // ID changed here
                           new X2 { ID = 20, ID2 = 200 } };

var a5 = lstX1.Cast<X>().Union(lstX2.Cast<X>());  // 3 distinct items
var a6 = lstX1.Cast<X>().Concat(lstX2.Cast<X>()); // 4

Your initial samples work because integers are value types and are compared by value, and because string overrides Equals to compare contents rather than references.
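The contrast between value comparison and reference comparison can be sketched as follows. This is a minimal illustration, not part of the original answer: the Item class and the DefaultEqualityDemo helper are hypothetical names, and Item deliberately has no Equals/GetHashCode override.

```csharp
using System;
using System.Linq;

// Hypothetical type for illustration: it does NOT override Equals/GetHashCode,
// so the default comparer falls back to reference equality.
class Item
{
    public int ID { get; set; }
}

static class DefaultEqualityDemo
{
    // int is a value type: equal values are duplicates.
    public static int UnionInts() =>
        new[] { 1, 2 }.Union(new[] { 1, 2 }).Count();

    // string overrides Equals: equal contents are duplicates.
    public static int UnionStrings() =>
        new[] { "1", "2" }.Union(new[] { "1", "2" }).Count();

    // Four separate Item instances: no two share a reference,
    // so Union removes nothing even though every ID is the same.
    public static int UnionItems()
    {
        var first = new[] { new Item { ID = 10 }, new Item { ID = 10 } };
        var second = new[] { new Item { ID = 10 }, new Item { ID = 10 } };
        return first.Union(second).Count();
    }

    static void Main()
    {
        Console.WriteLine(UnionInts());    // 2
        Console.WriteLine(UnionStrings()); // 2
        Console.WriteLine(UnionItems());   // 4
    }
}
```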

Solution 2 - C#

Concat literally returns the items from the first sequence followed by the items from the second sequence. If you use Concat on two 2-item sequences, you will always get a 4-item sequence.

Union is essentially Concat followed by Distinct.

In your first two cases, you end up with 2-item sequences because, between them, each pair of input sequences has exactly two distinct items.

In your third case, you end up with a 4-item sequence because all four items in your two input sequences are distinct.
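The relationship between Union, Concat and Distinct described above can be checked with a small sketch. The UnionVsConcatDemo class and its sample arrays are made up for illustration and are not taken from the question.

```csharp
using System;
using System.Linq;

static class UnionVsConcatDemo
{
    // Union: elements of both sequences with duplicates removed.
    public static int[] ViaUnion(int[] first, int[] second) =>
        first.Union(second).ToArray();

    // Concat followed by Distinct: should give the same elements as Union.
    public static int[] ViaConcatDistinct(int[] first, int[] second) =>
        first.Concat(second).Distinct().ToArray();

    static void Main()
    {
        var first = new[] { 1, 2, 2 };
        var second = new[] { 2, 3 };

        // Concat keeps every element, including duplicates.
        Console.WriteLine(string.Join(" ", first.Concat(second)));             // 1 2 2 2 3
        Console.WriteLine(string.Join(" ", ViaUnion(first, second)));          // 1 2 3
        Console.WriteLine(string.Join(" ", ViaConcatDistinct(first, second))); // 1 2 3
    }
}
```

Note that Concat always yields first.Length + second.Length items, while the length of the Union result depends on how the elements compare for equality.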

Solution 3 - C#

Union and Concat behave the same here because Union cannot detect duplicates among your X instances without a custom IEqualityComparer<X> (or an Equals/GetHashCode override on X). By default it only checks whether both items are the same reference.

public class XComparer: IEqualityComparer<X>
{
    public bool Equals(X x1, X x2)
    {
        if (object.ReferenceEquals(x1, x2))
            return true;
        if (x1 == null || x2 == null)
            return false;
        return x1.ID.Equals(x2.ID);
    }

    public int GetHashCode(X x)
    {
        return x.ID.GetHashCode();
    }
}

Now you can use it in the overload of Union:

var comparer = new XComparer();
a5 = lstX1.Cast<X>().Union(lstX2.Cast<X>(), comparer);
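For reference, here is a self-contained sketch that puts the comparer together with the classes from the question. The sample data and the ComparerDemo helper are assumptions chosen for illustration (ID 1 appears once in each list), not values from the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class X { public int ID { get; set; } }
class X1 : X { public int ID1 { get; set; } }
class X2 : X { public int ID2 { get; set; } }

// Treats two X instances as equal when their ID values match.
class XComparer : IEqualityComparer<X>
{
    public bool Equals(X a, X b)
    {
        if (object.ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        return a.ID == b.ID;
    }

    public int GetHashCode(X x) => x.ID.GetHashCode();
}

static class ComparerDemo
{
    // Assumed sample data: ID 1 occurs in both lists.
    static List<X1> First() => new List<X1>
        { new X1 { ID = 1, ID1 = 10 }, new X1 { ID = 10, ID1 = 100 } };

    static List<X2> Second() => new List<X2>
        { new X2 { ID = 1, ID2 = 20 }, new X2 { ID = 20, ID2 = 200 } };

    // Union with the comparer drops the second item whose ID is 1.
    public static int UnionCount() =>
        First().Cast<X>().Union(Second().Cast<X>(), new XComparer()).Count();

    // Concat never removes anything.
    public static int ConcatCount() =>
        First().Cast<X>().Concat(Second().Cast<X>()).Count();

    static void Main()
    {
        Console.WriteLine(UnionCount());  // 3
        Console.WriteLine(ConcatCount()); // 4
    }
}
```

A comparer keeps the equality rule out of the X class itself, which is useful when different call sites need different notions of "same item". On .NET 6 or later, UnionBy with a key selector such as x => x.ID is another option that avoids writing a comparer class.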

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Prasad Kanaparthi | View Question on Stack Overflow
Solution 1 - C# | Sergey Berezovskiy | View Answer on Stack Overflow
Solution 2 - C# | Rawling | View Answer on Stack Overflow
Solution 3 - C# | Tim Schmelter | View Answer on Stack Overflow