Distinct not working with LINQ to Objects

C#.NetLinqIequatableIequalitycomparer

C# Problem Overview


class Program
{
    static void Main(string[] args)
    {
        List<Book> books = new List<Book> 
        {
            new Book
            {
                Name="C# in Depth",
                Authors = new List<Author>
                {
                    new Author 
                    {
                        FirstName = "Jon", LastName="Skeet"
                    },
                     new Author 
                    {
                        FirstName = "Jon", LastName="Skeet"
                    },                       
                }
            },
            new Book
            {
                Name="LINQ in Action",
                Authors = new List<Author>
                {
                    new Author 
                    {
                        FirstName = "Fabrice", LastName="Marguerie"
                    },
                     new Author 
                    {
                        FirstName = "Steve", LastName="Eichert"
                    },
                     new Author 
                    {
                        FirstName = "Jim", LastName="Wooley"
                    },
                }
            },
        };


        var temp = books.SelectMany(book => book.Authors).Distinct();
        foreach (var author in temp)
        {
            Console.WriteLine(author.FirstName + " " + author.LastName);
        }

        Console.Read();
    }

}
public class Book
{
    public string Name { get; set; }
    public List<Author> Authors { get; set; }
}
public class Author
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public override bool Equals(object obj)
    {
        return true;
        //if (obj.GetType() != typeof(Author)) return false;
        //else return ((Author)obj).FirstName == this.FirstName && ((Author)obj).FirstName == this.LastName;
    }

}

This is based on an example in "LINQ in Action". Listing 4.16.

This prints Jon Skeet twice. Why? I have even tried overriding Equals method in Author class. Still Distinct does not seem to work. What am I missing?

Edit: I have added == and != operator overload too. Still no help.

 public static bool operator ==(Author a, Author b)
    {
        return true;
    }
    public static bool operator !=(Author a, Author b)
    {
        return false;
    }

C# Solutions


Solution 1 - C#

LINQ Distinct is not that smart when it comes to custom objects.

All it does is look at your list and see that it has two different objects (it doesn't care that they have the same values for the member fields).

One workaround is to implement the IEquatable interface as shown here.

If you modify your Author class like so it should work.

public class Author : IEquatable<Author>
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public bool Equals(Author other)
    {
        if (FirstName == other.FirstName && LastName == other.LastName)
            return true;

        return false;
    }

    public override int GetHashCode()
    {
        int hashFirstName = FirstName == null ? 0 : FirstName.GetHashCode();
        int hashLastName = LastName == null ? 0 : LastName.GetHashCode();

        return hashFirstName ^ hashLastName;
    }
}

Try it as DotNetFiddle

Solution 2 - C#

The Distinct() method checks reference equality for reference types. This means it is looking for literally the same object duplicated, not different objects which contain the same values.

There is an overload which takes an IEqualityComparer, so you can specify different logic for determining whether a given object equals another.

If you want Author to normally behave like a normal object (i.e. only reference equality), but for the purposes of Distinct check equality by name values, use an IEqualityComparer. If you always want Author objects to be compared based on the name values, then override GetHashCode and Equals, or implement IEquatable.

The two members on the IEqualityComparer interface are Equals and GetHashCode. Your logic for determining whether two Author objects are equal appears to be if the First and Last name strings are the same.

public class AuthorEquals : IEqualityComparer<Author>
{
    public bool Equals(Author left, Author right)
    {
        if((object)left == null && (object)right == null)
        {
            return true;
        }
        if((object)left == null || (object)right == null)
        {
            return false;
        }
        return left.FirstName == right.FirstName && left.LastName == right.LastName;
    }

    public int GetHashCode(Author author)
    {
        return (author.FirstName + author.LastName).GetHashCode();
    }
}

Solution 3 - C#

Another solution without implementing IEquatable, Equals and GetHashCode is to use the LINQs GroupBy method and to select the first item from the IGrouping.

var temp = books.SelectMany(book => book.Authors)
                .GroupBy (y => y.FirstName + y.LastName )
                .Select (y => y.First ());

foreach (var author in temp){
  Console.WriteLine(author.FirstName + " " + author.LastName);
}

Solution 4 - C#

There is one more way to get distinct values from list of user defined data type:

YourList.GroupBy(i => i.Id).Select(i => i.FirstOrDefault()).ToList();

Surely, it will give distinct set of data

Solution 5 - C#

Distinct() performs the default equality comparison on objects in the enumerable. If you have not overridden Equals() and GetHashCode(), then it uses the default implementation on object, which compares references.

The simple solution is to add a correct implementation of Equals() and GetHashCode() to all classes which participate in the object graph you are comparing (ie Book and Author).

The IEqualityComparer interface is a convenience that allows you to implement Equals() and GetHashCode() in a separate class when you don't have access to the internals of the classes you need to compare, or if you are using a different method of comparison.

Solution 6 - C#

You've overriden Equals(), but make sure you also override GetHashCode()

Solution 7 - C#

The Above answers are wrong!!! Distinct as stated on MSDN returns the default Equator which as stated The Default property checks whether type T implements the System.IEquatable interface and, if so, returns an EqualityComparer that uses that implementation. Otherwise, it returns an EqualityComparer that uses the overrides of Object.Equals and Object.GetHashCode provided by T

Which means as long as you overide Equals you are fine.

The reason you're code is not working is because you check firstname==lastname.

see https://msdn.microsoft.com/library/bb348436(v=vs.100).aspx and https://msdn.microsoft.com/en-us/library/ms224763(v=vs.100).aspx

Solution 8 - C#

You can achieve this several ways:

1. You may to implement the IEquatable interface as shown Enumerable.Distinct Method or you can see @skalb's answer at this post

2. If your object has not unique key, You can use GroupBy method for achive distinct object list, that you must group object's all properties and after select first object.

For example like as below and working for me:

var distinctList= list.GroupBy(x => new {
                            Name= x.Name,
                            Phone= x.Phone,
                            Email= x.Email,
                            Country= x.Country
                        }, y=> y)
                       .Select(x => x.First())
                       .ToList()

MyObject class is like as below:

public class MyClass{
       public string Name{get;set;}
       public string Phone{get;set;}
       public string Email{get;set;}
       public string Country{get;set;}
}

3. If your object's has unique key, you can only use the it in group by.

For example my object's unique key is Id.

var distinctList= list.GroupBy(x =>x.Id)
                      .Select(x => x.First())
                      .ToList()

Solution 9 - C#

You can use extension method on list which checks uniqueness based on computed Hash. You can also change extension method to support IEnumerable.

Example:

public class Employee{
public string Name{get;set;}
public int Age{get;set;}
}
 
List<Employee> employees = new List<Employee>();
employees.Add(new Employee{Name="XYZ", Age=30});
employees.Add(new Employee{Name="XYZ", Age=30});

employees = employees.Unique(); //Gives list which contains unique objects. 

Extension Method:

    public static class LinqExtension
        {
            public static List<T> Unique<T>(this List<T> input)
            {
                HashSet<string> uniqueHashes = new HashSet<string>();
                List<T> uniqueItems = new List<T>();
    
                input.ForEach(x =>
                {
                    string hashCode = ComputeHash(x);
    
                    if (uniqueHashes.Contains(hashCode))
                    {
                        return;
                    }
    
                    uniqueHashes.Add(hashCode);
                    uniqueItems.Add(x);
                });
    
                return uniqueItems;
            }
    
            private static string ComputeHash<T>(T entity)
            {
                System.Security.Cryptography.SHA1CryptoServiceProvider sh = new System.Security.Cryptography.SHA1CryptoServiceProvider();
                string input = JsonConvert.SerializeObject(entity);
    
                byte[] originalBytes = ASCIIEncoding.Default.GetBytes(input);
                byte[] encodedBytes = sh.ComputeHash(originalBytes);
    
                return BitConverter.ToString(encodedBytes).Replace("-", "");
            }

Solution 10 - C#

The Equal operator in below code is incorrect.

Old

public bool Equals(Author other)
{
    if (FirstName == other.FirstName && LastName == other.LastName)
        return true;

    return false;
}

NEW

public override bool Equals(Object obj)
{
    var other = obj as Author;

    if (other is null)
    {
        return false;
    }

    if (FirstName == other.FirstName && LastName == other.LastName)
        return true;

    return false;
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionTanmoyView Question on Stackoverflow
Solution 1 - C#skalbView Answer on Stackoverflow
Solution 2 - C#Rex MView Answer on Stackoverflow
Solution 3 - C#JehofView Answer on Stackoverflow
Solution 4 - C#Ashu_90View Answer on Stackoverflow
Solution 5 - C#AndyMView Answer on Stackoverflow
Solution 6 - C#Eric KingView Answer on Stackoverflow
Solution 7 - C#AlexView Answer on Stackoverflow
Solution 8 - C#Ramil AliyevView Answer on Stackoverflow
Solution 9 - C#chindirala sampath kumarView Answer on Stackoverflow
Solution 10 - C#Anil DhandarView Answer on Stackoverflow