Remove duplicates in the list using LINQ

C#, Linq, Linq-to-Objects, Generic List

C# Problem Overview


I have a class Items with properties (Id, Name, Code, Price).

The List of Items is populated with duplicated items.

For example:

1         Item1       IT00001        $100
2         Item2       IT00002        $200
3         Item3       IT00003        $150
1         Item1       IT00001        $100
3         Item3       IT00003        $150

How can I remove the duplicates in the list using LINQ?

C# Solutions


Solution 1 - C#

var distinctItems = items.GroupBy(x => x.Id).Select(y => y.First());
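
As a rough, self-contained sketch of this approach using the sample rows from the question: the shape of the Item class, the decimal price type, and the names GroupByDemo, SampleItems and DistinctById are assumptions made for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the class; only the property names come from the question.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Code { get; set; }
    public decimal Price { get; set; }
}

public static class GroupByDemo
{
    // Sample data mirroring the five rows listed in the question.
    public static List<Item> SampleItems()
    {
        return new List<Item>
        {
            new Item { Id = 1, Name = "Item1", Code = "IT00001", Price = 100m },
            new Item { Id = 2, Name = "Item2", Code = "IT00002", Price = 200m },
            new Item { Id = 3, Name = "Item3", Code = "IT00003", Price = 150m },
            new Item { Id = 1, Name = "Item1", Code = "IT00001", Price = 100m },
            new Item { Id = 3, Name = "Item3", Code = "IT00003", Price = 150m }
        };
    }

    // Groups the items by Id and keeps the first item of each group.
    public static List<Item> DistinctById(IEnumerable<Item> items)
    {
        return items.GroupBy(x => x.Id).Select(g => g.First()).ToList();
    }

    public static void Main()
    {
        foreach (var item in DistinctById(SampleItems()))
        {
            Console.WriteLine($"{item.Id} {item.Name} {item.Code} {item.Price}");
        }
    }
}
```

Because the grouping key is Id alone, items that share an Id but differ in other properties collapse into a single result.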

Solution 2 - C#

var distinctItems = items.Distinct();

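Note that Distinct() with no arguments uses the default equality comparer, which compares references for a class that does not override Equals and GetHashCode. To control which properties are compared, create a custom equality comparer, e.g.: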

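class DistinctItemComparer : IEqualityComparer<Item> {

    // Two items are equal when all four properties match.
    public bool Equals(Item x, Item y) {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id &&
            x.Name == y.Name &&
            x.Code == y.Code &&
            x.Price == y.Price;
    }

    // Combine the hashes of the same properties that Equals compares;
    // a null string contributes 0 instead of throwing.
    public int GetHashCode(Item obj) {
        return obj.Id.GetHashCode() ^
            (obj.Name == null ? 0 : obj.Name.GetHashCode()) ^
            (obj.Code == null ? 0 : obj.Code.GetHashCode()) ^
            obj.Price.GetHashCode();
    }
}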

Then use it like this:

var distinctItems = items.Distinct(new DistinctItemComparer());

Solution 3 - C#

If there is something that is throwing off your Distinct query, you might want to look at MoreLinq and use the DistinctBy operator and select distinct objects by id.

var distinct = items.DistinctBy( i => i.Id );
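
For readers on newer runtimes, a similar operator ships in System.Linq itself. The following is a minimal sketch that assumes .NET 6 or later, where Enumerable.DistinctBy is available without MoreLinq; the reduced Item class and the names BuiltInDistinctByDemo, SampleItems and DistinctById are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced, assumed version of the question's class.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class BuiltInDistinctByDemo
{
    public static List<Item> SampleItems()
    {
        return new List<Item>
        {
            new Item { Id = 1, Name = "Item1" },
            new Item { Id = 2, Name = "Item2" },
            new Item { Id = 3, Name = "Item3" },
            new Item { Id = 1, Name = "Item1" },
            new Item { Id = 3, Name = "Item3" }
        };
    }

    // Uses the DistinctBy method built into System.Linq (assumes .NET 6 or later).
    public static List<Item> DistinctById(IEnumerable<Item> items)
    {
        return items.DistinctBy(i => i.Id).ToList();
    }

    public static void Main()
    {
        foreach (var item in DistinctById(SampleItems()))
        {
            Console.WriteLine($"{item.Id} {item.Name}");
        }
    }
}
```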

Solution 4 - C#

This is how I was able to group by with Linq. Hope it helps.

var query = collection.GroupBy(x => x.title).Select(y => y.FirstOrDefault());

Solution 5 - C#

A universal extension method:

public static class EnumerableExtensions
{
    public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> enumerable, Func<T, TKey> keySelector)
    {
        return enumerable.GroupBy(keySelector).Select(grp => grp.First());
    }
}

Example of usage:

var lstDst = lst.DistinctBy(item => item.Key);

Solution 6 - C#

You have three options for removing duplicate items from your List:

  1. Use a custom equality comparer and then use Distinct(new DistinctItemComparer()) as @Christian Hayter mentioned.

  2. Use GroupBy, but note that you should group by all of the properties, because grouping by Id alone does not always give the right result. For example, consider the following:

    List<Item> a = new List<Item>
    {
        new Item {Id = 1, Name = "Item1", Code = "IT00001", Price = 100},
        new Item {Id = 2, Name = "Item2", Code = "IT00002", Price = 200},
        new Item {Id = 3, Name = "Item3", Code = "IT00003", Price = 150},
        new Item {Id = 1, Name = "Item1", Code = "IT00001", Price = 100},
        new Item {Id = 3, Name = "Item3", Code = "IT00003", Price = 150},
        new Item {Id = 3, Name = "Item3", Code = "IT00004", Price = 250}
    };
    var distinctItems = a.GroupBy(x => x.Id).Select(y => y.First());
    

The result for this grouping will be:

    {Id = 1, Name = "Item1", Code = "IT00001", Price = 100}
    {Id = 2, Name = "Item2", Code = "IT00002", Price = 200}
    {Id = 3, Name = "Item3", Code = "IT00003", Price = 150}

This is incorrect because it treats {Id = 3, Name = "Item3", Code = "IT00004", Price = 250} as a duplicate. The correct query would be:

    var distinctItems = a.GroupBy(c => new { c.Id , c.Name , c.Code , c.Price})
                         .Select(c => c.First()).ToList();

  3. Override Equals and GetHashCode in the Item class:

    public class Item
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Code { get; set; }
        public int Price { get; set; }

        public override bool Equals(object obj)
        {
            if (!(obj is Item))
                return false;
            Item p = (Item)obj;
            return (p.Id == Id && p.Name == Name && p.Code == Code && p.Price == Price);
        }
        public override int GetHashCode()
        {
            return String.Format("{0}|{1}|{2}|{3}", Id, Name, Code, Price).GetHashCode();
        }
    }

Then you can use it like this:

    var distinctItems = a.Distinct();
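
As a variation on the third option, here is a hedged sketch that implements IEquatable<Item> and builds the hash with HashCode.Combine instead of formatting a string. It assumes a runtime that provides System.HashCode (.NET Core 2.1 or later, or .NET Standard 2.1); the EqualityDemo class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Item : IEquatable<Item>
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Code { get; set; }
    public int Price { get; set; }

    // Value equality over all four properties.
    public bool Equals(Item other)
    {
        if (other is null) return false;
        return Id == other.Id && Name == other.Name
            && Code == other.Code && Price == other.Price;
    }

    public override bool Equals(object obj) => Equals(obj as Item);

    // Hashes the same properties that Equals compares; null strings are accepted.
    public override int GetHashCode() => HashCode.Combine(Id, Name, Code, Price);
}

public static class EqualityDemo
{
    // The six items from the example above, including the Id = 3 item with a different Code.
    public static List<Item> SampleItems()
    {
        return new List<Item>
        {
            new Item { Id = 1, Name = "Item1", Code = "IT00001", Price = 100 },
            new Item { Id = 2, Name = "Item2", Code = "IT00002", Price = 200 },
            new Item { Id = 3, Name = "Item3", Code = "IT00003", Price = 150 },
            new Item { Id = 1, Name = "Item1", Code = "IT00001", Price = 100 },
            new Item { Id = 3, Name = "Item3", Code = "IT00003", Price = 150 },
            new Item { Id = 3, Name = "Item3", Code = "IT00004", Price = 250 }
        };
    }

    public static List<Item> DistinctItems()
    {
        // Distinct() picks up the overridden equality through the default comparer.
        return SampleItems().Distinct().ToList();
    }

    public static void Main()
    {
        foreach (var item in DistinctItems())
        {
            Console.WriteLine($"{item.Id} {item.Name} {item.Code} {item.Price}");
        }
    }
}
```

With this data the two exact repeats are removed, while the Id = 3 item with Code IT00004 is kept because it differs in Code and Price.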

Solution 7 - C#

Use Distinct() but keep in mind that it uses the default equality comparer to compare values, so if you want anything beyond that you need to implement your own comparer.

Please see http://msdn.microsoft.com/en-us/library/bb348436.aspx for an example.
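
To illustrate what the default equality comparer does, here is a small sketch contrasting a plain class with a record. It assumes C# 9 or later on .NET 5 or later for the record type; PlainItem, RecordItem and DefaultComparerDemo are hypothetical names introduced only for this example.

```csharp
using System;
using System.Linq;

// Plain class: Equals and GetHashCode are not overridden, so equality is by reference.
public class PlainItem
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Record: the compiler generates value-based Equals and GetHashCode.
public record RecordItem(int Id, string Name);

public static class DefaultComparerDemo
{
    public static int CountDistinctPlain()
    {
        var items = new[]
        {
            new PlainItem { Id = 1, Name = "Item1" },
            new PlainItem { Id = 1, Name = "Item1" }
        };
        // Both objects are separate instances, so neither is treated as a duplicate.
        return items.Distinct().Count();
    }

    public static int CountDistinctRecords()
    {
        var items = new[]
        {
            new RecordItem(1, "Item1"),
            new RecordItem(1, "Item1")
        };
        // The two records hold equal values, so one of them is removed.
        return items.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctPlain());   // prints 2
        Console.WriteLine(CountDistinctRecords()); // prints 1
    }
}
```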

Solution 8 - C#

Try this extension method out. Hopefully this could help.

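public static class DistinctHelper
{
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        // The set is created inside the iterator, so every enumeration starts with an empty set.
        var identifiedKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            // Add returns false when the key has already been seen.
            if (identifiedKeys.Add(keySelector(element)))
                yield return element;
        }
    }
}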

Usage:

var outputList = sourceList.DistinctBy(x => x.TargetProperty);

Solution 9 - C#

List<Employee> employees = new List<Employee>()
{
    new Employee{Id =1,Name="AAAAA"}
    , new Employee{Id =2,Name="BBBBB"}
    , new Employee{Id =3,Name="AAAAA"}
    , new Employee{Id =4,Name="CCCCC"}
    , new Employee{Id =5,Name="AAAAA"}
};

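// Group by Name and keep the first employee of each group (Ids 1, 2 and 4), then
// use Except to get the remaining, repeated entries (Ids 3 and 5).
List<Employee> duplicateEmployees = employees.Except(employees.GroupBy(i => i.Name)
                                             .Select(ss => ss.FirstOrDefault()))
                                            .ToList();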

Solution 10 - C#

Another workaround, not beautiful but workable.

I have an XML file with an element called "MEMDES" with two attributes, "GRADE" and "SPD", to record the RAM module information. There are a lot of duplicate items in SPD.

So here is the code I use to remove the duplicated items:

        IEnumerable<XElement> MList =
            from RAMList in PREF.Descendants("MEMDES")
            where (string)RAMList.Attribute("GRADE") == "DDR4"
            select RAMList;

        List<string> sellist = new List<string>();

        foreach (var MEMList in MList)
        {
            sellist.Add((string)MEMList.Attribute("SPD").Value);
        }

        foreach (string slist in sellist.Distinct())
        {
            comboBox1.Items.Add(slist);
        }

Solution 11 - C#

When you don't want to write an IEqualityComparer, you can try something like the following.

class Program
{
    private static void Main(string[] args)
    {
        var items = new List<Item>();
        items.Add(new Item {Id = 1, Name = "Item1"});
        items.Add(new Item {Id = 2, Name = "Item2"});
        items.Add(new Item {Id = 3, Name = "Item3"});
        items.Add(new Item {Id = 4, Name = "Item4"});

        // Duplicate items
        items.Add(new Item {Id = 2, Name = "Item2"});
        items.Add(new Item {Id = 3, Name = "Item3"});

        // Anonymous types compare by value, so Distinct() removes the duplicates;
        // the final Select maps the results back to Item.
        var res = items.Select(i => new {i.Id, i.Name})
            .Distinct().Select(x => new Item {Id = x.Id, Name = x.Name}).ToList();

        // res now contains the four distinct records
    }
}


public class Item
{
    public int Id { get; set; }

    public string Name { get; set; }
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type       Original Author     Original Content on Stackoverflow
Question           Prasad              View Question on Stackoverflow
Solution 1 - C#    Freddy              View Answer on Stackoverflow
Solution 2 - C#    Christian Hayter    View Answer on Stackoverflow
Solution 3 - C#    tvanfosson          View Answer on Stackoverflow
Solution 4 - C#    Victor Juri         View Answer on Stackoverflow
Solution 5 - C#    TOL                 View Answer on Stackoverflow
Solution 6 - C#    Salah Akbari        View Answer on Stackoverflow
Solution 7 - C#    Brian Rasmussen     View Answer on Stackoverflow
Solution 8 - C#    Kent Aguilar        View Answer on Stackoverflow
Solution 9 - C#    Arun Kumar          View Answer on Stackoverflow
Solution 10 - C#   Rex Hsu             View Answer on Stackoverflow
Solution 11 - C#   Kundan Bhati        View Answer on Stackoverflow