C# Determine Duplicate in List
C#LinqAlgorithmListGenericsC# Problem Overview
Requirement: In an unsorted List, determine if a duplicate exists. The typical way I would do this is an n-squared nested loop. I'm wondering how others solve this. Is there an elegant, high performance method in Linq? Something generic that takes a lambda or a comparer would be nice.
C# Solutions
Solution 1 - C#
Unless I'm missing something, then you should be able to get away with something simple using Distinct()
. Granted it won't be the most complex implementation you could come up with, but it will tell you if any duplicates get removed:
var list = new List<string>();
// Fill the list
if(list.Count != list.Distinct().Count())
{
// Duplicates exist
}
Solution 2 - C#
According to Eric White's article on how to Find Duplicates using LINQ:
> An easy way to find duplicates is to write a query that groups by the identifier, and then filter for groups that have more than one member. In the following example, we want to know that 4 and 3 are duplicates: > > > > int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 }; > var duplicates = listOfItems > .GroupBy(i => i) > .Where(g => g.Count() > 1) > .Select(g => g.Key); > foreach (var d in duplicates) > Console.WriteLine(d); // 4,3
Solution 3 - C#
In order to allow short circuiting if the duplicate exists early in the list, you can add a HashSet<T>
and check the return value of its .Add
method.
By using .Any
you can short circuit the enumeration as soon as you find a duplicate.
Here's a LINQ extension method in both C# and VB:
CSharp:
public static bool ContainsDuplicates<T>(this IEnumerable<T> enumerable)
{
var knownKeys = new HashSet<T>();
return enumerable.Any(item => !knownKeys.Add(item));
}
Visual Basic:
<Extension>
Public Function ContainsDuplicates(Of T)(ByVal enumerable As IEnumerable(Of T)) As Boolean
Dim knownKeys As New HashSet(Of T)
Return enumerable.Any(Function(item) Not knownKeys.Add(item))
End Function
Note: to check if there are no duplicates, just change Any
to All
Solution 4 - C#
Place all items in a set and if the count of the set is different from the count of the list then there is a duplicate.
bool hasDuplicates<T>(List<T> myList) {
var hs = new HashSet<T>();
for (var i = 0; i < myList.Count; ++i) {
if (!hs.Add(myList[i])) return true;
}
return false;
}
Should be more efficient than Distinct as there is no need to go through all the list.
Solution 5 - C#
You can use IEnumerable
var list = new List<string> {"1", "2","3", "1", "2"};
var hasDuplicates = list.GroupBy(x => x).Any(x => x.Skip(1).Any());
Solution 6 - C#
Something along these lines is relatively simple and will provide you with a count of duplicates.
var something = new List<string>() { "One", "One", "Two", "Three" };
var dictionary = new Dictionary<string, int>();
something.ForEach(s =>
{
if (dictionary.ContainsKey(s))
{
dictionary[s]++;
}
else
{
dictionary[s] = 1;
}
});
I imagine this is similar to the implementation of Distinct, although I'm not certain.
Solution 7 - C#
You could use the Distinct() extension method for IEnumerable
Solution 8 - C#
If you are using integers or well ordered sets, use a binary tree for O(nlog n) performance.
Alternatively, find another faster means of sorting, then simply check that every value is different than the previous one.
Solution 9 - C#
Use Enumerable.Any
with HashSet.Add
like:
List<string> list = new List<string> {"A", "A", "B", "C", "D"};
HashSet<string> hashSet = new HashSet<string>();
if(list.Any(r => !hashSet.Add(r)))
{
//duplicate exists.
}
HashSet.Add
would return false
if the item already exist in the HashSet
. This will not iterate the whole list.
Solution 10 - C#
You could use Distinct()
statement to find unique records. Then compare with original generic list like this:
if (dgCoil.ItemsSource.Cast<BLL.Coil>().ToList().Count != dgCoil.ItemsSource.Cast<BLL.Coil>().Select(c => c.CoilNo).Distinct().Count())
{
//Duplicate detected !!
return;
}
Solution 11 - C#
Not seen anybody do this yet so here is a little program I just wrote. It's simple enough. Using Contains(), though I don't know how scalable this method is.
Console.WriteLine("Please enter 5 unique numbers....");
List<int> uniqueNums = new List<int>() { };
while (uniqueNums.Count < 5)
{
int input = Convert.ToInt32(Console.ReadLine());
if (uniqueNums.Contains(input))
{
Console.WriteLine("Add a different number");
}
uniqueNums.Add(input);
}
uniqueNums.Sort();
foreach (var n in uniqueNums)
{
Console.WriteLine(n);
}