Wrap a delegate in an IEqualityComparer

.NetLinqDelegates

.Net Problem Overview


Several Linq.Enumerable functions take an IEqualityComparer<T>. Is there a convenient wrapper class that adapts a delegate(T,T)=>bool to implement IEqualityComparer<T>? It's easy enough to write one (if your ignore problems with defining a correct hashcode), but I'd like to know if there is an out-of-the-box solution.

Specifically, I want to do set operations on Dictionarys, using only the Keys to define membership (while retaining the values according to different rules).

.Net Solutions


Solution 1 - .Net

On the importance of GetHashCode

Others have already commented on the fact that any custom IEqualityComparer<T> implementation should really include a GetHashCode method; but nobody's bothered to explain why in any detail.

Here's why. Your question specifically mentions the LINQ extension methods; nearly all of these rely on hash codes to work properly, because they utilize hash tables internally for efficiency.

Take Distinct, for example. Consider the implications of this extension method if all it utilized were an Equals method. How do you determine whether an item's already been scanned in a sequence if you only have Equals? You enumerate over the entire collection of values you've already looked at and check for a match. This would result in Distinct using a worst-case O(N2) algorithm instead of an O(N) one!

Fortunately, this isn't the case. Distinct doesn't just use Equals; it uses GetHashCode as well. In fact, it absolutely does not work properly without an IEqualityComparer<T> that supplies a proper GetHashCode. Below is a contrived example illustrating this.

Say I have the following type:

class Value
{
    public string Name { get; private set; }
    public int Number { get; private set; }
    
    public Value(string name, int number)
    {
        Name = name;
        Number = number;
    }
    
    public override string ToString()
    {
        return string.Format("{0}: {1}", Name, Number);
    }
}

Now say I have a List<Value> and I want to find all of the elements with a distinct name. This is a perfect use case for Distinct using a custom equality comparer. So let's use the Comparer<T> class from Aku's answer:

var comparer = new Comparer<Value>((x, y) => x.Name == y.Name);

Now, if we have a bunch of Value elements with the same Name property, they should all collapse into one value returned by Distinct, right? Let's see...

var values = new List<Value>();

var random = new Random();
for (int i = 0; i < 10; ++i)
{
    values.Add("x", random.Next());
}

var distinct = values.Distinct(comparer);

foreach (Value x in distinct)
{
    Console.WriteLine(x);
}

Output:

x: 1346013431
x: 1388845717
x: 1576754134
x: 1104067189
x: 1144789201
x: 1862076501
x: 1573781440
x: 646797592
x: 655632802
x: 1206819377

Hmm, that didn't work, did it?

What about GroupBy? Let's try that:

var grouped = values.GroupBy(x => x, comparer);

foreach (IGrouping<Value> g in grouped)
{
    Console.WriteLine("[KEY: '{0}']", g);
    foreach (Value x in g)
    {
        Console.WriteLine(x);
    }
}

Output:

[KEY = 'x: 1346013431']
x: 1346013431
[KEY = 'x: 1388845717']
x: 1388845717
[KEY = 'x: 1576754134']
x: 1576754134
[KEY = 'x: 1104067189']
x: 1104067189
[KEY = 'x: 1144789201']
x: 1144789201
[KEY = 'x: 1862076501']
x: 1862076501
[KEY = 'x: 1573781440']
x: 1573781440
[KEY = 'x: 646797592']
x: 646797592
[KEY = 'x: 655632802']
x: 655632802
[KEY = 'x: 1206819377']
x: 1206819377

Again: didn't work.

If you think about it, it would make sense for Distinct to use a HashSet<T> (or equivalent) internally, and for GroupBy to use something like a Dictionary<TKey, List<T>> internally. Could this explain why these methods don't work? Let's try this:

var uniqueValues = new HashSet<Value>(values, comparer);

foreach (Value x in uniqueValues)
{
    Console.WriteLine(x);
}

Output:

x: 1346013431
x: 1388845717
x: 1576754134
x: 1104067189
x: 1144789201
x: 1862076501
x: 1573781440
x: 646797592
x: 655632802
x: 1206819377

Yeah... starting to make sense?

Hopefully from these examples it's clear why including an appropriate GetHashCode in any IEqualityComparer<T> implementation is so important.


Original answer

Expanding on orip's answer:

There are a couple of improvements that can be made here.

  1. First, I'd take a Func<T, TKey> instead of Func<T, object>; this will prevent boxing of value type keys in the actual keyExtractor itself.
  2. Second, I'd actually add a where TKey : IEquatable<TKey> constraint; this will prevent boxing in the Equals call (object.Equals takes an object parameter; you need an IEquatable<TKey> implementation to take a TKey parameter without boxing it). Clearly this may pose too severe a restriction, so you could make a base class without the constraint and a derived class with it.

Here's what the resulting code might look like:

public class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    protected readonly Func<T, TKey> keyExtractor;

    public KeyEqualityComparer(Func<T, TKey> keyExtractor)
    {
        this.keyExtractor = keyExtractor;
    }

    public virtual bool Equals(T x, T y)
    {
        return this.keyExtractor(x).Equals(this.keyExtractor(y));
    }

    public int GetHashCode(T obj)
    {
        return this.keyExtractor(obj).GetHashCode();
    }
}

public class StrictKeyEqualityComparer<T, TKey> : KeyEqualityComparer<T, TKey>
    where TKey : IEquatable<TKey>
{
    public StrictKeyEqualityComparer(Func<T, TKey> keyExtractor)
        : base(keyExtractor)
    { }

    public override bool Equals(T x, T y)
    {
        // This will use the overload that accepts a TKey parameter
        // instead of an object parameter.
        return this.keyExtractor(x).Equals(this.keyExtractor(y));
    }
}

Solution 2 - .Net

When you want to customize equality checking, 99% of the time you're interested in defining the keys to compare by, not the comparison itself.

This could be an elegant solution (concept from Python's http://wiki.python.org/moin/HowTo/Sorting#Sortingbykeys">list sort method).

Usage:

var foo = new List<string> { "abc", "de", "DE" };

// case-insensitive distinct
var distinct = foo.Distinct(new KeyEqualityComparer<string>( x => x.ToLower() ) );

The KeyEqualityComparer class:

public class KeyEqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, object> keyExtractor;

    public KeyEqualityComparer(Func<T,object> keyExtractor)
    {
        this.keyExtractor = keyExtractor;
    }

    public bool Equals(T x, T y)
    {
        return this.keyExtractor(x).Equals(this.keyExtractor(y));
    }

    public int GetHashCode(T obj)
    {
        return this.keyExtractor(obj).GetHashCode();
    }
}

Solution 3 - .Net

I'm afraid there is no such wrapper out-of-box. However it's not hard to create one:

class Comparer<T>: IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _comparer;

    public Comparer(Func<T, T, bool> comparer)
    {
        if (comparer == null)
            throw new ArgumentNullException("comparer");

        _comparer = comparer;
    }

    public bool Equals(T x, T y)
    {
        return _comparer(x, y);
    }

    public int GetHashCode(T obj)
    {
        return obj.ToString().ToLower().GetHashCode();
    }
}

...

Func<int, int, bool> f = (x, y) => x == y;
var comparer = new Comparer<int>(f);
Console.WriteLine(comparer.Equals(1, 1));
Console.WriteLine(comparer.Equals(1, 2));

Solution 4 - .Net

Ordinarily, I'd get this resolved by commenting @Sam on the answer (I've done some editing on the original post to clean it up a bit without altering the behavior.)

The following is my riff of @Sam's answer, with a [IMNSHO] critical fix to the default hashing policy:-

class FuncEqualityComparer<T> : IEqualityComparer<T>
{
    readonly Func<T, T, bool> _comparer;
    readonly Func<T, int> _hash;

    public FuncEqualityComparer( Func<T, T, bool> comparer )
        : this( comparer, t => 0 ) // NB Cannot assume anything about how e.g., t.GetHashCode() interacts with the comparer's behavior
    {
    }

    public FuncEqualityComparer( Func<T, T, bool> comparer, Func<T, int> hash )
    {
        _comparer = comparer;
        _hash = hash;
    }

    public bool Equals( T x, T y )
    {
        return _comparer( x, y );
    }

    public int GetHashCode( T obj )
    {
        return _hash( obj );
    }
}

Solution 5 - .Net

Same as Dan Tao's answer, but with a few improvements:

  1. Relies on EqualityComparer<>.Default to do the actual comparing so that it avoids boxing for value types (structs) that has implemented IEquatable<>.

  2. Since EqualityComparer<>.Default used it doesn't explode on null.Equals(something).

  3. Provided static wrapper around IEqualityComparer<> which will have a static method to create the instance of comparer - eases calling. Compare

     Equality<Person>.CreateComparer(p => p.ID);
    

    with

     new EqualityComparer<Person, int>(p => p.ID);
    
  4. Added an overload to specify IEqualityComparer<> for the key.

The class:

public static class Equality<T>
{
    public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector)
    {
        return CreateComparer(keySelector, null);
    }

    public static IEqualityComparer<T> CreateComparer<V>(Func<T, V> keySelector, 
                                                         IEqualityComparer<V> comparer)
    {
        return new KeyEqualityComparer<V>(keySelector, comparer);
    }

    class KeyEqualityComparer<V> : IEqualityComparer<T>
    {
        readonly Func<T, V> keySelector;
        readonly IEqualityComparer<V> comparer;

        public KeyEqualityComparer(Func<T, V> keySelector, 
                                   IEqualityComparer<V> comparer)
        {
            if (keySelector == null)
                throw new ArgumentNullException("keySelector");

            this.keySelector = keySelector;
            this.comparer = comparer ?? EqualityComparer<V>.Default;
        }

        public bool Equals(T x, T y)
        {
            return comparer.Equals(keySelector(x), keySelector(y));
        }

        public int GetHashCode(T obj)
        {
            return comparer.GetHashCode(keySelector(obj));
        }
    }
}

you may use it like this:

var comparer1 = Equality<Person>.CreateComparer(p => p.ID);
var comparer2 = Equality<Person>.CreateComparer(p => p.Name);
var comparer3 = Equality<Person>.CreateComparer(p => p.Birthday.Year);
var comparer4 = Equality<Person>.CreateComparer(p => p.Name, StringComparer.CurrentCultureIgnoreCase);

Person is a simple class:

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
    public DateTime Birthday { get; set; }
}

Solution 6 - .Net

public class FuncEqualityComparer<T> : IEqualityComparer<T>
{
    readonly Func<T, T, bool> _comparer;
    readonly Func<T, int> _hash;

    public FuncEqualityComparer( Func<T, T, bool> comparer )
        : this( comparer, t => t.GetHashCode())
    {
    }

    public FuncEqualityComparer( Func<T, T, bool> comparer, Func<T, int> hash )
    {
        _comparer = comparer;
        _hash = hash;
    }

    public bool Equals( T x, T y )
    {
        return _comparer( x, y );
    }

    public int GetHashCode( T obj )
    {
        return _hash( obj );
    }
}

With extensions :-

public static class SequenceExtensions
{
    public static bool SequenceEqual<T>( this IEnumerable<T> first, IEnumerable<T> second, Func<T, T, bool> comparer )
    {
        return first.SequenceEqual( second, new FuncEqualityComparer<T>( comparer ) );
    }

    public static bool SequenceEqual<T>( this IEnumerable<T> first, IEnumerable<T> second, Func<T, T, bool> comparer, Func<T, int> hash )
    {
        return first.SequenceEqual( second, new FuncEqualityComparer<T>( comparer, hash ) );
    }
}

Solution 7 - .Net

orip's answer is great.

Here a little extension method to make it even easier:

public static IEnumerable<T> Distinct<T>(this IEnumerable<T> list, Func<T, object>    keyExtractor)
{
    return list.Distinct(new KeyEqualityComparer<T>(keyExtractor));
}
var distinct = foo.Distinct(x => x.ToLower())

Solution 8 - .Net

I'm going to answer my own question. To treat Dictionaries as sets, the simplest method seems to be to apply set operations to dict.Keys, then convert back to Dictionaries with Enumerable.ToDictionary(...).

Solution 9 - .Net

The implementation at (german text) Implementing IEqualityCompare with lambda expression cares about null values and uses extension methods to generate IEqualityComparer.

To create an IEqualityComparer in a Linq union your just have to write

persons1.Union(persons2, person => person.LastName)

The comparer:

public class LambdaEqualityComparer<TSource, TComparable> : IEqualityComparer<TSource>
{
  Func<TSource, TComparable> _keyGetter;
    
  public LambdaEqualityComparer(Func<TSource, TComparable> keyGetter)
  {
    _keyGetter = keyGetter;
  }
    
  public bool Equals(TSource x, TSource y)
  {
    if (x == null || y == null) return (x == null && y == null);
    return object.Equals(_keyGetter(x), _keyGetter(y));
  }
   
  public int GetHashCode(TSource obj)
  {
    if (obj == null) return int.MinValue;
    var k = _keyGetter(obj);
    if (k == null) return int.MaxValue;
    return k.GetHashCode();
  }
}

You also need to add an extension method to support type inference

public static class LambdaEqualityComparer
{
       // source1.Union(source2, lambda)
        public static IEnumerable<TSource> Union<TSource, TComparable>(
           this IEnumerable<TSource> source1, 
           IEnumerable<TSource> source2, 
            Func<TSource, TComparable> keySelector)
        {
            return source1.Union(source2, 
               new LambdaEqualityComparer<TSource, TComparable>(keySelector));
       }
   }

Solution 10 - .Net

Just one optimization: We can use the out-of-the-box EqualityComparer for value comparisions, rather than delegating it.

This would also make the implementation cleaner as actual comparision logic now stays in GetHashCode() and Equals() which you may have already overloaded.

Here is the code:

public class MyComparer<T> : IEqualityComparer<T> 
{ 
  public bool Equals(T x, T y) 
  { 
    return EqualityComparer<T>.Default.Equals(x, y); 
  } 
 
  public int GetHashCode(T obj) 
  { 
    return obj.GetHashCode(); 
  } 
} 

Don't forget to overload GetHashCode() and Equals() methods on your object.

This post helped me: https://stackoverflow.com/questions/488250/c-compare-two-generic-values

Sushil

Solution 11 - .Net

orip's answer is great. Expanding on orip's answer:

i think that the solution's key is use "Extension Method" to transfer the "anonymous type".

    public static class Comparer 
    {
      public static IEqualityComparer<T> CreateComparerForElements<T>(this IEnumerable<T> enumerable, Func<T, object> keyExtractor)
      {
        return new KeyEqualityComparer<T>(keyExtractor);
      }
    }

Usage:

var n = ItemList.Select(s => new { s.Vchr, s.Id, s.Ctr, s.Vendor, s.Description, s.Invoice }).ToList();
n.AddRange(OtherList.Select(s => new { s.Vchr, s.Id, s.Ctr, s.Vendor, s.Description, s.Invoice }).ToList(););
n = n.Distinct(x=>new{Vchr=x.Vchr,Id=x.Id}).ToList();
 

Solution 12 - .Net

public static Dictionary<TKey, TValue> Distinct<TKey, TValue>(this IEnumerable<TValue> items, Func<TValue, TKey> selector)
  {
     Dictionary<TKey, TValue> result = null;
     ICollection collection = items as ICollection;
     if (collection != null)
        result = new Dictionary<TKey, TValue>(collection.Count);
     else
        result = new Dictionary<TKey, TValue>();
     foreach (TValue item in items)
        result[selector(item)] = item;
     return result;
  }

This makes it possible to select a property with lambda like this: .Select(y => y.Article).Distinct(x => x.ArticleID);

Solution 13 - .Net

I don't know of an existing class but something like:

public class MyComparer<T> : IEqualityComparer<T>
{
  private Func<T, T, bool> _compare;
  MyComparer(Func<T, T, bool> compare)
  {
    _compare = compare;
  }

  public bool Equals(T x, Ty)
  {
    return _compare(x, y);
  }

  public int GetHashCode(T obj)
  {
    return obj.GetHashCode();
  }
}

Note: I haven't actually compiled and run this yet, so there might be a typo or other bug.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMarcelo CantosView Question on Stackoverflow
Solution 1 - .NetDan TaoView Answer on Stackoverflow
Solution 2 - .NetoripView Answer on Stackoverflow
Solution 3 - .NetakuView Answer on Stackoverflow
Solution 4 - .NetRuben BartelinkView Answer on Stackoverflow
Solution 5 - .Netldp615View Answer on Stackoverflow
Solution 6 - .NetSamView Answer on Stackoverflow
Solution 7 - .NetBrunoView Answer on Stackoverflow
Solution 8 - .NetMarcelo CantosView Answer on Stackoverflow
Solution 9 - .NetFriedView Answer on Stackoverflow
Solution 10 - .NetSushilView Answer on Stackoverflow
Solution 11 - .NetmatrixView Answer on Stackoverflow
Solution 12 - .NetMaxView Answer on Stackoverflow
Solution 13 - .NetGreggView Answer on Stackoverflow