GetHashCode override of object containing generic array

C#ArraysGenericsHashcode

C# Problem Overview


I have a class that contains the following two properties:

public int Id      { get; private set; }
public T[] Values  { get; private set; }

I have made it IEquatable<T> and overriden the object.Equals like this:

public override bool Equals(object obj)
{
    return Equals(obj as SimpleTableRow<T>);
}

public bool Equals(SimpleTableRow<T> other)
{
    // Check for null
    if(ReferenceEquals(other, null))
        return false;

    // Check for same reference
    if(ReferenceEquals(this, other))
        return true;

    // Check for same Id and same Values
    return Id == other.Id && Values.SequenceEqual(other.Values);
}

When having override object.Equals I must also override GetHashCode of course. But what code should I implement? How do I create a hashcode out of a generic array? And how do I combine it with the Id integer?

public override int GetHashCode()
{
    return // What?
}

C# Solutions


Solution 1 - C#

Because of the problems raised in this thread, I'm posting another reply showing what happens if you get it wrong... mainly, that you can't use the array's GetHashCode(); the correct behaviour is that no warnings are printed when you run it... switch the comments to fix it:

using System;
using System.Collections.Generic;
using System.Linq;
static class Program
{
    static void Main()
    {
        // first and second are logically equivalent
        SimpleTableRow<int> first = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6),
            second = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6);

        if (first.Equals(second) && first.GetHashCode() != second.GetHashCode())
        { // proven Equals, but GetHashCode() disagrees
            Console.WriteLine("We have a problem");
        }
        HashSet<SimpleTableRow<int>> set = new HashSet<SimpleTableRow<int>>();
        set.Add(first);
        set.Add(second);
        // which confuses anything that uses hash algorithms
        if (set.Count != 1) Console.WriteLine("Yup, very bad indeed");
    }
}
class SimpleTableRow<T> : IEquatable<SimpleTableRow<T>>
{
    
    public SimpleTableRow(int id, params T[] values) {
        this.Id = id;
        this.Values = values;
    }
    public int Id { get; private set; }
    public T[] Values { get; private set; }
    
    public override int GetHashCode() // wrong
    {
        return Id.GetHashCode() ^ Values.GetHashCode();
    }
    /*
    public override int GetHashCode() // right
    {
        int hash = Id;
        if (Values != null)
        {
            hash = (hash * 17) + Values.Length;
            foreach (T t in Values)
            {
                hash *= 17;
                if (t != null) hash = hash + t.GetHashCode();
            }
        }
        return hash;
    }
    */
    public override bool Equals(object obj)
    {
        return Equals(obj as SimpleTableRow<T>);
    }
    public bool Equals(SimpleTableRow<T> other)
    {
        // Check for null
        if (ReferenceEquals(other, null))
            return false;

        // Check for same reference
        if (ReferenceEquals(this, other))
            return true;

        // Check for same Id and same Values
        return Id == other.Id && Values.SequenceEqual(other.Values);
    }
}

Solution 2 - C#

FWIW, it's very dangerous to use the contents of the Values in your hash code. You should only do this if you can guarantee that it will never change. However, since it is exposed, I don't think guaranteeing it is possible. The hashcode of an object should never change. Otherwise, it loses its value as a key in a Hashtable or Dictionary. Consider the hard-to-find bug of using an object as a key in a Hashtable, its hashcode changes because of an outside influence and you can no longer find it in the Hashtable!

Solution 3 - C#

Since the hashCode is kinda a key for storing the object (lllike in a hashtable), i would use just Id.GetHashCode()

Solution 4 - C#

How about something like:

    public override int GetHashCode()
    {
        int hash = Id;
        if (Values != null)
        {
            hash = (hash * 17) + Values.Length;
            foreach (T t in Values)
            {
                hash *= 17;
                if (t != null) hash = hash + t.GetHashCode();
            }
        }
        return hash;
    }

This should be compatible with SequenceEqual, rather than doing a reference comparison on the array.

Solution 5 - C#

public override int GetHashCode() {
   return Id.GetHashCode() ^ Values.GetHashCode();  
}

There are several good points in the comments and other answers. The OP should consider whether the Values would be used as part of the "key" if the object were used as a key in a dictionary. If so, then they should be part of the hash code, otherwise, not.

On the other hand, I'm not sure why the GetHashCode method should mirror SequenceEqual. It's meant to compute an index into a hash table, not to be the complete determinant of equality. If there are many hash table collisions using the algorithm above, and if they differ in the sequence of the Values, then an algorithm should be chosen that takes sequence into account. If sequence doesn't really matter, save the time and don't take it into account.

Solution 6 - C#

I just had to add another answer because one of the more obvious (and easiest to implement) solutions were not mentioned - not including the collection in your GetHashCode calculation!

The main thing that seemed to have forgotten here is that the uniqueness from the result of GetHashCode isn't required (or in many cases even possible). Unequal objects don't have to return unequal hash codes, the only requirement is that equal objects return equal hash codes. So by that definition, the following implementation of GetHashCode is correct for all objects (assuming there's a correct Equals implementation):

public override int GetHashCode() 
{ 
    return 42; 
} 

Of course this would yield the worst possible performance in hashtable lookup, O(n) instead of O(1), but it is still functionally correct.

With that in mind, my general recommendation when implementing GetHashCode for an object that happens to have any kind of collection as one or more of its members is to simply ignore them and calculate GetHashCode solely based on the other scalar members. This would work pretty well except if you put into a hash table a huge number of objects where all their scalar members have identical values, resulting in identical hash codes.

Ignoring collection members when calculating the hash code can also yield a performance improvement, despite the decreased distribution of the hash code values. Remember that using a hash code is supposed to improve performance in a hash table by not requiring to call Equals N times, and instead will only require calling GetHashCode once and a quick hash table lookup. If each object has an inner array with 10,000 items which all participate in the calculation of the hash code, any benefits gained by the good distribution would probably be lost. It would be better to have a marginally less distributed hash code if generating it is considerably less costly.

Solution 7 - C#

I would do it this way:

long result = Id.GetHashCode();
foreach(T val in Values)
    result ^= val.GetHashCode();
return result;

Solution 8 - C#

Provided that Id and Values will never change, and Values is not null...

public override int GetHashCode()
{
  return Id ^ Values.GetHashCode();
}

Note that your class is not immutable, since anyone can modify the contents of Values because it is an array. Given that, I wouldn't try to generate a hashcode using its contents.

Solution 9 - C#

I know this thread is pretty old, but I wrote this method to allow me to calculate hashcodes of multiple objects. It's been very helpful for this very case. It's not perfect, but it does meet my needs and most likely yours too.

I can't really take any credit for it. I got the concept from some of the .net gethashcode implementations. I'm using 419 (afterall, it's my favorite large prime), but you can choose just about any reasonable prime (not too small . . . not too large).

So, here's how I get my hashcodes:

using System.Collections.Generic;
using System.Linq;

public static class HashCodeCalculator
{
    public static int CalculateHashCode(params object[] args)
    {
        return args.CalculateHashCode();
    }

    public static int CalculateHashCode(this IEnumerable<object> args)
    {
        if (args == null)
            return new object().GetHashCode();

        unchecked
        {
            return args.Aggregate(0, (current, next) => (current*419) ^ (next ?? new object()).GetHashCode());
        }
    }
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionSvishView Question on Stackoverflow
Solution 1 - C#Marc GravellView Answer on Stackoverflow
Solution 2 - C#Dustin CampbellView Answer on Stackoverflow
Solution 3 - C#Jhonny D. Cano -Leftware-View Answer on Stackoverflow
Solution 4 - C#Marc GravellView Answer on Stackoverflow
Solution 5 - C#John SaundersView Answer on Stackoverflow
Solution 6 - C#Allon GuralnekView Answer on Stackoverflow
Solution 7 - C#GrzenioView Answer on Stackoverflow
Solution 8 - C#Dustin CampbellView Answer on Stackoverflow
Solution 9 - C#D. PatrickView Answer on Stackoverflow