Create a hashcode of two numbers


.Net Problem Overview


I am trying to create a quick hash code function for a complex number class (a + bi) in C#.

I have repeatedly seen the a.GetHashCode() ^ b.GetHashCode() method, but this gives the same hash code for (a, b) and (b, a).

Is there a standard algorithm for this, and are there any functions in the .NET Framework to help?
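The symmetry described above is easy to confirm with a small sketch. The class name XorHashDemo and the method XorHash are hypothetical and exist only for illustration; the sketch assumes both components are doubles.

```csharp
using System;

public static class XorHashDemo
{
    // Combines the two component hashes with XOR, as in the pattern quoted above.
    // XOR is commutative, so swapping a and b cannot change the result.
    public static int XorHash(double a, double b)
    {
        return a.GetHashCode() ^ b.GetHashCode();
    }
}
```

With this helper, XorHashDemo.XorHash(1.0, 2.0) equals XorHashDemo.XorHash(2.0, 1.0), and XorHashDemo.XorHash(x, x) is 0 for every x, because any value XORed with itself is 0.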

.Net Solutions


Solution 1 - .Net

My normal way of creating a hashcode for an arbitrary set of hashable items:

int hash = 23;
hash = hash * 31 + item1Hash;
hash = hash * 31 + item2Hash;
hash = hash * 31 + item3Hash;
hash = hash * 31 + item4Hash;
hash = hash * 31 + item5Hash;
// etc

In your case item1Hash could just be a, and item2Hash could just be b.

The values of 23 and 31 are relatively unimportant, so long as they're primes (or at least coprime).

Obviously there will still be collisions, but you don't run into the normal nasty problems of:

hash(a, a) == hash(b, b)
hash(a, b) == hash(b, a)

If you know more about what the real values of a and b are likely to be you can probably do better, but this is a good initial implementation which is easy to remember and implement. Note that if there's any chance that you'll build the assembly with "check for arithmetic overflow/underflow" ticked, you should put it all in an unchecked block. (Overflow is fine for this algorithm.)
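Applied to the question's two-component case, the pattern above might look like the following sketch. The type ComplexNumber and its members A and B are hypothetical names, and the sketch assumes both parts are stored as doubles, so each part contributes its own GetHashCode() rather than the raw value.

```csharp
using System;

// Hypothetical complex-number type; A and B mirror the question's a and b.
public readonly struct ComplexNumber : IEquatable<ComplexNumber>
{
    public double A { get; }  // real part
    public double B { get; }  // imaginary part

    public ComplexNumber(double a, double b) { A = a; B = b; }

    public bool Equals(ComplexNumber other) => A.Equals(other.A) && B.Equals(other.B);

    public override bool Equals(object obj) => obj is ComplexNumber other && Equals(other);

    public override int GetHashCode()
    {
        // unchecked lets the multiplication wrap around instead of throwing
        // when the assembly is compiled with overflow checking enabled.
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + A.GetHashCode();
            hash = hash * 31 + B.GetHashCode();
            return hash;
        }
    }
}
```

Two equal values always hash identically here, while swapping the parts, as in new ComplexNumber(1, 2) versus new ComplexNumber(2, 1), will typically produce different hash codes because each part is multiplied a different number of times.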

Solution 2 - .Net

Here's a possible approach that takes into account order. (The second method is defined as an extension method.)

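public override int GetHashCode()
{
    // The method call binds tighter than ^, so only b's hash is rotated.
    return a.GetHashCode() ^ b.GetHashCode().RotateLeft(16);
}

// Extension method; it must be declared in a static, non-generic class.
public static int RotateLeft(this int value, int count)
{
    unchecked
    {
        uint bits = (uint)value;
        return (int)((bits << count) | (bits >> (32 - count)));
    }
}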

It would certainly be interesting to see how the Complex class of .NET 4.0 does it.
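On newer runtimes the rotation does not need to be written by hand. The sketch below assumes .NET Core 3.0 or later, where System.Numerics.BitOperations.RotateLeft is available; the class RotateHashDemo and the method Combine are hypothetical names used only for illustration.

```csharp
using System;
using System.Numerics;

public static class RotateHashDemo
{
    // Hypothetical helper: XORs the first hash with the second hash rotated left by 16 bits.
    public static int Combine(int firstHash, int secondHash)
    {
        // The casts are unchecked so that negative hash codes are reinterpreted, not rejected.
        uint rotated = BitOperations.RotateLeft(unchecked((uint)secondHash), 16);
        return firstHash ^ unchecked((int)rotated);
    }
}
```

For example, RotateHashDemo.Combine(1, 2) is 131073 (0x20001) while RotateHashDemo.Combine(2, 1) is 65538 (0x10002), so the order of the two hashes now matters.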

Solution 3 - .Net

One standard way is this:

int hashcode = 23;
hashcode = (hashcode * 37) + v1;
hashcode = (hashcode * 37) + v2;

23 and 37 are coprime, but you can use other numbers as well.

Solution 4 - .Net

What about this:

(a.GetHashCode() + b).GetHashCode()

This usually gives a different code for (a, b) and (b, a), and it isn't particularly fancy.

Solution 5 - .Net

@JonSkeet gives a fair, general-purpose algorithm for computing a hash code from n hash codes, but it assumes you already know which members of an object need to be hashed and what to do about null members, and it omits an implementation for n arbitrary items. So we expand upon his answer:

  1. Only public, immutable properties and fields should contribute to an object's hash code. They should be public (or isomorphic to the public surface), since two objects with the same visible surface should have the same hash code (which hints at the relationship between object equality and hash code equality). They should be immutable, since an object's hash code should never change during its lifetime (otherwise you might end up with an object in the wrong slot of a hash table!).
  2. null members should hash as a constant, such as 0.
  3. @JonSkeet's algorithm is a textbook example of the functional programming higher-order function usually called fold (Aggregate in C# LINQ), where 23 is our seed and <hash accumulator> * 31 + <current item hash> is our folding function:

In F#

let computeHashCode (items: seq<obj>) =
    items
    |> Seq.map (fun item -> if isNull item then 0 else item.GetHashCode())
    |> Seq.fold (fun hash itemHash -> hash * 31 + itemHash) 23

In C#

Func<IEnumerable<object>, int> computeHashCode = items =>
    items
    .Select(item => item == null ? 0 : item.GetHashCode())
    .Aggregate(23, (hash, itemHash) => unchecked(hash * 31 + itemHash));
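As a usage sketch, the same fold can be wrapped in a reusable helper that takes a variable number of items. The class HashCodeHelper and the method Compute are hypothetical names; the seed 23 and multiplier 31 are the values from the answer above.

```csharp
using System;
using System.Linq;

public static class HashCodeHelper
{
    // Hypothetical helper: folds the items' hash codes with seed 23 and multiplier 31.
    // Null items contribute the constant 0.
    public static int Compute(params object[] items)
    {
        return items
            .Select(item => item == null ? 0 : item.GetHashCode())
            .Aggregate(23, (hash, itemHash) => unchecked(hash * 31 + itemHash));
    }
}
```

Because an int hashes to its own value, HashCodeHelper.Compute(1, 2) is (23 * 31 + 1) * 31 + 2 = 22136, while HashCodeHelper.Compute(2, 1) is 22166, so order is taken into account. Inside a class, GetHashCode could then simply return HashCodeHelper.Compute(a, b).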

Solution 6 - .Net

It all depends on what you're trying to achieve. If the hashes are meant for hash structures like Dictionary, then you have to balance collision rate against hashing speed. A perfect hash with no collisions at all will take more time to compute. Conversely, the fastest hashing algorithms will have relatively more collisions. Finding the right balance is the key here. You should also take into consideration how large your effective hash can be, and whether the hashing should be reversible!

Noldorin's approach gives you a perfect hash (read: no collisions) if the real and imaginary parts of your complex number are always positive. It will do even for negative numbers if you're OK with the rare collision. But I'm concerned about the range of values it can yield, which is quite big for my taste.

If you're after perfect hashes (out of academic or research interest) that work even for negative numbers, you can see this solution (and an array of other solutions in the same thread). In my tests, it is faster and uses space better than any other I have seen.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | JDunkerley | View Question on Stackoverflow
Solution 1 - .Net | Jon Skeet | View Answer on Stackoverflow
Solution 2 - .Net | Noldorin | View Answer on Stackoverflow
Solution 3 - .Net | Lasse V. Karlsen | View Answer on Stackoverflow
Solution 4 - .Net | Welbog | View Answer on Stackoverflow
Solution 5 - .Net | Stephen Swensen | View Answer on Stackoverflow
Solution 6 - .Net | nawfal | View Answer on Stackoverflow