Create a hashcode of two numbers
.Net Problem Overview
I am trying to create a quick hash code function for a complex number class (a + bi) in C#.
I have repeatedly seen the a.GetHashCode() ^ b.GetHashCode() method, but this gives the same hash code for (a, b) and (b, a).
Is there a standard algorithm for this, and are there any functions in the .NET Framework that help?
.Net Solutions
Solution 1 - .Net
My normal way of creating a hashcode for an arbitrary set of hashable items:
int hash = 23;
hash = hash * 31 + item1Hash;
hash = hash * 31 + item2Hash;
hash = hash * 31 + item3Hash;
hash = hash * 31 + item4Hash;
hash = hash * 31 + item5Hash;
// etc
In your case item1Hash could just be a, and item2Hash could just be b.
The values of 23 and 31 are relatively unimportant, so long as they're primes (or at least coprime).
Obviously there will still be collisions, but you don't run into the normal nasty problems of:
hash(a, a) == hash(b, b)
hash(a, b) == hash(b, a)
If you know more about what the real values of a and b are likely to be, you can probably do better, but this is a good initial implementation which is easy to remember and implement. Note that if there's any chance that you'll build the assembly with "check for arithmetic overflow/underflow" ticked, you should put it all in an unchecked block. (Overflow is fine for this algorithm.)
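As a rough illustration of how the pattern above might look inside a complete type, here is a minimal sketch. The ComplexNumber name, its Real and Imaginary properties, and the choice of double are assumptions made for the example, not something the question specifies. It feeds each component's own GetHashCode() into the 23/31 scheme and wraps the arithmetic in an unchecked block.

```csharp
using System;

// Hypothetical complex-number type; the name and members are assumptions for illustration.
public readonly struct ComplexNumber : IEquatable<ComplexNumber>
{
    public double Real { get; }
    public double Imaginary { get; }

    public ComplexNumber(double real, double imaginary)
    {
        Real = real;
        Imaginary = imaginary;
    }

    // Equality and hashing use the same two members, so equal values hash equally.
    public bool Equals(ComplexNumber other) =>
        Real.Equals(other.Real) && Imaginary.Equals(other.Imaginary);

    public override bool Equals(object obj) =>
        obj is ComplexNumber other && Equals(other);

    public override int GetHashCode()
    {
        // Overflow is expected and harmless here, so it is allowed to wrap around.
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + Real.GetHashCode();
            hash = hash * 31 + Imaginary.GetHashCode();
            return hash;
        }
    }
}
```

Because the first component is multiplied by 31 one more time than the second, swapping the two parts generally changes the result, which is the property the XOR approach lacks.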
Solution 2 - .Net
Here's a possible approach that takes into account order. (The second method is defined as an extension method.)
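public override int GetHashCode()
{
    // Rotating b's hash makes (a, b) and (b, a) produce different codes.
    return a.GetHashCode() ^ b.GetHashCode().RotateLeft(16);
}
// Extension method: must be declared inside a static class.
public static int RotateLeft(this int value, int count)
{
    // Rotate on the unsigned bit pattern so the sign bit is not smeared by the right shift.
    uint bits = unchecked((uint)value);
    return unchecked((int)((bits << count) | (bits >> (32 - count))));
}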
It would certainly be interesting to see how the Complex class of .NET 4.0 does it.
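On the question of what the framework itself offers: newer runtimes (.NET Core 2.1 and later, which is an assumption about your target) ship a System.HashCode type whose Combine method mixes several values in an order-sensitive way. The sketch below only shows that API in use; it is not a description of how Complex computes its hash, and HashPair is a hypothetical helper name.

```csharp
using System;

public static class HashCodeCombineDemo
{
    // Hypothetical helper; HashCode.Combine is the framework call doing the work.
    // Assumes a runtime that includes System.HashCode (.NET Core 2.1 or later).
    public static int HashPair(double a, double b) => HashCode.Combine(a, b);
}
```

The results are stable within one process but are seeded randomly per process, so they should not be persisted or compared across runs.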
Solution 3 - .Net
One standard way is this:
hashcode = 23
hashcode = (hashcode * 37) + v1
hashcode = (hashcode * 37) + v2
23 and 37 are coprime, but you can use other numbers as well.
Solution 4 - .Net
What about this:
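(a.GetHashCode() + b).GetHashCode()
This gives you a different code for (a, b) and (b, a), and it's not really that fancy. Note that this relies on a and b being floating-point values: for int, GetHashCode() returns the value itself, so the expression reduces to a + b, which is symmetric again.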
Solution 5 - .Net
@JonSkeet gives a fair, general-purpose algorithm for computing a hash code from n hash codes, but it assumes you already know which members of an object need to be hashed and what to do about null members, and it omits an implementation for n arbitrary items. So we expand upon his answer:
- Only public, immutable properties and fields should contribute to an object's hash code. They should be public (or isomorphic to the public ones) since we should be able to count on two objects with the same visible surface having the same hash code (hinting at the relationship between object equality and hash code equality), and they should be immutable since an object's hash code should never change during its lifetime (otherwise you might end up with an object in the wrong slot of a hash table!).
- null members should hash as a constant, such as 0.
- @JonSkeet's algorithm is a textbook example of applying the functional-programming higher-order function usually called fold (Aggregate in C# LINQ), where 23 is our seed and <hash accumulator> * 31 + <current item hash> is our folding function:
In F#
let computeHashCode items =
items
|> Seq.map (fun item -> if item = null then 0 else item.GetHashCode())
|> Seq.fold (fun hash itemHash -> hash * 31 + itemHash) 23
In C#
Func<IEnumerable<Object>, int> computeHashCode = items =>
items
.Select(item => item == null ? 0 : item.GetHashCode())
.Aggregate(23, (hash, itemHash) => hash * 31 + itemHash);
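For completeness, the same fold could also be packaged as an ordinary static method. This is only a sketch: HashHelper and ComputeHashCode are hypothetical names, and the unchecked expression is added on the assumption that the project might be compiled with overflow checking enabled.

```csharp
using System;
using System.Linq;

public static class HashHelper
{
    // Hypothetical helper: folds the items' hash codes with seed 23 and multiplier 31.
    public static int ComputeHashCode(params object[] items)
    {
        return items
            // null members hash as the constant 0.
            .Select(item => item == null ? 0 : item.GetHashCode())
            // unchecked: overflow wraps around even if overflow checking is enabled.
            .Aggregate(23, (hash, itemHash) => unchecked(hash * 31 + itemHash));
    }
}
```

With int arguments, HashHelper.ComputeHashCode(1, 2) evaluates to (23 * 31 + 1) * 31 + 2 = 22136, while HashHelper.ComputeHashCode(2, 1) gives 22166, so order is taken into account. Note that params object[] boxes value types, which may matter on hot paths.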
Solution 6 - .Net
It all depends on what you're trying to achieve. If the hashes are meant for hash structures like Dictionary, then you have to balance collision rate against hashing speed. A perfect hash with no collisions at all takes more time to compute; conversely, the fastest hashing algorithms tend to have more collisions. Finding the right balance is the key. You should also consider how large your effective hash can be, and whether hashing should be reversible. Noldorin's approach gives you a perfect hash (read: no collisions) if the real and imaginary parts of your complex number are always positive. It will do even for negative numbers if you're OK with rare collisions. But I'm concerned about the range of values it can yield, which is quite big for my taste.
If you're after perfect hashes (out of academic or research interest) that work even for negative numbers, you can see this solution (and an array of other solutions in the same thread). In my tests, it is faster and uses space better than any other I have seen.
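To make the trade-off between collisions, output range, and reversibility a bit more concrete, here is a minimal sketch of one simple collision-free scheme: packing two 32-bit integers into a single 64-bit value. This is neither Noldorin's approach nor the solution referred to above; it is a plain bit-packing technique, and the PairPacking, Pack, and Unpack names are hypothetical. It also assumes integer components, and the result is a long, so it cannot be returned from GetHashCode() directly.

```csharp
public static class PairPacking
{
    // Hypothetical helper: a goes into the high 32 bits, b into the low 32 bits.
    // Every distinct (a, b) pair maps to a distinct long, including negative values.
    public static long Pack(int a, int b) =>
        unchecked(((long)a << 32) | (uint)b);

    // Reverses Pack, recovering the original pair.
    public static (int A, int B) Unpack(long packed)
    {
        unchecked
        {
            return ((int)(packed >> 32), (int)packed);
        }
    }
}
```

The price of having no collisions is the doubled output width, which is the range concern raised above.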