Java Array HashCode implementation

Tags: Java, Integer, Hashcode

Java Problem Overview


This is odd. A co-worker asked about the implementation of myArray.hashCode() in Java. I thought I knew, but then I ran a few tests. Check the code below. The odd thing I noticed is that when I wrote the first System.out.println, the results were different. It's almost as if it's reporting a memory address, and modifying the class moved the address or something. Just thought I would share.

int[] foo = new int[100000];
java.util.Random rand = new java.util.Random();

for(int a = 0; a < foo.length; a++) foo[a] = rand.nextInt();

int[] bar = new int[100000];
int[] baz = new int[100000];
int[] bax = new int[100000];
for(int a = 0; a < foo.length; a++) bar[a] = baz[a] = bax[a] = foo[a];

System.out.println(foo.hashCode() + " ----- " + bar.hashCode() + " ----- " + baz.hashCode() +  " ----- " + bax.hashCode());

// prints 4097744 ----- 328041 ----- 2083945 ----- 2438296
// consistently, unless you modify the class. Very weird.
// Before adding these comments it printed:
// 4177328 ----- 4097744 ----- 328041 ----- 2083945


System.out.println("Equal ?? " +
  (java.util.Arrays.equals(foo, bar) && java.util.Arrays.equals(bar, baz) &&
  java.util.Arrays.equals(baz, bax) && java.util.Arrays.equals(foo, bax)));

Java Solutions


Solution 1 - Java

A Java array inherits its hashCode method from Object, which means the hash code depends on the reference, not on the elements. To get a hash code based on the content of the array, use java.util.Arrays.hashCode.

Beware, though: it's a shallow hash code implementation, so nested arrays are still hashed by reference. A deep implementation is also available: Arrays.deepHashCode.
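To make the difference concrete, here is a minimal sketch comparing the inherited hash code with Arrays.hashCode and Arrays.deepHashCode. The class name ArrayHashCodeDemo and its helper methods are made up for illustration; only the java.util.Arrays calls come from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class; the name and helpers are illustrative only.
class ArrayHashCodeDemo {
    /** Content-based hash of a flat int array. */
    static int contentHash(int[] values) {
        return Arrays.hashCode(values);
    }

    /** Content-based hash that also descends into nested arrays. */
    static int deepContentHash(Object[] values) {
        return Arrays.deepHashCode(values);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};

        // Inherited from Object: identity-based, so two distinct arrays usually differ.
        System.out.println(a.hashCode() + " vs " + b.hashCode());

        // Content-based: equal because the elements are equal.
        System.out.println(contentHash(a) == contentHash(b)); // true

        int[][] nestedA = {{1, 2}, {3}};
        int[][] nestedB = {{1, 2}, {3}};

        // Shallow: each inner array is hashed by identity, so these usually differ.
        System.out.println(Arrays.hashCode(nestedA) == Arrays.hashCode(nestedB));

        // Deep: inner arrays are hashed by content, so these match.
        System.out.println(deepContentHash(nestedA) == deepContentHash(nestedB)); // true
    }
}
```

For one-dimensional primitive arrays Arrays.hashCode is enough; deepHashCode only matters once the array contains other arrays.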

Solution 2 - Java

Arrays use the default hash code, which is based on object identity. It is often described as the memory location, but it isn't necessarily the memory location, since it's only an int and all memory addresses won't fit. You can see this by also printing the result of System.identityHashCode(foo), which returns the same value as foo.hashCode().

Under equals(), arrays are only equal if they are the same, identical array object. So array hash codes will generally only be equal if they come from the same, identical array.
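A short sketch of that behavior, assuming nothing beyond the JDK. The class ArrayIdentityDemo and the method usesIdentityHash are hypothetical names chosen for this example.

```java
import java.util.Arrays;

// Hypothetical demo class; the name and helper are illustrative only.
class ArrayIdentityDemo {
    /** True when the array's hashCode() is the same value as its identity hash. */
    static boolean usesIdentityHash(int[] array) {
        return array.hashCode() == System.identityHashCode(array);
    }

    public static void main(String[] args) {
        int[] foo = {1, 2, 3};
        int[] copy = foo.clone(); // same contents, different object
        int[] alias = foo;        // same object

        System.out.println(usesIdentityHash(foo));    // true
        System.out.println(foo.equals(copy));         // false: different objects
        System.out.println(foo.equals(alias));        // true: same object
        System.out.println(Arrays.equals(foo, copy)); // true: same contents
    }
}
```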

Solution 3 - Java

The default implementation of Object.hashCode() typically derives its value from the object's identity, for example from the pointer value of the object, although this is implementation dependent. For instance, a 64-bit JVM may take the pointer and XOR the high- and low-order words together. Subclasses are encouraged to override this behavior if it makes sense.

However, content-based equality does not work well for mutable arrays. If an element changes, two arrays that were equal no longer are, and a content-based hash code would change along with them. To maintain the invariant that the same array always returns the same hashCode no matter what happens to its elements, arrays do not override the default hashCode behavior.

Note that java.util.Arrays provides hashCode() and deepHashCode() implementations for when hashing based on the contents of the array, rather than the identity of the array itself, is important.
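The invariant described above can be checked with a small sketch. The class MutationHashDemo and both helper methods are hypothetical names for illustration, and the helpers assume a non-empty array.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the name and helpers are illustrative only.
class MutationHashDemo {
    /** Mutates the first element and reports whether array.hashCode() stayed the same. */
    static boolean identityHashSurvivesMutation(int[] array) {
        int before = array.hashCode();
        array[0]++; // assumes a non-empty array
        return before == array.hashCode();
    }

    /** Mutates the first element and reports whether Arrays.hashCode(array) stayed the same. */
    static boolean contentHashSurvivesMutation(int[] array) {
        int before = Arrays.hashCode(array);
        array[0]++; // assumes a non-empty array
        return before == Arrays.hashCode(array);
    }

    public static void main(String[] args) {
        int[] key = {1, 2, 3};
        Map<int[], String> map = new HashMap<>();
        map.put(key, "value");

        key[0] = 99; // mutate the array after using it as a key

        // The identity hash is unchanged, so the lookup still succeeds.
        System.out.println(map.get(key)); // value

        System.out.println(identityHashSurvivesMutation(new int[] {1, 2, 3})); // true
        System.out.println(contentHashSurvivesMutation(new int[] {1, 2, 3}));  // false
    }
}
```

The trade-off is that a lookup with a different array holding the same elements will miss, because the map compares keys by identity.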

Solution 4 - Java

I agree with using java.util.Arrays.hashCode (or the Google Guava generic wrapper Objects.hashCode), but be aware that this can cause issues if you are using Terracotta - see [this link][1]

[1]: http://forums.terracotta.org/forums/posts/list/4379.page "this link"

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | souLTower | View Question on Stackoverflow
Solution 1 - Java | MahdeTo | View Answer on Stackoverflow
Solution 2 - Java | erickson | View Answer on Stackoverflow
Solution 3 - Java | James | View Answer on Stackoverflow
Solution 4 - Java | Carl Pritchett | View Answer on Stackoverflow