Would a Java HashSet<String>'s contains() method test equality of the strings or object identity?
Java Problem Overview
Let's say I have this code in Java:
HashSet<String> wordSet = new HashSet<String>();
String a = "hello";
String b = "hello";
wordSet.add(a);
Would wordSet.contains(b) return true or false? From what I understand, a and b refer to different objects even though their values are the same, so contains() should return false. However, when I run this code, it returns true. Will it always return true no matter where the String object b is coming from, as long as b contains the value "hello"? Am I guaranteed this always? If not, when am I not guaranteed this? And what if I wanted to do something similar with objects other than Strings?
Java Solutions
Solution 1 - Java
It uses equals() to compare the data. Below is from the javadoc for Set:
> adds the specified element e to this set if the set contains no element e2 such that (e==null ? e2==null : e.equals(e2)).
The equals() method for String does a character-by-character comparison. From the javadoc for String:
> The result is true if and only if the argument is not null and is a String object that represents the same sequence of characters as this object
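To see this with strings that are definitely distinct objects, here is a minimal sketch. The class name ContainsByValueDemo and the helper containsAfterAdd are made up for illustration; new String(...) and StringBuilder are used only to force objects that are not the interned literal.

```java
import java.util.HashSet;
import java.util.Set;

class ContainsByValueDemo {
    // Hypothetical helper: adds `stored` to a fresh set, then looks up `probe`.
    static boolean containsAfterAdd(String stored, String probe) {
        Set<String> wordSet = new HashSet<>();
        wordSet.add(stored);
        return wordSet.contains(probe);
    }

    public static void main(String[] args) {
        String a = "hello";
        // new String(...) always creates a distinct object with the same characters.
        String b = new String("hello");
        // Built at run time, so it is not the literal object either.
        String c = new StringBuilder("hel").append("lo").toString();

        System.out.println(a == b);                 // false: different objects
        System.out.println(a.equals(b));            // true: same characters
        System.out.println(containsAfterAdd(a, b)); // true
        System.out.println(containsAfterAdd(a, c)); // true
    }
}
```

So the lookup succeeds regardless of where the second String came from, as long as its characters match.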
Solution 2 - Java
Actually, HashSet does neither.
Its implementation uses a HashMap, and here's the relevant code that determines whether the set contains() an element (actually it's inside HashMap's getEntry() method):
if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
which:
- requires the hashes to be equal, and
- requires either object identity (==) or that equals() returns true
The answer is "yes": wordSet.contains(b) will always return true.
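For objects other than Strings, both conditions matter. The sketch below uses a hypothetical Key class (not from the JDK) whose hash code is passed in by the caller, so it can deliberately break the equals()/hashCode() contract and show what happens to contains().

```java
import java.util.HashSet;
import java.util.Set;

class HashContractDemo {
    // Hypothetical key type: equality is based on the name only, while the
    // hash code is supplied by the caller so it can be made inconsistent on purpose.
    static final class Key {
        final String name;
        final int hash;

        Key(String name, int hash) {
            this.name = name;
            this.hash = hash;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return hash;
        }
    }

    // Adds `stored` to a fresh HashSet, then looks up `probe`.
    static boolean lookup(Key stored, Key probe) {
        Set<Key> set = new HashSet<>();
        set.add(stored);
        return set.contains(probe);
    }

    public static void main(String[] args) {
        Key stored = new Key("hello", 1);
        System.out.println(lookup(stored, new Key("hello", 1))); // true: same hash, equals() is true
        System.out.println(lookup(stored, new Key("hello", 2))); // false: equals() is true, but hashes differ
        System.out.println(lookup(stored, new Key("other", 1))); // false: same hash, but equals() is false
    }
}
```

String overrides both equals() and hashCode() consistently, which is why the String case always works. A custom class needs to do the same to behave well in a HashSet.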
Solution 3 - Java
Actually, both a and b refer to the same object, because string literals in Java are automatically interned.
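A small sketch of the interning behaviour, with the class name InternDemo and the helper sameObject invented for illustration. It compares references with == on purpose, which is normally the wrong way to compare strings.

```java
class InternDemo {
    // Hypothetical helper: true only if both references point to the same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = "hello";
        String c = new String("hello");

        System.out.println(sameObject(a, b));          // true: both literals are the same interned object
        System.out.println(sameObject(a, c));          // false: new String(...) creates a separate object
        System.out.println(sameObject(a, c.intern())); // true: intern() returns the pooled instance
    }
}
```

So in the question's code a and b happen to be the same object, but contains() would still return true if they were not.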
Solution 4 - Java
Two things:
- A set would be pretty useless unless it called the equals() method to determine equality. wordSet.contains(b) will return true because a.equals(b) == true.
- You cannot be totally sure that a and b are pointing to different objects. Check out String.intern() for more details.
Solution 5 - Java
Ultimately, contains() relies on the equals() method rather than on object identity alone. When the hashes match and the two references are not the same object, equals() is called as part of the contains() call.
This is the call structure of the contains() method:
// From java.util.HashSet
private transient HashMap<E,Object> map;

public boolean contains(Object o) {
    return map.containsKey(o);
}

// From java.util.HashMap (older JDK source)
public boolean containsKey(Object key) {
    return getEntry(key) != null;
}

final Entry<K,V> getEntry(Object key) {
    int hash = (key == null) ? 0 : hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}
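One way to observe the short-circuiting in that condition is to count equals() calls. This is only a sketch: the Word class and the counter are hypothetical, and the exact call counts are a property of OpenJDK's HashMap implementation, not something the Set contract promises.

```java
import java.util.HashSet;
import java.util.Set;

class EqualsCallDemo {
    // Counts how many times Word.equals() has been invoked.
    static int equalsCalls = 0;

    // Hypothetical value type with consistent equals() and hashCode().
    static final class Word {
        final String text;

        Word(String text) {
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Word && ((Word) o).text.equals(text);
        }

        @Override
        public int hashCode() {
            return text.hashCode();
        }
    }

    // Returns the number of equals() calls made by a single contains(probe).
    static int callsDuringContains(Word stored, Word probe) {
        Set<Word> set = new HashSet<>();
        set.add(stored);
        int before = equalsCalls;
        set.contains(probe);
        return equalsCalls - before;
    }

    public static void main(String[] args) {
        Word w = new Word("hello");
        System.out.println(callsDuringContains(w, w));                 // 0: same reference, == short-circuits
        System.out.println(callsDuringContains(w, new Word("hello"))); // 1: same hash, different object
        System.out.println(callsDuringContains(w, new Word("world"))); // 0: hashes differ, equals() is skipped
    }
}
```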
Solution 6 - Java
Equality. In your example, contains() returns true because the HashSet checks a.equals(b).