Is it a good idea to store data as keys in HashMap with empty/null values?

JavaArraysPerformanceHashmapAsymptotic Complexity

Java Problem Overview


I had originally written an ArrayList and stored unique values (usernames, i.e. Strings) in it. I later needed to use the ArrayList to search if a user existed in it. That's O(n) for the search.

My tech lead wanted me to change that to a HashMap and store the usernames as keys in the array and values as empty Strings.

So, in Java -

hashmap.put("johndoe","");

I can see if this user exists later by running -

hashmap.containsKey("johndoe"); 

This is O(1) right?

My lead said this was a more efficient way to do this and it made sense to me, but it just seemed a bit off to put null/empty as values in the hashmap and store elements in it as keys.

My question is, is this a good approach? The efficiency beats ArrayList#contains or an array search in general. It works. My worry is, I haven't seen anyone else do this after a search. I may be missing an obvious issue somewhere but I can't see it.

Java Solutions


Solution 1 - Java

Since you have a set of unique values, a Set is the appropriate data structure. You can put your values inside HashSet, an implementation of the Set interface.

> My lead said this was a more efficient way to do this and it made sense to me, but it just seemed a bit off to put null/empty as values in the hashmap and store elements in it as keys.

The advice of the lead is flawed. Map is not the right abstraction for this, Set is. A Map is appropriate for key-value pairs. But you don't have values, only keys.

Example usage:

Set<String> users = new HashSet<>(Arrays.asList("Alice", "Bob"));

System.out.println(users.contains("Alice"));
// -> prints true

System.out.println(users.contains("Jack"));
// -> prints false

Using a Map would be awkward, because what should be the type of the values? That question makes no sense in your use case, as you have just keys, not key-value pairs. With a Set, you don't need to ask that, the usage is perfectly natural.

> This is O(1) right?

Yes, searching in a HashMap or a HashSet is O(1) amortized worst case, while searching in a List or an array is O(n) worst case.


Some comments point out that a HashSet is implemented in terms of HashMap. That's fine, at that level of abstraction. At the level of abstraction of the task at hand --- to store a collection of unique usernames, using a set is a natural choice, more natural than a map.

Solution 2 - Java

This is basically how HashSet is implemented, so I guess you can say it's a good approach. You might as well use HashSet instead of your HashMap with empty values.

For example :

HashSet's implementation of add is

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

where map is the backing HashMap and PRESENT is a dummy value.

>My worry is, I haven't seen anyone else do this after a search. I may be missing an obvious issue somewhere but I can't see it.

As I mentioned, the developers of the JDK are using this same approach.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestiondozerView Question on Stackoverflow
Solution 1 - JavajanosView Answer on Stackoverflow
Solution 2 - JavaEranView Answer on Stackoverflow