Should you check if the map containsKey before using ConcurrentMap's putIfAbsent

Java, Performance, Concurrency, ConcurrentHashMap

Java Problem Overview


I have been using Java's ConcurrentMap for a map that can be used from multiple threads. putIfAbsent is a great method and is much easier to read and write than the equivalent standard map operations. I have some code that looks like this:

ConcurrentMap<String, Set<X>> map = new ConcurrentHashMap<String, Set<X>>();

// ...

map.putIfAbsent(name, new HashSet<X>());
map.get(name).add(Y);

Readability-wise this is great, but it creates a new HashSet on every call, even if the key is already in the map. I could write this:

if (!map.containsKey(name)) {
    map.putIfAbsent(name, new HashSet<X>());
}
map.get(name).add(Y);

With this change it loses a bit of readability but does not need to create the HashSet every time. Which is better in this case? I tend to side with the first one since it is more readable. The second would perform better and may be more correct. Maybe there is a better way to do this than either of these.

What is the best practice for using a putIfAbsent in this manner?
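To make the trade-off concrete, here is a minimal single-threaded sketch that counts how many sets each variant creates. The class name AllocationCount, the counter, and the newSet() helper are hypothetical and exist only for this illustration; the inner HashSet is only safe here because a single thread touches it.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class AllocationCount {
    // Hypothetical counter, used only to observe how many sets get created.
    static final AtomicInteger created = new AtomicInteger();

    static Set<String> newSet() {
        created.incrementAndGet();
        return new HashSet<>();
    }

    // Variant 1: unconditional putIfAbsent, so a set is allocated on every call.
    static int unconditional(int calls) {
        ConcurrentMap<String, Set<String>> map = new ConcurrentHashMap<>();
        created.set(0);
        for (int i = 0; i < calls; i++) {
            map.putIfAbsent("name", newSet());
            map.get("name").add("y" + i);
        }
        return created.get();
    }

    // Variant 2: containsKey guard, so a set is allocated only while the key is missing.
    static int guarded(int calls) {
        ConcurrentMap<String, Set<String>> map = new ConcurrentHashMap<>();
        created.set(0);
        for (int i = 0; i < calls; i++) {
            if (!map.containsKey("name")) {
                map.putIfAbsent("name", newSet());
            }
            map.get("name").add("y" + i);
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("unconditional: " + unconditional(1000)); // unconditional: 1000
        System.out.println("guarded: " + guarded(1000));             // guarded: 1
    }
}
```

With several threads the guarded variant can still allocate more than one set for a key, because another thread may insert between containsKey and putIfAbsent; the putIfAbsent call keeps the map itself consistent in that case.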

Java Solutions


Solution 1 - Java

Concurrency is hard. If you are going to bother with concurrent maps instead of straightforward locking, you might as well go for it. Indeed, don't do lookups more than necessary.

Set<X> set = map.get(name);
if (set == null) {
    final Set<X> value = new HashSet<X>();
    set = map.putIfAbsent(name, value);
    if (set == null) {
        set = value;
    }
}

(Usual stackoverflow disclaimer: Off the top of my head. Not tested. Not compiled. Etc.)

Update: Java 8 added a computeIfAbsent default method to ConcurrentMap (and to Map, which is kind of interesting, because that default implementation would be wrong for a ConcurrentMap). (And Java 7 added the "diamond operator" <>.)

Set<X> set = map.computeIfAbsent(name, n -> new HashSet<>());

(Note: you are responsible for the thread-safety of any operations on the HashSets contained in the ConcurrentMap.)
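One way to address the note above is to store a concurrent set as the value, so that add() needs no extra locking. The following is a minimal sketch, assuming Java 8+ and String keys and values; the class name ConcurrentMultimapSketch and its methods are hypothetical, while ConcurrentHashMap.newKeySet() is the standard JDK factory for a concurrent Set.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ConcurrentMultimapSketch {
    private final ConcurrentMap<String, Set<String>> map = new ConcurrentHashMap<>();

    // Adds value under name. The per-key set is itself concurrent,
    // so no synchronized block is needed around add().
    boolean add(String name, String value) {
        return map.computeIfAbsent(name, n -> ConcurrentHashMap.newKeySet()).add(value);
    }

    int size(String name) {
        Set<String> set = map.get(name);
        return set == null ? 0 : set.size();
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentMultimapSketch m = new ConcurrentMultimapSketch();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                // Each thread adds 1000 distinct values under the same key.
                for (int i = 0; i < 1000; i++) {
                    m.add("name", id + ":" + i);
                }
            });
            threads[t].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(m.size("name")); // 4000
    }
}
```

A plain HashSet guarded by synchronized would also work; the concurrent set just keeps the locking out of the calling code.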

Solution 2 - Java

Tom's answer is correct as far as API usage goes for ConcurrentMap. An alternative that avoids using putIfAbsent is to use the computing map from the Google Collections/Guava MapMaker, which auto-populates the values with a supplied function and handles all the thread-safety for you. It creates only one value per key, and if the create function is expensive, other threads asking for the same key will block until the value becomes available.

Edit: as of Guava 11, the computing-map functionality of MapMaker is deprecated and is being replaced by the Cache/LocalCache/CacheBuilder stuff. This is a little more complicated to use but basically isomorphic.

Solution 3 - Java

You can use MutableMap.getIfAbsentPut(K, Function0<? extends V>) from Eclipse Collections (formerly GS Collections).

The advantage over calling get(), doing a null check, and then calling putIfAbsent() is that we'll only compute the key's hashCode once, and find the right spot in the hashtable once. In ConcurrentMaps like org.eclipse.collections.impl.map.mutable.ConcurrentHashMap, the implementation of getIfAbsentPut() is also thread-safe and atomic.

import org.eclipse.collections.impl.map.mutable.ConcurrentHashMap;
...
ConcurrentHashMap<String, MyObject> map = new ConcurrentHashMap<>();
map.getIfAbsentPut("key", () -> someExpensiveComputation());

The implementation of org.eclipse.collections.impl.map.mutable.ConcurrentHashMap is truly non-blocking. While every effort is made not to call the factory function unnecessarily, there's still a chance it will be called more than once during contention.

This fact sets it apart from Java 8's ConcurrentHashMap.computeIfAbsent(K, Function<? super K,? extends V>). The Javadoc for this method states:

> The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple...

Note: I am a committer for Eclipse Collections.
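The at-most-once behaviour described in the Javadoc quoted above can be observed with a small race on java.util.concurrent.ConcurrentHashMap. This is only a sketch: the class name ComputeOnceSketch and the method raceOnOneKey are hypothetical, and the latch is there just to make the threads start at roughly the same time.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeOnceSketch {
    // Returns how often the mapping function ran when several threads race on one key.
    static int raceOnOneKey(int threads) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        AtomicInteger factoryCalls = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // wait until all workers have been started
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                map.computeIfAbsent("key", k -> {
                    factoryCalls.incrementAndGet();
                    return "value";
                });
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return factoryCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(raceOnOneKey(8)); // 1
    }
}
```

With a get()/putIfAbsent() style implementation the same counter could end up above 1 under contention, which is the trade-off between the two approaches.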

Solution 4 - Java

By keeping a pre-initialized value for each thread you can improve on the accepted answer:

Set<X> initial = new HashSet<X>();
...
Set<X> set = map.putIfAbsent(name, initial);
if (set == null) {
    set = initial;
    initial = new HashSet<X>();
}
set.add(Y);

I recently used this with AtomicInteger map values rather than Set.
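As a rough illustration of the AtomicInteger variant, the sketch below keeps one spare counter per worker object and shares only the map. The class name SpareCounterSketch and its increment() method are hypothetical, and the sketch assumes each instance is used by a single thread so that the spare is never shared.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class SpareCounterSketch {
    private final ConcurrentMap<String, AtomicInteger> counts; // shared between threads
    private AtomicInteger spare = new AtomicInteger();         // owned by one thread only

    SpareCounterSketch(ConcurrentMap<String, AtomicInteger> counts) {
        this.counts = counts;
    }

    int increment(String name) {
        AtomicInteger counter = counts.putIfAbsent(name, spare);
        if (counter == null) {
            counter = spare;             // our spare was installed in the map
            spare = new AtomicInteger(); // replace it for the next new key
        }
        return counter.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, AtomicInteger> shared = new ConcurrentHashMap<>();
        Runnable work = () -> {
            // Each thread builds its own instance, and therefore its own spare.
            SpareCounterSketch local = new SpareCounterSketch(shared);
            for (int i = 0; i < 1000; i++) {
                local.increment("a");
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(shared.get("a").get()); // 2000
    }
}
```

A thread that loses the putIfAbsent race simply keeps its spare for the next key, so a new AtomicInteger is created only after one has actually been installed.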

Solution 5 - Java

In 5+ years, I can't believe no one has mentioned or posted a solution that uses ThreadLocal to solve this problem; and several of the solutions on this page are not threadsafe and are just sloppy.

Using ThreadLocals for this specific problem is not only considered a best practice for concurrency, it also minimizes garbage/object creation during thread contention. It also makes for very clean code.

For example:

private final ThreadLocal<HashSet<X>> threadCache =
    new ThreadLocal<HashSet<X>>() {
        @Override
        protected HashSet<X> initialValue() {
            return new HashSet<X>();
        }
    };

private final ConcurrentMap<String, Set<X>> map =
    new ConcurrentHashMap<String, Set<X>>();

And the actual logic...

// minimize object creation during thread contention
final Set<X> cached = threadCache.get();

Set<X> data = map.putIfAbsent("foo", cached);
if (data == null) {
    // reset the cached value in the ThreadLocal
    threadCache.set(new HashSet<X>());
    data = cached;
}

// make sure that the access to the set is thread safe
synchronized(data) {
    data.add(object);
}
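For reference, the same idea can be written as one compilable class. This is a sketch assuming Java 8+ (for ThreadLocal.withInitial) and String elements; the class name ThreadLocalSpareSketch and the add() and size() methods are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ThreadLocalSpareSketch {
    // Each thread lazily gets its own spare set.
    private final ThreadLocal<Set<String>> spare = ThreadLocal.withInitial(HashSet::new);
    private final ConcurrentMap<String, Set<String>> map = new ConcurrentHashMap<>();

    void add(String name, String value) {
        Set<String> cached = spare.get();
        Set<String> data = map.putIfAbsent(name, cached);
        if (data == null) {
            // Our spare is now in the map, so this thread needs a fresh one.
            spare.set(new HashSet<>());
            data = cached;
        }
        // HashSet is not thread-safe, so guard every access to it.
        synchronized (data) {
            data.add(value);
        }
    }

    int size(String name) {
        Set<String> data = map.get(name);
        if (data == null) {
            return 0;
        }
        synchronized (data) {
            return data.size();
        }
    }

    public static void main(String[] args) {
        ThreadLocalSpareSketch sketch = new ThreadLocalSpareSketch();
        sketch.add("foo", "a");
        sketch.add("foo", "b");
        System.out.println(sketch.size("foo")); // 2
    }
}
```

Any code that reads the sets also has to synchronize on them, as size() does here.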

Solution 6 - Java

My generic approximation:

public class ConcurrentHashMapWithInit<K, V> extends ConcurrentHashMap<K, V> {
  private static final long serialVersionUID = 42L;

  public V initIfAbsent(final K key) {
    V value = get(key);
    if (value == null) {
      value = initialValue();
      final V x = putIfAbsent(key, value);
      value = (x != null) ? x : value;
    }
    return value;
  }

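  // Subclasses must override this: ConcurrentHashMap rejects null values,
  // so the default would make putIfAbsent throw a NullPointerException.
  protected V initialValue() {
    return null;
  }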
}

An example of use:

public static void main(final String[] args) throws Throwable {
  ConcurrentHashMapWithInit<String, HashSet<String>> map = 
        new ConcurrentHashMapWithInit<String, HashSet<String>>() {
    private static final long serialVersionUID = 42L;

    @Override
    protected HashSet<String> initialValue() {
      return new HashSet<String>();
    }
  };
  map.initIfAbsent("s1").add("chao");
  map.initIfAbsent("s2").add("bye");
  System.out.println(map.toString());
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Chris Dail | View Question on Stackoverflow
Solution 1 - Java | Tom Hawtin - tackline | View Answer on Stackoverflow
Solution 2 - Java | Jed Wesley-Smith | View Answer on Stackoverflow
Solution 3 - Java | Craig P. Motlin | View Answer on Stackoverflow
Solution 4 - Java | karmakaze | View Answer on Stackoverflow
Solution 5 - Java | Nathan | View Answer on Stackoverflow
Solution 6 - Java | ggrandes | View Answer on Stackoverflow