Ignore duplicates when producing map using streams
Java Problem Overview
Map<String, String> phoneBook = people.stream()
.collect(toMap(Person::getName,
Person::getAddress));
I get java.lang.IllegalStateException: Duplicate key when a duplicated element is found.
Is it possible to ignore this exception when adding values to the map?
When there is a duplicate, the collector should simply continue and ignore that duplicate key.
Java Solutions
Solution 1 - Java
This is possible using the mergeFunction parameter of Collectors.toMap(keyMapper, valueMapper, mergeFunction):
Map<String, String> phoneBook =
people.stream()
.collect(Collectors.toMap(
Person::getName,
Person::getAddress,
(address1, address2) -> {
System.out.println("duplicate key found!");
return address1;
}
));
mergeFunction is a function that operates on two values associated with the same key. address1 is the first address encountered for that key while collecting elements, and address2 is the second. This lambda simply keeps the first address and ignores the second.
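To make the behavior concrete, here is a minimal runnable sketch of the keep-first merge function. The Person class and the sample data are hypothetical stand-ins for the ones in the question, not the asker's actual code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class KeepFirstDemo {

    // Hypothetical stand-in for the Person class used in the question.
    static final class Person {
        private final String name;
        private final String address;

        Person(String name, String address) {
            this.name = name;
            this.address = address;
        }

        String getName() { return name; }
        String getAddress() { return address; }
    }

    // Builds the phone book; when a name repeats, the first address is kept.
    static Map<String, String> toPhoneBook(List<Person> people) {
        return people.stream()
                .collect(Collectors.toMap(
                        Person::getName,
                        Person::getAddress,
                        (first, second) -> first));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Alice", "1 Main St"),
                new Person("Bob", "2 Oak Ave"),
                new Person("Alice", "3 Elm Rd"));
        // "Alice" appears twice; the merge function keeps the earlier address.
        System.out.println(toPhoneBook(people).get("Alice")); // prints "1 Main St"
    }
}
```

Returning second instead of first from the merge function would keep the last address seen for each name.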
Solution 2 - Java
As the Javadoc says:
> If the mapped keys contains duplicates (according to Object.equals(Object)), an IllegalStateException is thrown when the collection operation is performed. If the mapped keys may have duplicates, use toMap(Function keyMapper, Function valueMapper, BinaryOperator mergeFunction) instead.
So you should use toMap(Function keyMapper, Function valueMapper, BinaryOperator mergeFunction) instead. Just provide a merge function that decides which of the duplicate values is put in the map.
For example, if you don't care which one, just keep the first:
Map<String, String> phoneBook =
people.stream()
.collect(Collectors.toMap(Person::getName,
Person::getAddress,
(a1, a2) -> a1));
Solution 3 - Java
The answer from alaster helped me a lot, but I would like to add something useful for anyone who is trying to group the data.
If you have, for example, several Orders with the same code but a different quantity of products in each one, and you want to sum the quantities, you can do the following:
List<Order> listOrders = new ArrayList<>();
listOrders.add(new Order("COD_1", 1L));
listOrders.add(new Order("COD_1", 5L));
listOrders.add(new Order("COD_1", 3L));
listOrders.add(new Order("COD_2", 3L));
listOrders.add(new Order("COD_3", 4L));
// Quantities of orders that share a code are added together.
Map<String, Long> quantityByCode = listOrders.stream()
    .collect(Collectors.toMap(Order::getCode,
        o -> o.getQuantity(),
        (q1, q2) -> q1 + q2));
Result:
{COD_3=4, COD_2=3, COD_1=9}
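The summing idea above can be sketched as a self-contained program. The Order class below is a hypothetical minimal version with only a code and a quantity. The second method uses groupingBy with summingLong, which should give the same totals and is shown only for comparison.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class SumQuantitiesDemo {

    // Hypothetical Order with just the two fields the answer relies on.
    static final class Order {
        private final String code;
        private final long quantity;

        Order(String code, long quantity) {
            this.code = code;
            this.quantity = quantity;
        }

        String getCode() { return code; }
        long getQuantity() { return quantity; }
    }

    // toMap with a merge function that adds the quantities of colliding keys.
    static Map<String, Long> totalsViaToMap(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.toMap(Order::getCode, Order::getQuantity, Long::sum));
    }

    // Alternative formulation: group by code, then sum within each group.
    static Map<String, Long> totalsViaGroupingBy(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(Order::getCode,
                        Collectors.summingLong(Order::getQuantity)));
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("COD_1", 1L),
                new Order("COD_1", 5L),
                new Order("COD_1", 3L),
                new Order("COD_2", 3L),
                new Order("COD_3", 4L));
        System.out.println(totalsViaToMap(orders).get("COD_1")); // prints 9
    }
}
```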
Or, following the example in the Javadoc, you can combine the addresses:
Map<String, String> phoneBook =
people.stream().collect(toMap(Person::getName,
Person::getAddress,
(s, a) -> s + ", " + a));
Solution 4 - Java
For grouping by objects:
Map<Integer, Data> dataMap = dataList.stream()
    .collect(Collectors.toMap(Data::getId, data -> data, (data1, data2) -> {
        LOG.info("Duplicate Group For :" + data2.getId());
        return data1; // keep the first Data seen for this id
    }));
Solution 5 - Java
For anyone else hitting this exception without any duplicate keys in the data being streamed: make sure your keyMapper function isn't returning null values.
It's very annoying to track down, because when the second element is processed the exception says "Duplicate key 1", where 1 is actually the value of the entry rather than the key.
In my case, the keyMapper function looked up values in a different map, but because of a typo in the strings it was returning null values.
final Map<String, String> doop = new HashMap<>();
doop.put("a", "1");
doop.put("b", "2");
final Map<String, String> lookup = new HashMap<>();
lookup.put("c", "e");
lookup.put("d", "f");
doop.entrySet().stream().collect(Collectors.toMap(e -> lookup.get(e.getKey()), e -> e.getValue()));
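One way to make this failure easier to diagnose is to reject null keys explicitly inside the keyMapper. The sketch below is only an illustration under that assumption. The class name, the translateKeys helper, and the error message are hypothetical. Objects.requireNonNull is the standard JDK method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

final class NullKeyGuardDemo {

    // Re-keys `source` through `lookup`; fails fast with a descriptive message
    // when the lookup has no entry, instead of silently producing a null key.
    static Map<String, String> translateKeys(Map<String, String> source,
                                             Map<String, String> lookup) {
        return source.entrySet().stream()
                .collect(Collectors.toMap(
                        e -> Objects.requireNonNull(
                                lookup.get(e.getKey()),
                                "no lookup entry for key: " + e.getKey()),
                        Map.Entry::getValue));
    }

    public static void main(String[] args) {
        Map<String, String> source = new HashMap<>();
        source.put("a", "1");
        source.put("b", "2");

        Map<String, String> lookup = new HashMap<>();
        lookup.put("a", "x");
        lookup.put("b", "y");

        System.out.println(translateKeys(source, lookup).get("x")); // prints 1
    }
}
```

With a lookup map that is missing an entry, this throws a NullPointerException naming the offending key rather than a misleading duplicate-key error.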
Solution 6 - Java
It feels like toMap working often, but not always, is a dark underbelly of Java streams. They should have called it toUniqueMap or something...
The easiest way is to use Collectors.groupingBy instead of Collectors.toMap.
By default each map value is a List, but the collision problem is gone, and that may be what you want in the presence of multiples anyway.
Map<String, List<Person>> phoneBook = people.stream()
.collect(groupingBy((x) -> x.name));
If you want a Set of the addresses associated with a particular name, groupingBy can do that as well:
Map<String, Set<String>> phoneBook = people.stream()
.collect(groupingBy((x) -> x.name, mapping((x) -> x.address, toSet())));
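The two groupingBy variants above can be tried with a small self-contained sketch. The Person class and the sample addresses are hypothetical, and getters are used here instead of direct field access.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class GroupingDemo {

    // Hypothetical stand-in for the Person class in the question.
    static final class Person {
        private final String name;
        private final String address;

        Person(String name, String address) {
            this.name = name;
            this.address = address;
        }

        String getName() { return name; }
        String getAddress() { return address; }
    }

    // Every address per name, duplicates included, in encounter order.
    static Map<String, List<String>> addressesByName(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(Person::getName,
                        Collectors.mapping(Person::getAddress, Collectors.toList())));
    }

    // Distinct addresses per name.
    static Map<String, Set<String>> uniqueAddressesByName(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(Person::getName,
                        Collectors.mapping(Person::getAddress, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Alice", "1 Main St"),
                new Person("Alice", "3 Elm Rd"),
                new Person("Bob", "2 Oak Ave"));
        System.out.println(addressesByName(people).get("Alice")); // prints [1 Main St, 3 Elm Rd]
    }
}
```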
The other way is to "start" with either a Hash or a Set...And carefully track through to make sure the keys never duplicate in the output stream. Ugh. Here's an example that happens to survive this...sometimes...
Solution 7 - Java
I have run into this problem when grouping objects, and I always resolve it in a simple way: apply a custom filter backed by a java.util.Set to drop duplicate objects, based on whatever attribute you choose, as below:
Set<String> uniqueNames = new HashSet<>();
Map<String, String> phoneBook = people
.stream()
.filter(person -> person != null && uniqueNames.add(person.getName())) // add() returns true only for a name not seen before
.collect(toMap(Person::getName, Person::getAddress));
Hope this helps anyone having the same problem !
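Here is a runnable sketch of the Set-based filter. It relies on Set.add returning true only the first time a value is added, so the filter keeps the first person seen for each name. The Person class is a hypothetical stand-in. Note that the filter is stateful, so this sketch assumes a sequential stream.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

final class SeenSetFilterDemo {

    // Hypothetical stand-in for the Person class in the question.
    static final class Person {
        private final String name;
        private final String address;

        Person(String name, String address) {
            this.name = name;
            this.address = address;
        }

        String getName() { return name; }
        String getAddress() { return address; }
    }

    static Map<String, String> toPhoneBook(List<Person> people) {
        Set<String> seenNames = new HashSet<>();
        return people.stream()                            // sequential on purpose
                .filter(Objects::nonNull)                 // skip null elements
                .filter(p -> seenNames.add(p.getName()))  // true only for a new name
                .collect(Collectors.toMap(Person::getName, Person::getAddress));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Alice", "1 Main St"),
                null,
                new Person("Alice", "3 Elm Rd"));
        System.out.println(toPhoneBook(people).get("Alice")); // prints "1 Main St"
    }
}
```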
Solution 8 - Java
For completeness, here's how to "reduce" duplicates down to just one.
If you are OK with the last:
Map<String, Person> phoneBook = people.stream()
.collect(groupingBy(x -> x.name, reducing(null, identity(), (first, last) -> last)));
If you want only the first:
Map<String, Person> phoneBook = people.stream()
.collect(groupingBy(x -> x.name, reducing(null, identity(), (first, last) -> first != null ? first : last)));
And if you want the last, but with the address as a String (the mapper is x -> x.address instead of identity()):
Map<String, String> phoneBook = people.stream()
.collect(groupingBy(x -> x.name, reducing(null, x -> x.address, (first, last) -> last)));
So in essence, groupingBy paired with a reducing collector starts to behave very similarly to the toMap collector: the reducing operator plays a role similar to toMap's mergeFunction, and the end result is identical.
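As a quick check of that equivalence, the sketch below builds the "last wins" map both ways so the results can be compared. The Person class and method names are hypothetical, and the explicit type arguments on reducing are spelled out here only to make the types visible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class ReducingVsToMapDemo {

    // Hypothetical Person with only the fields this sketch needs.
    static final class Person {
        final String name;
        final String address;

        Person(String name, String address) {
            this.name = name;
            this.address = address;
        }
    }

    // Last person per name, via groupingBy + reducing.
    static Map<String, Person> lastViaReducing(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(
                        p -> p.name,
                        Collectors.<Person, Person>reducing(
                                null, Function.identity(), (first, last) -> last)));
    }

    // Last person per name, via toMap with a merge function.
    static Map<String, Person> lastViaToMap(List<Person> people) {
        return people.stream()
                .collect(Collectors.toMap(
                        p -> p.name, Function.identity(), (first, last) -> last));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Alice", "1 Main St"),
                new Person("Bob", "2 Oak Ave"),
                new Person("Alice", "3 Elm Rd"));
        System.out.println(lastViaReducing(people).get("Alice").address); // prints "3 Elm Rd"
        System.out.println(lastViaToMap(people).get("Alice").address);    // prints "3 Elm Rd"
    }
}
```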
Solution 9 - Java
You can use a lambda function: two blogs count as duplicates when key(...) returns the same string for both.
List<Blog> blogsNoDuplicates = blogs.stream()
.collect(toMap(b->key(b), b->b, (b1, b2) -> b1)) // key(b) is the criterion for duplicate elimination; the first blog per key is kept
.values().stream().collect(Collectors.toList());
static String key(Blog b){
return b.getTitle()+b.getAuthor(); // make key as criteria of distinction
}
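A self-contained sketch of this key-based de-duplication follows. The Blog class is hypothetical. Two details are choices of this sketch rather than part of the answer above: a separator is placed between title and author so that different pairs are less likely to produce the same concatenated key, and a LinkedHashMap is supplied so the surviving blogs keep their original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class DedupByKeyDemo {

    // Hypothetical Blog with only a title and an author.
    static final class Blog {
        private final String title;
        private final String author;

        Blog(String title, String author) {
            this.title = title;
            this.author = author;
        }

        String getTitle() { return title; }
        String getAuthor() { return author; }
    }

    // Key used as the criterion of distinction (separator is an assumption).
    static String key(Blog b) {
        return b.getTitle() + "|" + b.getAuthor();
    }

    // Keeps the first blog for every key, preserving encounter order.
    static List<Blog> withoutDuplicates(List<Blog> blogs) {
        Map<String, Blog> byKey = blogs.stream()
                .collect(Collectors.toMap(
                        DedupByKeyDemo::key,
                        Function.identity(),
                        (b1, b2) -> b1,
                        LinkedHashMap::new));
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<Blog> blogs = Arrays.asList(
                new Blog("Streams", "Ann"),
                new Blog("Streams", "Ann"),
                new Blog("Streams", "Ben"));
        System.out.println(withoutDuplicates(blogs).size()); // prints 2
    }
}
```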
Solution 10 - Java
Assuming people is a List of Person objects and you start with:
Map<String, String> phoneBook=people.stream()
.collect(toMap(Person::getName, Person::getAddress));
You now need two steps: first remove the duplicates, then collect into the map:
people = removeDuplicate(people);
Map<String, String> phoneBook=people.stream()
.collect(toMap(Person::getName, Person::getAddress));
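Here is the method that removes duplicates. Note that distinct() compares elements with equals() and hashCode(), so it only removes duplicate Person objects; two different Person objects that share a name would still collide in toMap.
public static List<Person> removeDuplicate(Collection<Person> list) {
    if (list == null || list.isEmpty()) {
        return null;
    }
    // distinct() keeps the first occurrence of each equal element
    return list.stream()
        .distinct()
        .collect(Collectors.toList());
}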
Here is a full example:
package com.example.khan.vaquar;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class RemovedDuplicate {
public static void main(String[] args) {
Person vaquar = new Person(1, "Vaquar", "Khan");
Person zidan = new Person(2, "Zidan", "Khan");
Person zerina = new Person(3, "Zerina", "Khan");
// Add some random persons
Collection<Person> duplicateList = Arrays.asList(vaquar, zidan, zerina, vaquar, zidan, vaquar);
//
System.out.println("Before removed duplicate list" + duplicateList);
//
Collection<Person> nonDuplicateList = removeDuplicate(duplicateList);
//
System.out.println("");
System.out.println("After removed duplicate list" + nonDuplicateList);
;
// 1) solution Working code
Map<Object, Object> k = nonDuplicateList.stream().distinct()
.collect(Collectors.toMap(s1 -> s1.getId(), s1 -> s1));
System.out.println("");
System.out.println("Result 1 using method_______________________________________________");
System.out.println("k" + k);
System.out.println("_____________________________________________________________________");
// 2) solution using inline distinct()
Map<Object, Object> k1 = duplicateList.stream().distinct()
.collect(Collectors.toMap(s1 -> s1.getId(), s1 -> s1));
System.out.println("");
System.out.println("Result 2 using inline_______________________________________________");
System.out.println("k1" + k1);
System.out.println("_____________________________________________________________________");
// breaking code: no distinct(), so the duplicate ids cause an IllegalStateException
System.out.println("");
System.out.println("Throwing exception _______________________________________________");
Map<Object, Object> k2 = duplicateList.stream()
.collect(Collectors.toMap(s1 -> s1.getId(), s1 -> s1));
System.out.println("");
System.out.println("k2" + k2);
System.out.println("_____________________________________________________________________");
}
public static List<Person> removeDuplicate(Collection<Person> list) {
if (list == null || list.isEmpty()) {
return null;
}
// distinct() relies on equals()/hashCode() to detect duplicates
return list.stream().distinct().collect(Collectors.toList());
}
}
// Model class
class Person {
public Person(Integer id, String fname, String lname) {
super();
this.id = id;
this.fname = fname;
this.lname = lname;
}
private Integer id;
private String fname;
private String lname;
// Getters and Setters
public Integer getId() {
return id;
}
public void setId(Integer id) {
this.id = id;
}
public String getFname() {
return fname;
}
public void setFname(String fname) {
this.fname = fname;
}
public String getLname() {
return lname;
}
public void setLname(String lname) {
this.lname = lname;
}
@Override
public String toString() {
return "Person [id=" + id + ", fname=" + fname + ", lname=" + lname + "]";
}
}
Results:
Before removed duplicate list[Person [id=1, fname=Vaquar, lname=Khan], Person [id=2, fname=Zidan, lname=Khan], Person [id=3, fname=Zerina, lname=Khan], Person [id=1, fname=Vaquar, lname=Khan], Person [id=2, fname=Zidan, lname=Khan], Person [id=1, fname=Vaquar, lname=Khan]]
After removed duplicate list[Person [id=1, fname=Vaquar, lname=Khan], Person [id=2, fname=Zidan, lname=Khan], Person [id=3, fname=Zerina, lname=Khan]]
Result 1 using method_______________________________________________
k{1=Person [id=1, fname=Vaquar, lname=Khan], 2=Person [id=2, fname=Zidan, lname=Khan], 3=Person [id=3, fname=Zerina, lname=Khan]}
_____________________________________________________________________
Result 2 using inline_______________________________________________
k1{1=Person [id=1, fname=Vaquar, lname=Khan], 2=Person [id=2, fname=Zidan, lname=Khan], 3=Person [id=3, fname=Zerina, lname=Khan]}
_____________________________________________________________________
Throwing exception _______________________________________________
Exception in thread "main" java.lang.IllegalStateException: Duplicate key Person [id=1, fname=Vaquar, lname=Khan]
at java.util.stream.Collectors.lambda$throwingMerger$0(Collectors.java:133)
at java.util.HashMap.merge(HashMap.java:1253)
at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1320)
at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at com.example.khan.vaquar.RemovedDuplicate.main(RemovedDuplicate.java:48)
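One limitation of the distinct() approach is worth illustrating: it removes elements that are equal to each other, but it does nothing for different elements that map to the same key. The sketch below assumes a hypothetical Person with a name, an address, and equals/hashCode defined over both fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

final class DistinctLimitsDemo {

    // Hypothetical Person; equality is defined over both fields.
    static final class Person {
        private final String name;
        private final String address;

        Person(String name, String address) {
            this.name = name;
            this.address = address;
        }

        String getName() { return name; }
        String getAddress() { return address; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(name, other.name)
                    && Objects.equals(address, other.address);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, address);
        }
    }

    // distinct() first, then the two-argument toMap (no merge function).
    static Map<String, String> viaDistinct(List<Person> people) {
        return people.stream()
                .distinct()
                .collect(Collectors.toMap(Person::getName, Person::getAddress));
    }

    public static void main(String[] args) {
        // Exact duplicates are removed by distinct(), so this succeeds.
        List<Person> exact = Arrays.asList(
                new Person("Alice", "1 Main St"),
                new Person("Alice", "1 Main St"));
        System.out.println(viaDistinct(exact).size()); // prints 1

        // Same name, different address: not equal, so toMap still fails.
        List<Person> sameName = Arrays.asList(
                new Person("Alice", "1 Main St"),
                new Person("Alice", "3 Elm Rd"));
        try {
            viaDistinct(sameName);
        } catch (IllegalStateException e) {
            System.out.println("still a duplicate key");
        }
    }
}
```

When the second situation can occur, a merge function passed to toMap handles it directly.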
Solution 11 - Java
I had the same case and found that the simplest solution (assuming you just want to overwrite the map value for a duplicate key) is:
Map<String, String> phoneBook =
people.stream()
.collect(Collectors.toMap(Person::getName,
Person::getAddress,
(address1, address2) -> address2)); // the later value wins