Remove duplicates from a list of objects based on property in Java 8
Java Problem Overview
I am trying to remove duplicates from a List of objects based on some property.
Can we do it in a simple way using Java 8?
List<Employee> employee
Can we remove duplicates from it based on the id property of Employee? I have seen posts about removing duplicate strings from an ArrayList of strings.
Java Solutions
Solution 1 - Java
You can get a stream from the List and put it into a TreeSet, to which you provide a custom comparator that compares by id, so that each id is kept only once.
Then, if you really need a list, you can put this collection back into an ArrayList.
import static java.util.Comparator.comparingInt;
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.toCollection;
...
List<Employee> unique = employee.stream()
.collect(collectingAndThen(toCollection(() -> new TreeSet<>(comparingInt(Employee::getId))),
ArrayList::new));
Given the example:
List<Employee> employee = Arrays.asList(new Employee(1, "John"), new Employee(1, "Bob"), new Employee(2, "Alice"));
It will output:
[Employee{id=1, name='John'}, Employee{id=2, name='Alice'}]
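As a minimal runnable sketch of the TreeSet approach, the snippet below puts the pieces together. The Employee class here is a hypothetical stand-in (the post does not show the real one), and the names TreeSetDedupDemo and uniqueById are made up for illustration. The input is deliberately not sorted by id, to show that the result comes back ordered by the comparator rather than in the original list order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

import static java.util.Comparator.comparingInt;
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.toCollection;

// Hypothetical minimal Employee; the original post does not show the class.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
    @Override public String toString() { return "Employee{id=" + id + ", name='" + name + "'}"; }
}

class TreeSetDedupDemo {
    // Collects into a TreeSet ordered by id, then copies the set into an ArrayList.
    static List<Employee> uniqueById(List<Employee> employees) {
        return employees.stream()
                .collect(collectingAndThen(
                        toCollection(() -> new TreeSet<>(comparingInt(Employee::getId))),
                        ArrayList::new));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(2, "Alice"), new Employee(1, "John"), new Employee(1, "Bob"));
        // Prints [Employee{id=1, name='John'}, Employee{id=2, name='Alice'}]
        System.out.println(uniqueById(employees));
    }
}
```

The first employee seen for each id is the one that is kept, and the result is sorted by id. If the original list order matters, one of the Set- or Map-based approaches from the other answers may be a better fit.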
Another idea could be to use a wrapper that wraps an employee and bases its equals and hashCode methods on the employee's id:
class WrapperEmployee {
private Employee e;
public WrapperEmployee(Employee e) {
this.e = e;
}
public Employee unwrap() {
return this.e;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
WrapperEmployee that = (WrapperEmployee) o;
return Objects.equals(e.getId(), that.e.getId());
}
@Override
public int hashCode() {
return Objects.hash(e.getId());
}
}
Then you wrap each instance, call distinct(), unwrap them and collect the result in a list.
List<Employee> unique = employee.stream()
.map(WrapperEmployee::new)
.distinct()
.map(WrapperEmployee::unwrap)
.collect(Collectors.toList());
In fact, I think you can make this wrapper generic by providing a function that extracts the value used for the comparison:
public class Wrapper<T, U> {
private T t;
private Function<T, U> equalityFunction;
public Wrapper(T t, Function<T, U> equalityFunction) {
this.t = t;
this.equalityFunction = equalityFunction;
}
public T unwrap() {
return this.t;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
@SuppressWarnings("unchecked")
Wrapper<T, U> that = (Wrapper<T, U>) o;
return Objects.equals(equalityFunction.apply(this.t), that.equalityFunction.apply(that.t));
}
@Override
public int hashCode() {
return Objects.hash(equalityFunction.apply(this.t));
}
}
and the mapping will be:
.map(e -> new Wrapper<>(e, Employee::getId))
Solution 2 - Java
The easiest way to do it directly in the list is:
HashSet<Object> seen = new HashSet<>();
employee.removeIf(e -> !seen.add(e.getId()));
- removeIf will remove an element if it meets the specified criteria
- Set.add will return false if it did not modify the Set, i.e. it already contains the value
- combining these two, it will remove all elements (employees) whose id has been encountered before
Of course, it only works if the list supports removal of elements.
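To illustrate the caveat about removal support, here is a small sketch that runs the same removeIf call on an ArrayList and on the fixed-size list returned by Arrays.asList. The Employee class and the names RemoveIfDemo and removeDuplicateIds are hypothetical and only there to make the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical minimal Employee; the original post does not show the class.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
    @Override public String toString() { return "Employee{id=" + id + ", name='" + name + "'}"; }
}

class RemoveIfDemo {
    // Removes, in place, every employee whose id already appeared earlier in the list.
    static void removeDuplicateIds(List<Employee> employees) {
        Set<Integer> seen = new HashSet<>();
        employees.removeIf(e -> !seen.add(e.getId()));
    }

    public static void main(String[] args) {
        List<Employee> fixedSize = Arrays.asList(
                new Employee(1, "John"), new Employee(1, "Bob"), new Employee(2, "Alice"));

        // A copy in an ArrayList supports removal, so this works.
        List<Employee> mutable = new ArrayList<>(fixedSize);
        removeDuplicateIds(mutable);
        System.out.println(mutable);

        // The list from Arrays.asList cannot shrink, so removing a duplicate fails.
        try {
            removeDuplicateIds(fixedSize);
        } catch (UnsupportedOperationException ex) {
            System.out.println("Arrays.asList does not support removal");
        }
    }
}
```

Copying into a new ArrayList first, as above, is one simple way to get a list that supports removal when the source is fixed-size.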
Solution 3 - Java
If you can make use of equals, then filter the list by using distinct within a stream (see answers above). If you cannot or don't want to override the equals method, you can filter the stream in the following way for any property, e.g. for the property Name (the same for the property Id etc.):
Set<String> nameSet = new HashSet<>();
List<Employee> employeesDistinctByName = employees.stream()
.filter(e -> nameSet.add(e.getName()))
.collect(Collectors.toList());
Solution 4 - Java
Another solution is to use a Predicate, then you can use this in any filter:
public static <T> Predicate<T> distinctBy(Function<? super T, ?> f) {
Set<Object> objects = ConcurrentHashMap.newKeySet(); // the JDK has no ConcurrentHashSet class
return t -> objects.add(f.apply(t));
}
Then simply reuse the predicate anywhere:
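employees.stream().filter(distinctBy(e -> e.getId()));
Note: the JavaDoc of filter says it takes a stateless predicate, and this one is stateful. In practice it works even if the stream is parallel, because the set is thread-safe. In a parallel stream, though, which of the duplicates survives is not guaranteed.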
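For reference, a self-contained sketch of the predicate approach could look like the following. It assumes a thread-safe set obtained from ConcurrentHashMap.newKeySet(), which is part of the JDK since Java 8. The Employee class and the names DistinctByDemo and uniqueById are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical minimal Employee; the original post does not show the class.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
    @Override public String toString() { return "Employee{id=" + id + ", name='" + name + "'}"; }
}

class DistinctByDemo {
    // Returns a stateful predicate that accepts an element only the first time its key is seen.
    // The key must not be null, because ConcurrentHashMap rejects null keys.
    static <T> Predicate<T> distinctBy(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    static List<Employee> uniqueById(List<Employee> employees) {
        return employees.stream()
                .filter(distinctBy(Employee::getId))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(1, "John"), new Employee(1, "Bob"), new Employee(2, "Alice"));
        System.out.println(uniqueById(employees));
    }
}
```

Each call to distinctBy creates a fresh set, so a predicate instance should be used for one stream pipeline only and not stored for reuse across several.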
About other solutions:
- Using .collect(Collectors.toConcurrentMap(..)).values() is a good solution, but it's annoying if you want to sort and keep the order.
- employee.removeIf(e -> !seen.add(e.getId())); is another very good solution, but we need to make sure the collection supports removal of elements. For example, it will throw an exception if we construct the collection using Arrays.asList(..).
Solution 5 - Java
Try this code:
Collection<Employee> nonDuplicatedEmployees = employees.stream()
.<Map<Integer, Employee>> collect(HashMap::new,(m,e)->m.put(e.getId(), e), Map::putAll)
.values();
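One detail worth spelling out: because the accumulator above uses put, a later employee with the same id overwrites an earlier one, so the last duplicate is the one that survives. The sketch below contrasts this with a putIfAbsent variant that keeps the first one instead. The Employee class and the names KeepFirstOrLastDemo, keepLast and keepFirst are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal Employee; the original post does not show the class.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
    @Override public String toString() { return "Employee{id=" + id + ", name='" + name + "'}"; }
}

class KeepFirstOrLastDemo {
    // Same shape as the answer's code: put overwrites, so the last duplicate wins.
    static Map<Integer, Employee> keepLast(List<Employee> employees) {
        return employees.stream()
                .<Map<Integer, Employee>>collect(HashMap::new, (m, e) -> m.put(e.getId(), e), Map::putAll);
    }

    // Variant: putIfAbsent never overwrites, so the first duplicate wins.
    static Map<Integer, Employee> keepFirst(List<Employee> employees) {
        return employees.stream()
                .<Map<Integer, Employee>>collect(
                        HashMap::new,
                        (m, e) -> m.putIfAbsent(e.getId(), e),
                        (left, right) -> right.forEach(left::putIfAbsent));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(1, "John"), new Employee(1, "Bob"), new Employee(2, "Alice"));
        System.out.println(keepLast(employees).get(1));  // the employee named Bob
        System.out.println(keepFirst(employees).get(1)); // the employee named John
    }
}
```

Calling values() on either map gives the de-duplicated collection. With a HashMap the iteration order of those values is not tied to the order of the input list.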
Solution 6 - Java
This worked for me:
list.stream().distinct().collect(Collectors.toList());
You need to implement equals (and a consistent hashCode), of course.
Solution 7 - Java
If order does not matter and it is more performant to run in parallel, collect to a Map and then get the values:
employee.stream().collect(Collectors.toConcurrentMap(Employee::getId, Function.identity(), (p, q) -> p)).values()
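If order does matter, a different collector can be swapped in. The sketch below is not the toConcurrentMap approach from this answer. It uses the four-argument Collectors.toMap with a LinkedHashMap supplier instead, so that the values come back in the order in which each id was first encountered. The Employee class and the names OrderedMapDedupDemo and uniqueByIdInOrder are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical minimal Employee; the original post does not show the class.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
    @Override public String toString() { return "Employee{id=" + id + ", name='" + name + "'}"; }
}

class OrderedMapDedupDemo {
    // Keeps the first employee per id; LinkedHashMap remembers the order of first insertion.
    static List<Employee> uniqueByIdInOrder(List<Employee> employees) {
        return new ArrayList<>(employees.stream()
                .collect(Collectors.toMap(
                        Employee::getId,
                        Function.identity(),
                        (first, second) -> first,
                        LinkedHashMap::new))
                .values());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(2, "Alice"), new Employee(1, "John"), new Employee(1, "Bob"));
        System.out.println(uniqueByIdInOrder(employees));
    }
}
```

The merge function (first, second) -> first plays the same role as (p, q) -> p in the answer above: it decides which of two employees with the same id is kept.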
Solution 8 - Java
There are a lot of good answers here, but I didn't find one that uses the reduce method. For your case, you can apply it in the following way:
List<Employee> employeeList = employees.stream()
.reduce(new ArrayList<>(), (List<Employee> accumulator, Employee employee) ->
{
if (accumulator.stream().noneMatch(emp -> emp.getId().equals(employee.getId())))
{
accumulator.add(employee);
}
return accumulator;
}, (acc1, acc2) ->
{
acc1.addAll(acc2);
return acc1;
});
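One caveat with the reduce version is that it mutates the list passed as the identity value, which reduce does not expect, so it should only be relied on for sequential streams. As a hedged alternative, the same idea can be sketched with the three-argument collect, which is designed for mutable containers. The Employee class (with an int id here) and the names CollectDedupDemo, uniqueById and addIfNew are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal Employee; the original post does not show the class.
class Employee {
    private final int id;
    private final String name;
    Employee(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
    @Override public String toString() { return "Employee{id=" + id + ", name='" + name + "'}"; }
}

class CollectDedupDemo {
    // Adds the candidate only if no employee with the same id is already in the list.
    private static void addIfNew(List<Employee> acc, Employee candidate) {
        if (acc.stream().noneMatch(e -> e.getId() == candidate.getId())) {
            acc.add(candidate);
        }
    }

    // Mutable reduction: a fresh list per container, merged with the same duplicate check.
    static List<Employee> uniqueById(List<Employee> employees) {
        return employees.stream()
                .<List<Employee>>collect(
                        ArrayList::new,
                        CollectDedupDemo::addIfNew,
                        (left, right) -> right.forEach(e -> addIfNew(left, e)));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(1, "John"), new Employee(1, "Bob"), new Employee(2, "Alice"));
        System.out.println(uniqueById(employees));
    }
}
```

Like the reduce version, this scans the accumulated list for every element, so the cost grows quadratically with the number of distinct employees. For large lists, the Set- or Map-based answers are likely to be faster.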
Solution 9 - Java
Another, simple version:
// addAll skips elements that the comparator already considers present, so duplicates by id are dropped
BiFunction<TreeSet<Employee>, List<Employee>, TreeSet<Employee>> appendTree = (set, list) -> { set.addAll(list); return set; };
TreeSet<Employee> outputList = appendTree.apply(new TreeSet<Employee>(Comparator.comparing(Employee::getId)), personList);