Efficient intersection of two List<String> in Java?
JavaListIntersectionJava Problem Overview
Question is simple:
I have two List
List<String> columnsOld = DBUtils.GetColumns(db, TableName);
List<String> columnsNew = DBUtils.GetColumns(db, TableName);
And I need to get the intersection of these. Is there a quick way to achieve this?
Java Solutions
Solution 1 - Java
You can use [retainAll
][1] method:
columnsOld.retainAll (columnsNew);
[1]: http://docs.oracle.com/javase/7/docs/api/java/util/List.html#retainAll(java.util.Collection
Solution 2 - Java
Using Google's Guava library:
Sets.intersection(Sets.newHashSet(setA), Sets.newHashSet(setB))
Note: This is much more efficient than naively doing the intersection with two lists: it's O(n+m), versus O(n×m) for the list version. With two million-item lists it's the difference between millions of operations and trillions of operations.
Solution 3 - Java
Since retainAll won't touch the argument collection, this would be faster:
List<String> columnsOld = DBUtils.GetColumns(db, TableName);
List<String> columnsNew = DBUtils.GetColumns(db, TableName);
for(int i = columnsNew.size() - 1; i > -1; --i){
String str = columnsNew.get(i);
if(!columnsOld.remove(str))
columnsNew.remove(str);
}
The intersection will be the values left in columnsNew. Removing already compared values fom columnsOld will reduce the number of comparisons needed.
Solution 4 - Java
How about
private List<String> intersect(List<String> A, List<String> B) {
List<String> rtnList = new LinkedList<>();
for(String dto : A) {
if(B.contains(dto)) {
rtnList.add(dto);
}
}
return rtnList;
}
Solution 5 - Java
There is a nice way with streams which can do this in one line of code and you can two lists which are not from the same type which is not possible with the containsAll method afaik:
columnsOld.stream().filter(c -> columnsNew.contains(c)).collect(Collectors.toList());
An example for lists with different types. If you have a realtion between foo and bar and you can get a bar-object from foo than you can modify your stream:
List<foo> fooList = new ArrayList<>(Arrays.asList(new foo(), new foo()));
List<bar> barList = new ArrayList<>(Arrays.asList(new bar(), new bar()));
fooList.stream().filter(f -> barList.contains(f.getBar()).collect(Collectors.toList());
Solution 6 - Java
using retainAll if don't care occurrences, otherwise using N.intersection
a = N.asList(12, 16, 16, 17, 19);
b = N.asList(16, 19, 107);
a.retainAll(b); // [16, 16, 19]
N.println(a);
a = N.asList(12, 16, 16, 17, 19);
b = N.asList(16, 19, 107);
a = N.intersect(a, b);
N.println(a); // [16, 19]
N is an utility class in abacus-common
Solution 7 - Java
If you put the second list in a set say HashSet. And just iterate over the first list checking for presence on the set and removing if not present, your first list will eventually have the intersection you need. It will be way faster than retainAll or contains on a list. The emphasis here is to use a set instead of list. Lookups are O(1). firstList.retainAll (new HashSet (secondList)) will also work.
Solution 8 - Java
use org.apache.commons.collections4.ListUtils#intersection
Solution 9 - Java
With Java 8 Stream API (and Java 9 List.of()) you can do following:
List<Integer> list1 = List.of(1, 1, 2, 2);
List<Integer> list2 = List.of(2, 2, 3, 3);
List<Integer> intersection = list1.stream()
.filter(list2::contains)
.distinct()
.collect(Collectors.toList());