Stream and lazy evaluation
Java Problem Overview
I'm reading the Java 8 API documentation on the Stream abstraction, but I don't understand this passage very well:
> Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.
When a filter operation creates a new stream, does that stream already contain the filtered elements? As I understand it, the stream contains elements only when it is traversed, i.e. by a terminal operation. But then what does the filtered stream contain? I'm confused!
Java Solutions
Solution 1 - Java
It means that the filter is only applied during the terminal operation. Think of something like this:
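import java.util.*;
import java.util.function.Predicate;
// Hypothetical toy class for illustration, not the real java.util.stream.Stream
class LazyStream<T> {
    private final List<T> source;
    private Predicate<T> filter = x -> true;
    LazyStream(List<T> source) { this.source = source; }
    public LazyStream<T> filter(Predicate<T> p) {
        this.filter = p; // just store it, don't apply it yet
        return this;     // in reality: return a new stream
    }
    public List<T> collect() {
        List<T> list = new ArrayList<>();
        for (T o : source) { // the source is only traversed here
            if (filter.test(o)) list.add(o);
        }
        return list;
    }
}
(This is a simplification of reality, but the principle is there.)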
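To see the same principle with the real API, here is a minimal sketch that records when the predicate is actually called. The class name `LazyFilterDemo`, the method `run()`, and the sample strings are made up for illustration; only `stream()`, `filter()`, and `collect()` come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: logs when the filter predicate really runs.
class LazyFilterDemo {

    // Returns the recorded events in the order they happened.
    public static List<String> run() {
        List<String> events = new ArrayList<>();
        List<String> source = Arrays.asList("apple", "kiwi", "banana");

        // Intermediate operation: builds a new stream, tests nothing yet.
        Stream<String> filtered = source.stream().filter(s -> {
            events.add("test " + s);
            return s.length() > 4;
        });
        events.add("filter() returned");

        // Terminal operation: only now is the source traversed.
        List<String> result = filtered.collect(Collectors.toList());
        events.add("collected " + result);
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first recorded event is "filter() returned", followed by "test apple", "test kiwi", "test banana", and finally "collected [apple, banana]". So the stream returned by filter() holds no elements of its own; it holds a reference to the source plus the predicate to apply later.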
Solution 2 - Java
Streams are lazy because intermediate operations are not evaluated until a terminal operation is invoked.
Each intermediate operation creates a new stream, stores the provided operation/function, and returns the new stream.
The pipeline is the chain of these newly created streams.
When the terminal operation is called, traversal of the stream begins and the stored functions are applied to the elements one by one.
Parallel streams don't process elements 'one by one' at the terminal operation. Instead, the operations are performed concurrently, depending on the available cores.
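The "one by one" part can be checked with a small sketch like the following, which logs each stage of a sequential pipeline. The class name `PipelineOrderDemo` and the method `run()` are hypothetical names chosen for this example, and it only covers a sequential stream with stateless operations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: records the order in which pipeline stages run.
class PipelineOrderDemo {

    // Returns the recorded events in the order they happened.
    public static List<String> run() {
        List<String> events = new ArrayList<>();
        Arrays.asList("a", "b", "c").stream()
            .filter(s -> { events.add("filter " + s); return !s.equals("b"); })
            .map(s -> { events.add("map " + s); return s.toUpperCase(); })
            .forEach(s -> events.add("forEach " + s)); // terminal operation starts the traversal
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The recorded order is: filter a, map a, forEach A, filter b, filter c, map c, forEach C. Each element travels through the whole chain before the next one is pulled from the source, rather than filter() processing the entire list first. With a parallel stream the relative order of these events is not guaranteed, and the shared ArrayList used for logging here would not be safe.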
Solution 3 - Java
It seems to me that intermediate operations are not exactly lazy:
List<String> l3 = new ArrayList<>();
l3.add("first");
l3.add("second");
l3.add("third");
l3.add("fourth");
l3.add("fifth");
l3.add("sixth");
List<String> test3 = new ArrayList<>();
try {
    // The predicate clears the source list while the stream is traversing it
    l3.stream().filter(s -> { l3.clear(); test3.add(s); return true; }).forEach(System.out::println);
} catch (Exception ex) {
    ex.printStackTrace();
    System.out.println("!!! ");
    // Join everything the predicate actually saw
    System.out.println(test3.stream().reduce((s1, s2) -> s1 + " ;" + s2).get());
}
Output:
first
null
null
null
null
null
java.util.ConcurrentModificationException
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1380)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at test.TestParallel.main(TestParallel.java:69)
!!!
first ;null ;null ;null ;null ;null
It looks like the number of iterations is fixed before the first element is processed, while each stream element is still fetched lazily.
Compare this to a loop with a counter:
public static void main(String[] args) {
    List<Integer> list = new ArrayList<>();
    list.add(1);
    list.add(2);
    list.add(3);
    list.add(4);
    list.add(5);
    int i = 0;
    // list.size() is re-evaluated before every iteration
    while (i < list.size()) {
        System.out.println(list.get(i++));
        list.clear(); // size drops to 0, so the loop stops after one pass
    }
}
Output:
1
Only one iteration, as expected. I agree that the problem lies in the exception-throwing behavior of streams, but I think "lazy" means fetching data (or performing some action) only when I ask an object to do so; and the count of the data is also data.
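One way to probe when the element count is actually read is to modify the list after calling stream() but before the terminal operation. The sketch below follows the example in the java.util.stream package documentation; the class name `LateBindingDemo` and the method `run()` are made up here. It relies on ArrayList's spliterator being late-binding, as its Javadoc states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: modifies the source between stream() and the terminal operation.
class LateBindingDemo {

    public static String run() {
        List<String> list = new ArrayList<>(Arrays.asList("one", "two"));
        Stream<String> stream = list.stream(); // nothing is traversed or counted yet
        list.add("three");                     // change the source before the terminal operation
        return stream.collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

This prints "one two three", which suggests the size is not captured when the stream is created but when the terminal operation begins traversal. Modifying the list after that point, as the clear() call inside the predicate does, is interference with the source, and the ConcurrentModificationException reports it.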