What is the difference between thenApply and thenApplyAsync of Java CompletableFuture?
Java Problem Overview
Suppose I have the following code:
CompletableFuture<Integer> future
= CompletableFuture.supplyAsync( () -> 0);
thenApply case:
future.thenApply( x -> x + 1 )
.thenApply( x -> x + 1 )
.thenAccept( x -> System.out.println(x));
Here the output will be 2. Now in the case of thenApplyAsync:
future.thenApplyAsync( x -> x + 1 ) // first step
.thenApplyAsync( x -> x + 1 ) // second step
.thenAccept( x -> System.out.println(x)); // third step
I read in this blog that each thenApplyAsync is executed in a separate thread and "at the same time" (meaning that a following thenApplyAsync starts before the preceding thenApplyAsync finishes). If so, what is the input argument of the second step if the first step has not finished?
Where does the result of the first step go if it is not taken by the second step? Which step's result does the third step take?
If the second step has to wait for the result of the first step, then what is the point of Async?
Here x -> x + 1 is just for illustration; what I want to know is what happens with very long computations.
Java Solutions
Solution 1 - Java
The difference has to do with the Executor that is responsible for running the code. Each operator on CompletableFuture generally has three versions:
- thenApply(fn) - runs fn on a thread defined by the CompletableFuture on which it is called, so you generally cannot know where this will be executed. It might execute immediately if the result is already available.
- thenApplyAsync(fn) - runs fn on an environment-defined executor regardless of circumstances. For CompletableFuture this will generally be ForkJoinPool.commonPool().
- thenApplyAsync(fn, exec) - runs fn on exec.
In the end the result is the same, but the scheduling behavior depends on the choice of method.
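As a rough illustration of the three versions, the sketch below records which thread runs the function in each case. It assumes the future is already completed when the methods are called; the class name ThreeVariants and the thread name "my-exec" are made up for this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreeVariants {
    // Returns the name of the thread that ran the function for each variant.
    static String[] run() {
        // Already completed, so thenApply can run its function right away.
        CompletableFuture<Integer> done = CompletableFuture.completedFuture(0);

        // Hypothetical executor with a recognizable thread name.
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "my-exec");
            t.setDaemon(true);
            return t;
        });
        try {
            String plain = done.thenApply(x -> Thread.currentThread().getName()).join();
            String async = done.thenApplyAsync(x -> Thread.currentThread().getName()).join();
            String withExec = done.thenApplyAsync(x -> Thread.currentThread().getName(), exec).join();
            return new String[] { plain, async, withExec };
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] names = run();
        System.out.println("thenApply(fn):            " + names[0]); // the calling thread
        System.out.println("thenApplyAsync(fn):       " + names[1]); // a default-executor thread
        System.out.println("thenApplyAsync(fn, exec): " + names[2]); // my-exec
    }
}
```

If the future were not yet complete when thenApply is called, the first line could instead show the thread that later completes it.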
Solution 2 - Java
You're mis-quoting the article's examples, and so you're applying the article's conclusion incorrectly. I see two questions in your question:
What is the correct usage of .then___()
In both examples you quoted, which are not in the article, the second function has to wait for the first function to complete. Whenever you call a.then___(b -> ...), the input b is the result of a and has to wait for a to complete, regardless of whether you use the methods named Async or not. The article's conclusion does not apply because you mis-quoted it.
The example in the article is actually
CompletableFuture<String> receiver = CompletableFuture.supplyAsync(this::findReceiver);
receiver.thenApplyAsync(this::sendMsg);
receiver.thenApplyAsync(this::sendMsg);
Notice that both thenApplyAsync calls are applied to receiver, not chained in the same statement. This means both functions can start once receiver completes, in an unspecified order. (Any assumption about the order is implementation-dependent.)
To put it more clearly:
a.thenApply(b).thenApply(c);
means the order is: a finishes, then b starts; b finishes, then c starts.
a.thenApplyAsync(b).thenApplyAsync(c);
behaves exactly the same as above as far as the ordering between a, b, and c is concerned.
a.thenApply(b); a.thenApply(c);
means a finishes, then b or c can start, in any order. b and c don't have to wait for each other.
a.thenApplyAsync(b); a.thenApplyAsync(c);
works the same way, as far as the order is concerned.
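The four cases above can be sketched roughly as follows. The class name ChainVsBranch and the log messages are invented for illustration; the point is only that a chained stage receives the previous stage's result, while two stages attached to the same future both receive that future's result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

class ChainVsBranch {
    // Chained: c consumes b's result, so b must finish before c can start,
    // even though both are registered with the Async variant.
    static List<String> chained() {
        List<String> log = new CopyOnWriteArrayList<>();
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> 0);
        a.thenApplyAsync(x -> { log.add("b saw " + x); return x + 1; })
         .thenApplyAsync(x -> { log.add("c saw " + x); return x + 1; })
         .join();
        return log;
    }

    // Branched: b and c both consume a's result and do not wait for each other.
    static List<Integer> branched() {
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> 0);
        CompletableFuture<Integer> b = a.thenApplyAsync(x -> x + 1);
        CompletableFuture<Integer> c = a.thenApplyAsync(x -> x + 10);
        return Arrays.asList(b.join(), c.join());
    }

    public static void main(String[] args) {
        System.out.println(chained());  // [b saw 0, c saw 1]
        System.out.println(branched()); // [1, 10]
    }
}
```

Replacing thenApplyAsync with thenApply in either method should leave both results unchanged; only the thread that runs each function may differ.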
You should understand the above before reading the below. The above concerns asynchronous programming; without it you won't be able to use the APIs correctly. The below concerns thread management, with which you can optimize your program and avoid performance pitfalls. But you can't optimize a program without first writing it correctly.
As titled: Difference between thenApply and thenApplyAsync of Java CompletableFuture?
I must point out that the people who wrote the JSR must have confused the technical term "Asynchronous Programming" and picked names that now confuse newcomers and veterans alike. To start, there is nothing in the contract of thenApplyAsync that makes it more asynchronous than thenApply.
The difference between the two has to do with which thread the function runs on. The function supplied to thenApply may run on any of the threads that
- call complete, or
- call thenApply on the same instance,
while the two overloads of thenApplyAsync either
- use a default Executor (a.k.a. thread pool), or
- use a supplied Executor.
The takeaway is that for thenApply, the runtime promises to eventually run your function on some thread that you do not control. If you want control of threads, use the Async variants.
If your function is lightweight, it doesn't matter which thread runs your function.
If your function is heavily CPU-bound, you do not want to leave it to the runtime. If the runtime picks the network thread to run your function, the network thread can't spend time handling network requests, causing network requests to wait longer in the queue and your server to become unresponsive. In that case you want to use thenApplyAsync with your own thread pool.
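A minimal sketch of that situation, under the assumption that both callbacks are registered before the future completes: a thread named "io-thread" stands in for the network thread, and "cpu-worker" is a hypothetical pool for heavy work. The class and thread names are not from any real framework.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OffloadHeavyWork {
    // Returns { thread that ran the thenApply function,
    //           thread that ran the thenApplyAsync(fn, cpuPool) function }.
    static String[] run() {
        ExecutorService cpuPool = Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r, "cpu-worker");
            t.setDaemon(true);
            return t;
        });
        try {
            CompletableFuture<Integer> response = new CompletableFuture<>();

            // Registered while the future is still incomplete.
            CompletableFuture<String> light =
                    response.thenApply(x -> Thread.currentThread().getName());
            CompletableFuture<String> heavy =
                    response.thenApplyAsync(x -> Thread.currentThread().getName(), cpuPool);

            // Stand-in for a network thread that delivers the result.
            Thread io = new Thread(() -> response.complete(42), "io-thread");
            io.start();
            try {
                io.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return new String[] { light.join(), heavy.join() };
        } finally {
            cpuPool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] names = run();
        System.out.println("thenApply ran on:      " + names[0]); // io-thread
        System.out.println("thenApplyAsync ran on: " + names[1]); // cpu-worker
    }
}
```

Here the thenApply function occupies the completing thread, which is harmless for a lightweight function but would stall it for a heavy one; the version with an explicit executor keeps that work off the completing thread.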
Fun fact: Asynchrony != threads
thenApply/thenApplyAsync, and their counterparts thenCompose/thenComposeAsync, handle/handleAsync, thenAccept/thenAcceptAsync, are all asynchronous! The asynchronous nature of these functions has to do with the fact that an asynchronous operation eventually calls complete or completeExceptionally. The idea came from JavaScript, which is indeed asynchronous but isn't multi-threaded.
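To make the point concrete, here is a small single-threaded sketch (the class name AsyncWithoutThreads and the event strings are invented for this example). The callbacks are registered first and run later, when complete is called, without any thread pool involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AsyncWithoutThreads {
    static List<String> run() {
        List<String> events = new ArrayList<>();
        CompletableFuture<Integer> pending = new CompletableFuture<>();

        // Register the callbacks; nothing can run yet because there is no result.
        pending.thenApply(x -> x + 1)
               .thenAccept(x -> events.add("callback got " + x));
        events.add("callbacks registered");

        // Completing the future runs the registered callbacks on this same thread.
        pending.complete(41);
        events.add("complete() returned");
        return events;
    }

    public static void main(String[] args) {
        // [callbacks registered, callback got 42, complete() returned]
        System.out.println(run());
    }
}
```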
Solution 3 - Java
This is what the documentation says about CompletableFuture's thenApplyAsync:
> Returns a new CompletionStage that, when this stage completes normally, is executed using this stage's default asynchronous execution facility, with this stage's result as the argument to the supplied function.
So, thenApplyAsync has to wait for the previous thenApplyAsync's result:
In your case you first do the synchronous work and then the asynchronous one. So it does not matter that the second one is asynchronous, because it is started only after the synchronous work has finished.
Let's switch it up. In some cases "async result: 2" will be printed first and in some cases "sync result: 2" will be printed first. Here it makes a difference because both call 1 and 2 can run asynchronously, call 1 on a separate thread and call 2 on some other thread, which might be the main thread.
CompletableFuture<Integer> future
= CompletableFuture.supplyAsync(() -> 0);
future.thenApplyAsync(x -> x + 1) // call 1
.thenApplyAsync(x -> x + 1)
.thenAccept(x -> System.out.println("async result: " + x));
future.thenApply(x -> x + 1) // call 2
.thenApply(x -> x + 1)
.thenAccept(x -> System.out.println("sync result: " + x));
Solution 4 - Java
The second step (i.e. computation) will always be executed after the first step.
> If the second step has to wait for the result of the first step then what is the point of Async?
Async means in this case that you are guaranteed that the method will return quickly and the computation will be executed in a different thread.
When calling thenApply (without async), you have no such guarantee. In this case the computation may be executed synchronously, i.e. in the same thread that calls thenApply, if the CompletableFuture is already completed by the time the method is called. But the computation may also be executed asynchronously by the thread that completes the future or by some other thread that calls a method on the same CompletableFuture. This answer: https://stackoverflow.com/a/46062939/1235217 explains in detail what thenApply does and does not guarantee.
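The "returns quickly" guarantee can be observed with a sketch like the one below, assuming an already completed future. The class name ReturnsQuickly is made up, and the CountDownLatch only serves to hold the async function open so the check is not timing-dependent.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

class ReturnsQuickly {
    // Returns { was the thenApply stage done when thenApply returned,
    //           was the thenApplyAsync stage done when thenApplyAsync returned }.
    static boolean[] run() {
        CompletableFuture<Integer> done = CompletableFuture.completedFuture(1);

        // On a completed future, thenApply runs the function inside the call,
        // so the caller waits for it and the new stage is already done.
        CompletableFuture<Integer> sync = done.thenApply(x -> x + 1);
        boolean syncDoneOnReturn = sync.isDone();

        // thenApplyAsync hands the function to another thread and returns;
        // the function is held at the latch until we release it.
        CountDownLatch release = new CountDownLatch(1);
        CompletableFuture<Integer> async = done.thenApplyAsync(x -> {
            awaitQuietly(release);
            return x + 1;
        });
        boolean asyncDoneOnReturn = async.isDone();

        release.countDown();
        async.join();
        return new boolean[] { syncDoneOnReturn, asyncDoneOnReturn };
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("thenApply stage done on return:      " + r[0]); // true
        System.out.println("thenApplyAsync stage done on return: " + r[1]); // false
    }
}
```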
So when should you use thenApply and when thenApplyAsync? I use the following rule of thumb:
- non-async: only if the task is very small and non-blocking, because in this case we don't care which of the possible threads executes it
- async (often with an explicit executor as parameter): for all other tasks