Custom thread pool in Java 8 parallel stream

JavaConcurrencyParallel ProcessingJava 8Java Stream

Java Problem Overview


Is it possible to specify a custom thread pool for Java 8 parallel stream? I can not find it anywhere.

Imagine that I have a server application and I would like to use parallel streams. But the application is large and multi-threaded so I want to compartmentalize it. I do not want a slow running task in one module of the applicationblock tasks from another module.

If I can not use different thread pools for different modules, it means I can not safely use parallel streams in most of real world situations.

Try the following example. There are some CPU intensive tasks executed in separate threads. The tasks leverage parallel streams. The first task is broken, so each step takes 1 second (simulated by thread sleep). The issue is that other threads get stuck and wait for the broken task to finish. This is contrived example, but imagine a servlet app and someone submitting a long running task to the shared fork join pool.

public class ParallelTest {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService es = Executors.newCachedThreadPool();

        es.execute(() -> runTask(1000)); //incorrect task
        es.execute(() -> runTask(0));
        es.execute(() -> runTask(0));
        es.execute(() -> runTask(0));
        es.execute(() -> runTask(0));
        es.execute(() -> runTask(0));


        es.shutdown();
        es.awaitTermination(60, TimeUnit.SECONDS);
    }

    private static void runTask(int delay) {
        range(1, 1_000_000).parallel().filter(ParallelTest::isPrime).peek(i -> Utils.sleep(delay)).max()
                .ifPresent(max -> System.out.println(Thread.currentThread() + " " + max));
    }

    public static boolean isPrime(long n) {
        return n > 1 && rangeClosed(2, (long) sqrt(n)).noneMatch(divisor -> n % divisor == 0);
    }
}

Java Solutions


Solution 1 - Java

There actually is a trick how to execute a parallel operation in a specific fork-join pool. If you execute it as a task in a fork-join pool, it stays there and does not use the common one.

final int parallelism = 4;
ForkJoinPool forkJoinPool = null;
try {
    forkJoinPool = new ForkJoinPool(parallelism);
    final List<Integer> primes = forkJoinPool.submit(() ->
        // Parallel task here, for example
        IntStream.range(1, 1_000_000).parallel()
                .filter(PrimesPrint::isPrime)
                .boxed().collect(Collectors.toList())
    ).get();
    System.out.println(primes);
} catch (InterruptedException | ExecutionException e) {
    throw new RuntimeException(e);
} finally {
    if (forkJoinPool != null) {
        forkJoinPool.shutdown();
    }
}

The trick is based on ForkJoinTask.fork which specifies: "Arranges to asynchronously execute this task in the pool the current task is running in, if applicable, or using the ForkJoinPool.commonPool() if not inForkJoinPool()"

Solution 2 - Java

The parallel streams use the default ForkJoinPool.commonPool which by default has one less threads as you have processors, as returned by Runtime.getRuntime().availableProcessors() (This means that parallel streams leave one processor for the calling thread).

> For applications that require separate or custom pools, a ForkJoinPool may be constructed with a given target parallelism level; by default, equal to the number of available processors.

This also means if you have nested parallel streams or multiple parallel streams started concurrently, they will all share the same pool. Advantage: you will never use more than the default (number of available processors). Disadvantage: you may not get "all the processors" assigned to each parallel stream you initiate (if you happen to have more than one). (Apparently you can use a ManagedBlocker to circumvent that.)

To change the way parallel streams are executed, you can either

  • submit the parallel stream execution to your own ForkJoinPool: yourFJP.submit(() -> stream.parallel().forEach(soSomething)).get(); or
  • you can change the size of the common pool using system properties: System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "20") for a target parallelism of 20 threads.

Example of the latter on my machine which has 8 processors. If I run the following program:

long start = System.currentTimeMillis();
IntStream s = IntStream.range(0, 20);
//System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "20");
s.parallel().forEach(i -> {
    try { Thread.sleep(100); } catch (Exception ignore) {}
    System.out.print((System.currentTimeMillis() - start) + " ");
});

The output is:

> 215 216 216 216 216 216 216 216 315 316 316 316 316 316 316 316 415 416 416 416

So you can see that the parallel stream processes 8 items at a time, i.e. it uses 8 threads. However, if I uncomment the commented line, the output is:

> 215 215 215 215 215 216 216 216 216 216 216 216 216 216 216 216 216 216 216 216

This time, the parallel stream has used 20 threads and all 20 elements in the stream have been processed concurrently.

Solution 3 - Java

Alternatively to the trick of triggering the parallel computation inside your own forkJoinPool you can also pass that pool to the CompletableFuture.supplyAsync method like in:

ForkJoinPool forkJoinPool = new ForkJoinPool(2);
CompletableFuture<List<Integer>> primes = CompletableFuture.supplyAsync(() ->
    //parallel task here, for example
    range(1, 1_000_000).parallel().filter(PrimesPrint::isPrime).collect(toList()), 
    forkJoinPool
);

Solution 4 - Java

The original solution (setting the ForkJoinPool common parallelism property) no longer works. Looking at the links in the original answer, an update which breaks this has been back ported to Java 8. As mentioned in the linked threads, this solution was not guaranteed to work forever. Based on that, the solution is the forkjoinpool.submit with .get solution discussed in the accepted answer. I think the backport fixes the unreliability of this solution also.

ForkJoinPool fjpool = new ForkJoinPool(10);
System.out.println("stream.parallel");
IntStream range = IntStream.range(0, 20);
fjpool.submit(() -> range.parallel()
        .forEach((int theInt) ->
        {
            try { Thread.sleep(100); } catch (Exception ignore) {}
            System.out.println(Thread.currentThread().getName() + " -- " + theInt);
        })).get();
System.out.println("list.parallelStream");
int [] array = IntStream.range(0, 20).toArray();
List<Integer> list = new ArrayList<>();
for (int theInt: array)
{
    list.add(theInt);
}
fjpool.submit(() -> list.parallelStream()
        .forEach((theInt) ->
        {
            try { Thread.sleep(100); } catch (Exception ignore) {}
            System.out.println(Thread.currentThread().getName() + " -- " + theInt);
        })).get();

Solution 5 - Java

We can change the default parallelism using the following property:

-Djava.util.concurrent.ForkJoinPool.common.parallelism=16

which can set up to use more parallelism.

Solution 6 - Java

To measure the actual number of used threads, you can check Thread.activeCount():

    Runnable r = () -> IntStream
            .range(-42, +42)
            .parallel()
            .map(i -> Thread.activeCount())
            .max()
            .ifPresent(System.out::println);
    
    ForkJoinPool.commonPool().submit(r).join();
    new ForkJoinPool(42).submit(r).join();

This can produce on a 4-core CPU an output like:

5 // common pool
23 // custom pool

Without .parallel() it gives:

3 // common pool
4 // custom pool

Solution 7 - Java

Until now, I used the solutions described in the answers of this question. Now, I came up with a little library called Parallel Stream Support for that:

ForkJoinPool pool = new ForkJoinPool(NR_OF_THREADS);
ParallelIntStreamSupport.range(1, 1_000_000, pool)
    .filter(PrimesPrint::isPrime)
    .collect(toList())

But as @PabloMatiasGomez pointed out in the comments, there are drawbacks regarding the splitting mechanism of parallel streams which depends heavily on the size of the common pool. See https://stackoverflow.com/questions/28985704/parallel-stream-from-a-hashset-doesnt-run-in-parallel .

I am using this solution only to have separate pools for different types of work but I can not set the size of the common pool to 1 even if I don't use it.

Solution 8 - Java

Note: There appears to be a fix implemented in JDK 10 that ensures the Custom Thread Pool uses the expected number of threads.

Parallel stream execution within a custom ForkJoinPool should obey the parallelism https://bugs.openjdk.java.net/browse/JDK-8190974

Solution 9 - Java

If you don't want to rely on implementation hacks, there's always a way to achieve the same by implementing custom collectors that will combine map and collect semantics... and you wouldn't be limited to ForkJoinPool:

list.stream()
  .collect(parallel(i -> process(i), executor, 4))
  .join()

Luckily, it's done already here and available on Maven Central: http://github.com/pivovarit/parallel-collectors

Disclaimer: I wrote it and take responsibility for it.

Solution 10 - Java

Go to get AbacusUtil. Thread number can by specified for parallel stream. Here is the sample code:

LongStream.range(4, 1_000_000).parallel(threadNum)...

Disclosure: I'm the developer of AbacusUtil.

Solution 11 - Java

I tried the custom ForkJoinPool as follows to adjust the pool size:

private static Set<String> ThreadNameSet = new HashSet<>();
private static Callable<Long> getSum() {
    List<Long> aList = LongStream.rangeClosed(0, 10_000_000).boxed().collect(Collectors.toList());
    return () -> aList.parallelStream()
            .peek((i) -> {
                String threadName = Thread.currentThread().getName();
                ThreadNameSet.add(threadName);
            })
            .reduce(0L, Long::sum);
}

private static void testForkJoinPool() {
    final int parallelism = 10;

    ForkJoinPool forkJoinPool = null;
    Long result = 0L;
    try {
        forkJoinPool = new ForkJoinPool(parallelism);
        result = forkJoinPool.submit(getSum()).get(); //this makes it an overall blocking call

    } catch (InterruptedException | ExecutionException e) {
        e.printStackTrace();
    } finally {
        if (forkJoinPool != null) {
            forkJoinPool.shutdown(); //always remember to shutdown the pool
        }
    }
    out.println(result);
    out.println(ThreadNameSet);
}

Here is the output saying the pool is using more threads than the default 4.

50000005000000
[ForkJoinPool-1-worker-8, ForkJoinPool-1-worker-9, ForkJoinPool-1-worker-6, ForkJoinPool-1-worker-11, ForkJoinPool-1-worker-10, ForkJoinPool-1-worker-1, ForkJoinPool-1-worker-15, ForkJoinPool-1-worker-13, ForkJoinPool-1-worker-4, ForkJoinPool-1-worker-2]

But actually there is a weirdo, when I tried to achieve the same result using ThreadPoolExecutor as follows:

BlockingDeque blockingDeque = new LinkedBlockingDeque(1000);
ThreadPoolExecutor fixedSizePool = new ThreadPoolExecutor(10, 20, 60, TimeUnit.SECONDS, blockingDeque, new MyThreadFactory("my-thread"));

but I failed.

It will only start the parallelStream in a new thread and then everything else is just the same, which again proves that the parallelStream will use the ForkJoinPool to start its child threads.

Solution 12 - Java

Here is how I set the max thread count flag mentioned above programatically and a code sniped to verify that the parameter is honored

System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "2");
Set<String> threadNames = Stream.iterate(0, n -> n + 1)
  .parallel()
  .limit(100000)
  .map(i -> Thread.currentThread().getName())
  .collect(Collectors.toSet());
System.out.println(threadNames);

// Output -> [ForkJoinPool.commonPool-worker-1, Test worker, ForkJoinPool.commonPool-worker-3]

Solution 13 - Java

If you don't mind using a third-party library, with cyclops-react you can mix sequential and parallel Streams within the same pipeline and provide custom ForkJoinPools. For example

 ReactiveSeq.range(1, 1_000_000)
            .foldParallel(new ForkJoinPool(10),
                          s->s.filter(i->true)
                              .peek(i->System.out.println("Thread " + Thread.currentThread().getId()))
                              .max(Comparator.naturalOrder()));

Or if we wished to continue processing within a sequential Stream

 ReactiveSeq.range(1, 1_000_000)
            .parallel(new ForkJoinPool(10),
                      s->s.filter(i->true)
                          .peek(i->System.out.println("Thread " + Thread.currentThread().getId())))
            .map(this::processSequentially)
            .forEach(System.out::println);

[Disclosure I am the lead developer of cyclops-react]

Solution 14 - Java

If you don't need a custom ThreadPool but you rather want to limit the number of concurrent tasks, you can use:

List<Path> paths = List.of("/path/file1.csv", "/path/file2.csv", "/path/file3.csv").stream().map(e -> Paths.get(e)).collect(toList());
List<List<Path>> partitions = Lists.partition(paths, 4); // Guava method

partitions.forEach(group -> group.parallelStream().forEach(csvFilePath -> {
       // do your processing   
}));

(Duplicate question asking for this is locked, so please bear me here)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionLukasView Question on Stackoverflow
Solution 1 - JavaLukasView Answer on Stackoverflow
Solution 2 - JavaassyliasView Answer on Stackoverflow
Solution 3 - JavaMario FuscoView Answer on Stackoverflow
Solution 4 - JavaTod CasasentView Answer on Stackoverflow
Solution 5 - JavaKayVView Answer on Stackoverflow
Solution 6 - JavacharlieView Answer on Stackoverflow
Solution 7 - JavaStefan FerstlView Answer on Stackoverflow
Solution 8 - JavaScott LangleyView Answer on Stackoverflow
Solution 9 - JavaGrzegorz PiwowarekView Answer on Stackoverflow
Solution 10 - Javauser_3380739View Answer on Stackoverflow
Solution 11 - JavaHearenView Answer on Stackoverflow
Solution 12 - JavaBorislav StoilovView Answer on Stackoverflow
Solution 13 - JavaJohn McCleanView Answer on Stackoverflow
Solution 14 - JavaMartyIXView Answer on Stackoverflow