Difference between Fork/Join and Map/Reduce

MapreduceFork Join

Mapreduce Problem Overview


What is the key difference between Fork/Join and Map/Reduce?

Do they differ in the kind of decomposition and distribution (data vs. computation)?

Mapreduce Solutions


Solution 1 - Mapreduce

One key difference is that F-J seems to be designed to work on a single Java VM, while M-R is explicitly designed to work on a large cluster of machines. These are very different scenarios.

F-J offers facilities to partition a task into several subtasks, in a recursive-looking fashion; more tiers, possibility of 'inter-fork' communication at this stage, much more traditional programming. Does not extend (at least in the paper) beyond a single machine. Great for taking advantage of your eight-core.

M-R only does one big split, with the mapped splits not talking between each other at all, and then reduces everything together. A single tier, no inter-split communication until reduce, and massively scalable. Great for taking advantage of your share of the cloud.

Solution 2 - Mapreduce

There is a whole scientific paper on the subject, Comparing Fork/Join and MapReduce.

The paper compares the performance, scalability and programmability of three parallel paradigms: fork/join, MapReduce, and a hybrid approach.

What they find is basically that Java fork/join has low startup latency and scales well for small inputs (<5MB), but it cannot process larger inputs due to the size restrictions of shared-memory, single node architectures. On the other hand, MapReduce has significant startup latency (tens of seconds), but scales well for much larger inputs (>100MB) on a compute cluster.

But there is a lot more to read there if you're up for it.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionhotzenView Question on Stackoverflow
Solution 1 - MapreducetucuxiView Answer on Stackoverflow
Solution 2 - MapreducePer Quested AronssonView Answer on Stackoverflow