R package that automatically uses several cores?


R Problem Overview

I have noticed that R only uses one core while executing one of my programs which requires lots of calculations. I would like to take advantage of my multi-core processor to make my program run faster. I have not yet investigated the question in depth but I would appreciate to benefit from your comments because I do not have good knowledge in computer science and it is difficult for me to get easily understandable information on that subject.

Is there a package that allows R to automatically use several cores when needed?

I guess it is not that simple.

R Solutions

Solution 1 - R

R can only make use of multiple cores with the help of add-on packages, and only for some types of operation. The options are discussed in detail on the High Performance Computing Task View on CRAN

Update: From R Version 2.14.0 add-on packages are not necessarily required due to the inclusion of the parallel package as a recommended package shipped with R. parallel includes functionality from the multicore and snow packages, largely unchanged.

Solution 2 - R

The easiest way to take advantage of multiprocessors is the multicore package which includes the function mclapply(). mclapply() is a multicore version of lapply(). So any process that can use lapply() can be easily converted to an mclapply() process. However, multicore does not work on Windows. I wrote a blog post about this last year which might be helpful. The package Revolution Analytics created, doSMP, is NOT a multi-threaded version of R. It's effectively a Windows version of multicore.

If your work is embarrassingly parallel, it's a good idea to get comfortable with the lapply() type of structuring. That will give you easy segue into mclapply() and even distributed computing using the same abstraction.

Things get much more difficult for operations that are not "embarrassingly parallel".


As a side note, Rstudio is getting increasingly popular as a front end for R. I love Rstudio and use it daily. However it needs to be noted that Rstudio does not play nice with Multicore (at least as of Oct 2011... I understand that the RStudio team is going to fix this). This is because Rstudio does some forking behind the scenes and these forks conflict with Multicore's attempts to fork. So if you need Multicore, you can write your code in Rstuido, but run it in a plain-Jane R session.

Solution 3 - R

On this question you always get very short answers. The easiest solution according to me is the package snowfall, based on snow. That is, on a Windows single computer with multiple cores. See also here the article of Knaus et al for a simple example. Snowfall is a wrapper around the snow package, and allows you to setup a multicore with a few commands. It's definitely less hassle than most of the other packages (I didn't try all of them).

On a sidenote, there are indeed only few tasks that can be parallelized, for the very simple reason that you have to be able to split up the tasks before multicore calculation makes sense. the apply family is obviously a logical choice for this : multiple and independent computations, which is crucial for multicore use. Anything else is not always that easily multicored.

Read also this discussion on sfApply and custom functions.

Solution 4 - R

Microsoft R Open includes multi-threaded math libraries to improve the performance of R.It works in Windows/Unix/Mac all OS type. It's open source and can be installed in a separate directory if you have any existing R(from CRAN) installation. You can use popular IDE Rstudio also with this.From its inception, R was designed to use only a single thread (processor) at a time. Even today, R works that way unless linked with multi-threaded BLAS/LAPACK libraries.

The multi-core machines of today offer parallel processing power. To take advantage of this, Microsoft R Open includes multi-threaded math libraries. These libraries make it possible for so many common R operations, such as matrix multiply/inverse, matrix decomposition, and some higher-level matrix operations, to compute in parallel and use all of the processing power available to reduce computation times.

Please check the below link:



Solution 5 - R

As David Heffernan said, take a look at the Blog of revolution Analytics. But you should know that most packages are for Linux. So, if you use windows it will be much harder. Anyway, take a look at these sites:

http://blog.revolutionanalytics.com/2010/08/taking-r-to-the-limit-parallelism-and-big-data.html">Revolution</a>;. Here you will find a lecture about parallerization in R. The lecture is actually very good, but, as I said, most tips are for Linux.

And this thread here at https://stackoverflow.com/questions/4775098/r-with-a-multi-core-processor"> Stackoverflow will disscuss some implementation in Windows.

Solution 6 - R

The package future makes it extremely simple to work in R using parallel and distributed processing. More info here. If you want to apply a function to elements in parallel, the future.apply package provides a quick way to use the "apply" family functions (e.g. apply(), lapply(), and vapply()) in parallel.


x <- 1:10

# Single core
  y <- lapply(x, FUN = quantile, probs = 1:3/4)

# Multicore in parallel
  y <- future_lapply(x, FUN = quantile, probs = 1:3/4)


All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMarcoView Question on Stackoverflow
Solution 1 - RGavin SimpsonView Answer on Stackoverflow
Solution 2 - RJD LongView Answer on Stackoverflow
Solution 3 - RJoris MeysView Answer on Stackoverflow
Solution 4 - RpmrView Answer on Stackoverflow
Solution 5 - RManoel GaldinoView Answer on Stackoverflow
Solution 6 - Rrafa.pereiraView Answer on Stackoverflow