Merging multiple data.tables

RJoinMergedata.table

R Problem Overview


I am aware that one can merge (join) two data.table with the merge function or the [.data.table function. However, if I have say 10, data.tables and want to use do.call to merge them all together, is there a function that would do this? Currently I resort to do.call(cbind, ...) which works for only very special cases.

R Solutions


Solution 1 - R

Not sure, but maybe (untested) :

Reduce(merge,list(DT1,DT2,DT3,...))

Solution 2 - R

To flesh out the solution suggested by @Henk in comments, here is how to use this to select the all = TRUE option in a concise formulation:

MergedDT = Reduce(function(...) merge(..., all = TRUE), List_of_DTs)

Solution 3 - R

The latest version of data.table (1.10.5) has an easy syntax for table join, which can also be chained.

 set(DT1, key)
 set(DT2, key) 
  ...
 set(DT10, key)

 DT_cmb <- DT1[DT2,][DT3,]...[DT10,]

Solution 4 - R

To fresh answer from Michael Ohlrogge, here I create a function to do this task. You can inherit more arguments from merge.data.table()

mergeDTs <- function(dt_list, by = NULL, sort = FALSE) {
  Reduce(
    function(...) {
      merge(..., by = by, all = TRUE, sort = sort)
    }, dt_list)
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAlexView Question on Stackoverflow
Solution 1 - RMatt DowleView Answer on Stackoverflow
Solution 2 - RMichael OhlroggeView Answer on Stackoverflow
Solution 3 - RHappyCodingView Answer on Stackoverflow
Solution 4 - RShixiang WangView Answer on Stackoverflow