Repeat rows of a data.frame N times

RDataframe

R Problem Overview


I have the following data frame:

data.frame(a = c(1,2,3),b = c(1,2,3))
  a b
1 1 1
2 2 2
3 3 3

I want to repeat the rows n times. For example, here the rows are repeated 3 times:

  a b
1 1 1
2 2 2
3 3 3
4 1 1
5 2 2
6 3 3
7 1 1
8 2 2
9 3 3

Is there an easy function to do this in R? Thanks!

R Solutions


Solution 1 - R

EDIT: updated to a better modern R answer.

You can use replicate(), then rbind the result back together. The rownames are automatically altered to run from 1:nrows.

d <- data.frame(a = c(1,2,3),b = c(1,2,3))
n <- 3
do.call("rbind", replicate(n, d, simplify = FALSE))

A more traditional way is to use indexing, but here the rowname altering is not quite so neat (but more informative):

 d[rep(seq_len(nrow(d)), n), ]

Here are improvements on the above, the first two using purrr functional programming, idiomatic purrr:

purrr::map_dfr(seq_len(3), ~d)

and less idiomatic purrr (identical result, though more awkward):

purrr::map_dfr(seq_len(3), function(x) d)

and finally via indexing rather than list apply using dplyr:

d %>% slice(rep(row_number(), 3))

Solution 2 - R

For data.frame objects, this solution is several times faster than @mdsummer's and @wojciech-sobala's.

d[rep(seq_len(nrow(d)), n), ]

For data.table objects, @mdsummer's is a bit faster than applying the above after converting to data.frame. For large n this might flip. microbenchmark.

Full code:

packages <- c("data.table", "ggplot2", "RUnit", "microbenchmark")
lapply(packages, require, character.only=T)

Repeat1 <- function(d, n) {
  return(do.call("rbind", replicate(n, d, simplify = FALSE)))
}

Repeat2 <- function(d, n) {
  return(Reduce(rbind, list(d)[rep(1L, times=n)]))
}

Repeat3 <- function(d, n) {
  if ("data.table" %in% class(d)) return(d[rep(seq_len(nrow(d)), n)])
  return(d[rep(seq_len(nrow(d)), n), ])
}

Repeat3.dt.convert <- function(d, n) {
  if ("data.table" %in% class(d)) d <- as.data.frame(d)
  return(d[rep(seq_len(nrow(d)), n), ])
}

# Try with data.frames
mtcars1 <- Repeat1(mtcars, 3)
mtcars2 <- Repeat2(mtcars, 3)
mtcars3 <- Repeat3(mtcars, 3)

checkEquals(mtcars1, mtcars2)
#  Only difference is row.names having ".k" suffix instead of "k" from 1 & 2
checkEquals(mtcars1, mtcars3)

# Works with data.tables too
mtcars.dt <- data.table(mtcars)
mtcars.dt1 <- Repeat1(mtcars.dt, 3)
mtcars.dt2 <- Repeat2(mtcars.dt, 3)
mtcars.dt3 <- Repeat3(mtcars.dt, 3)

# No row.names mismatch since data.tables don't have row.names
checkEquals(mtcars.dt1, mtcars.dt2)
checkEquals(mtcars.dt1, mtcars.dt3)

# Time test
res <- microbenchmark(Repeat1(mtcars, 10),
                      Repeat2(mtcars, 10),
                      Repeat3(mtcars, 10),
                      Repeat1(mtcars.dt, 10),
                      Repeat2(mtcars.dt, 10),
                      Repeat3(mtcars.dt, 10),
                      Repeat3.dt.convert(mtcars.dt, 10))
print(res)
ggsave("repeat_microbenchmark.png", autoplot(res))

Solution 3 - R

The package dplyr contains the function bind_rows() that directly combines all data frames in a list, such that there is no need to use do.call() together with rbind():

df <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
library(dplyr)
bind_rows(replicate(3, df, simplify = FALSE))

For a large number of repetions bind_rows() is also much faster than rbind():

library(microbenchmark)
microbenchmark(rbind = do.call("rbind", replicate(1000, df, simplify = FALSE)),
               bind_rows = bind_rows(replicate(1000, df, simplify = FALSE)),
               times = 20)
## Unit: milliseconds
##       expr       min        lq      mean   median        uq       max neval cld
##      rbind 31.796100 33.017077 35.436753 34.32861 36.773017 43.556112    20   b
##  bind_rows  1.765956  1.818087  1.881697  1.86207  1.898839  2.321621    20  a 

Solution 4 - R

With the [tag:data.table]-package, you could use the special symbol .I together with rep:

df <- data.frame(a = c(1,2,3), b = c(1,2,3))
dt <- as.data.table(df)

n <- 3

dt[rep(dt[, .I], n)]

which gives:

> a b > 1: 1 1 > 2: 2 2 > 3: 3 3 > 4: 1 1 > 5: 2 2 > 6: 3 3 > 7: 1 1 > 8: 2 2 > 9: 3 3

Solution 5 - R

d <- data.frame(a = c(1,2,3),b = c(1,2,3))
r <- Reduce(rbind, list(d)[rep(1L, times=3L)])

Solution 6 - R

Even simpler:

library(data.table)
my_data <- data.frame(a = c(1,2,3),b = c(1,2,3))
rbindlist(replicate(n = 3, expr = my_data, simplify = FALSE)

Solution 7 - R

Just use simple indexing with repeat function.

mydata<-data.frame(a = c(1,2,3),b = c(1,2,3)) #creating your data frame  
n<-10           #defining no. of time you want repetition of the rows of your dataframe

mydata<-mydata[rep(rownames(mydata),n),] #use rep function while doing indexing 
rownames(mydata)<-1:NROW(mydata)    #rename rows just to get cleaner look of data

Solution 8 - R

For time execution purposes, i would like to suggest a comparison of different way of rbind:

> mydata <- data.frame(a=1:200,b=201:400,c=301:500)
> microbenchmark(rbind = do.call("rbind",replicate(n=100,mydata,simplify = FALSE)),
+                bind_rows = bind_rows(replicate(n=100,mydata,simplify = FALSE)),
+                rbindlist = rbindlist(replicate(n=100,exp= mydata,simplify = FALSE)),
+                times= 2000)
Unit: microseconds
      expr    min      lq      mean  median      uq      max neval
     rbind 5760.7 6723.10 8642.6930 7132.30 7761.05 240720.3  2000
 bind_rows  976.4 1186.90 1430.7741 1308.85 1469.80  15817.9  2000
 rbindlist  263.6  347.85  465.5894  392.90  459.95  10974.2  2000

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMichaelView Question on Stackoverflow
Solution 1 - RmdsumnerView Answer on Stackoverflow
Solution 2 - RMax GhenisView Answer on Stackoverflow
Solution 3 - RStibuView Answer on Stackoverflow
Solution 4 - RJaapView Answer on Stackoverflow
Solution 5 - RWojciech SobalaView Answer on Stackoverflow
Solution 6 - RArturo SbrView Answer on Stackoverflow
Solution 7 - Ruser5433002View Answer on Stackoverflow
Solution 8 - RA. chahidView Answer on Stackoverflow