Use a value from the previous row in an R data.table calculation

Rdata.table

R Problem Overview


I want to create a new column in a data.table calculated from the current value of one column and the previous of another. Is it possible to access previous rows?

E.g.:

> DT <- data.table(A=1:5, B=1:5*10, C=1:5*100)
> DT
   A  B   C
1: 1 10 100
2: 2 20 200
3: 3 30 300
4: 4 40 400
5: 5 50 500
> DT[, D := C + BPreviousRow] # What is the correct code here?

The correct answer should be

> DT
   A  B   C   D
1: 1 10 100  NA
2: 2 20 200 210
3: 3 30 300 320
4: 4 40 400 430
5: 5 50 500 540

R Solutions


Solution 1 - R

With shift() implemented in v1.9.6, this is quite straightforward.

DT[ , D := C + shift(B, 1L, type="lag")]
# or equivalently, in this case,
DT[ , D := C + shift(B)]

From NEWS:

> 5. New function shift() implements fast lead/lag of vector, list, data.frames or data.tables. It takes a type argument which can be either "lag" (default) or "lead". It enables very convenient usage along with := or set(). For example: DT[, (cols) := shift(.SD, 1L), by=id]. Please have a look at ?shift for more info.


See history for previous answers.

Solution 2 - R

Using dplyr you could do:

mutate(DT, D = lag(B) + C)

Which gives:

#   A  B   C   D
#1: 1 10 100  NA
#2: 2 20 200 210
#3: 3 30 300 320
#4: 4 40 400 430
#5: 5 50 500 540

Solution 3 - R

Several folks have answered the specific question. See the code below for a general purpose function that I use in situations like this that may be helpful. Rather than just getting the prior row, you can go as many rows in the "past" or "future" as you'd like.

rowShift <- function(x, shiftLen = 1L) {
  r <- (1L + shiftLen):(length(x) + shiftLen)
  r[r<1] <- NA
  return(x[r])
}

# Create column D by adding column C and the value from the previous row of column B:
DT[, D := C + rowShift(B,-1)]

# Get the Old Faithul eruption length from two events ago, and three events in the future:
as.data.table(faithful)[1:5,list(eruptLengthCurrent=eruptions,
                                 eruptLengthTwoPrior=rowShift(eruptions,-2), 
                                 eruptLengthThreeFuture=rowShift(eruptions,3))]
##   eruptLengthCurrent eruptLengthTwoPrior eruptLengthThreeFuture
##1:              3.600                  NA                  2.283
##2:              1.800                  NA                  4.533
##3:              3.333               3.600                     NA
##4:              2.283               1.800                     NA
##5:              4.533               3.333                     NA

Solution 4 - R

Based on @Steve Lianoglou 's comment above, why not just:

DT[, D:= C + c(NA, B[.I - 1]) ]
#    A  B   C   D
# 1: 1 10 100  NA
# 2: 2 20 200 210
# 3: 3 30 300 320
# 4: 4 40 400 430
# 5: 5 50 500 540

And avoid using seq_len or head or any other function.

Solution 5 - R

Following Arun's solution, a similar results can be obtained without referring to to .N

> DT[, D := C + c(NA, head(B, -1))][]
   A  B   C   D
1: 1 10 100  NA
2: 2 20 200 210
3: 3 30 300 320
4: 4 40 400 430
5: 5 50 500 540

Solution 6 - R

Here is my intuitive solution:

#create data frame
df <- data.frame(A=1:5, B=seq(10,50,10), C=seq(100,500, 100))`
#subtract the shift from num rows
shift  <- 1 #in this case the shift is 1
invshift <- nrow(df) - shift
#Now create the new column
df$D <- c(NA, head(df$B, invshift)+tail(df$C, invshift))`

Here invshift, the number of rows minus 1, is 4. nrow(df) provides you with the number of rows in a data frame or in a vector. Similarly, if you want to take still earlier values, subtract from nrow 2, 3, ...etc, and also put NA's accordingly at the beginning.

Solution 7 - R

it can be done in a loop.

# Create the column D
DT$D <- 0
# for every row in DT
for (i in 1:length(DT$A)) {
  if(i==1) {
    #using NA at first line
    DT[i,4] <- NA
  } else {
    #D = C + BPreviousRow
    DT[i,4] <- DT[i,3] + DT[(i-1), 2]   
  }
}

Using a for, you can even use the previous value of the row of this new column DT[(i-1), 4]

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionCorvusView Question on Stackoverflow
Solution 1 - RArunView Answer on Stackoverflow
Solution 2 - RSteven BeaupréView Answer on Stackoverflow
Solution 3 - RdnlbrkyView Answer on Stackoverflow
Solution 4 - RGary WeissmanView Answer on Stackoverflow
Solution 5 - RRyogiView Answer on Stackoverflow
Solution 6 - RAbdullah Al MahmudView Answer on Stackoverflow
Solution 7 - RRafael BragaView Answer on Stackoverflow