Calculate row means on subset of columns

RDataframe

R Problem Overview


Given a sample data frame:

C1<-c(3,2,4,4,5)
C2<-c(3,7,3,4,5)
C3<-c(5,4,3,6,3)
DF<-data.frame(ID=c("A","B","C","D","E"),C1=C1,C2=C2,C3=C3)

DF
    ID C1 C2 C3
  1  A  3  3  5
  2  B  2  7  4
  3  C  4  3  3
  4  D  4  4  6
  5  E  5  5  3

What is the best way to create a second data frame that would contain the ID column and the mean of each row? Something like this:

ID  Mean
A    3.66
B    4.33
C    3.33
D    4.66
E    4.33

Something similar to:

RM<-rowMeans(DF[,2:4])

I'd like to keep the means aligned with their ID's.

R Solutions


Solution 1 - R

Calculate row means on a subset of columns:

Create a new data.frame which specifies the first column from DF as an column called ID and calculates the mean of all the other fields on that row, and puts that into column entitled 'Means':

data.frame(ID=DF[,1], Means=rowMeans(DF[,-1]))
  ID    Means
1  A 3.666667
2  B 4.333333
3  C 3.333333
4  D 4.666667
5  E 4.333333

Solution 2 - R

Starting with your data frame DF, you could use the data.table package:

library(data.table)

## EDIT: As suggested by @MichaelChirico, setDT converts a
## data.frame to a data.table by reference and is preferred
## if you don't mind losing the data.frame
setDT(DF)

# EDIT: To get the column name 'Mean':

DF[, .(Mean = rowMeans(.SD)), by = ID]

#      ID     Mean
# [1,]  A 3.666667
# [2,]  B 4.333333
# [3,]  C 3.333333
# [4,]  D 4.666667
# [5,]  E 4.333333

Solution 3 - R

You can create a new row with $ in your data frame corresponding to the Means

DF$Mean <- rowMeans(DF[,2:4])

Solution 4 - R

Using dplyr:

library(dplyr)

# exclude ID column then get mean
DF %>%
  transmute(ID,
            Mean = rowMeans(select(., -ID)))

Or

# select the columns to include in mean
DF %>%
  transmute(ID,
            Mean = rowMeans(select(., C1:C3)))

#   ID     Mean
# 1  A 3.666667
# 2  B 4.333333
# 3  C 3.333333
# 4  D 4.666667
# 5  E 4.333333

Solution 5 - R

(Another solution using pivot_longer & pivot_wider from latest Tidyr update)

You should try using pivot_longer to get your data from wide to long form Read latest tidyR update on pivot_longer & pivot_wider (https://tidyr.tidyverse.org/articles/pivot.html)

library(tidyverse)
C1<-c(3,2,4,4,5)
C2<-c(3,7,3,4,5)
C3<-c(5,4,3,6,3)
DF<-data.frame(ID=c("A","B","C","D","E"),C1=C1,C2=C2,C3=C3)

Output here

  ID     mean
  <fct> <dbl>
1 A      3.67
2 B      4.33
3 C      3.33
4 D      4.67
5 E      4.33

Solution 6 - R

rowMeans is nice, but if you are still trying to wrap your head around the apply family of functions, this is a good opprotunity to begin understanding it.

DF['Mean'] <- apply(DF[,2:4], 1, mean)

Notice I'm doing a slightly different assignment than the first example. This approach makes it easier to incorporate it into for loops.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionVinterwooView Question on Stackoverflow
Solution 1 - RJilber UrbinaView Answer on Stackoverflow
Solution 2 - RBenBarnesView Answer on Stackoverflow
Solution 3 - RNadegeliaView Answer on Stackoverflow
Solution 4 - Rzx8754View Answer on Stackoverflow
Solution 5 - Ruser12059497View Answer on Stackoverflow
Solution 6 - RMadmanLeeView Answer on Stackoverflow