calculate the mean for each column of a matrix in R

R, Dataframe, Mean

R Problem Overview


I am working in R in RStudio. I need to calculate the mean of each column of a data frame.

 cluster1        # a 5-by-4 data frame
 mean(cluster1)  # attempt to get the column means

I got:

  Warning message:
  In mean.default(cluster1) :
  argument is not numeric or logical: returning NA

But I can use

  mean(cluster1[[1]])

to get the mean of the first column.

How can I get the means of all columns?

Any help would be appreciated.

R Solutions


Solution 1 - R

You can use colMeans:

### Sample data
set.seed(1)
m <- data.frame(matrix(sample(100, 20, replace = TRUE), ncol = 4))

### Your error
mean(m)
# [1] NA
# Warning message:
# In mean.default(m) : argument is not numeric or logical: returning NA

### The result using `colMeans`
colMeans(m)
#   X1   X2   X3   X4 
# 47.0 64.4 44.8 67.8 
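
As a small check on what colMeans returns, the sketch below uses a hypothetical 5-by-4 data frame standing in for the asker's cluster1 (the column names and values are assumptions, not the original data). It shows that the result is a named numeric vector whose first entry matches the single-column call from the question.

```r
# Hypothetical stand-in for the asker's 5-by-4 data frame
cluster1 <- data.frame(a = 1:5, b = 11:15, c = 21:25, d = 31:35)

# One mean per column, returned as a named numeric vector
col_means <- colMeans(cluster1)
print(col_means)

# The first entry equals the single-column mean used in the question
first_col_mean <- mean(cluster1[[1]])
print(unname(col_means[1]) == first_col_mean)
```

Because the values here are fixed rather than sampled, the result does not depend on the random number generator or the R version.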

Solution 2 - R

You can use 'apply' to run a function over the rows or columns of a matrix or numerical data frame:

cluster1 <- data.frame(a=1:5, b=11:15, c=21:25, d=31:35)

apply(cluster1,2,mean)  # applies function 'mean' to 2nd dimension (columns)

apply(cluster1,1,mean)  # applies function to 1st dimension (rows)

sapply(cluster1, mean)  # also takes mean of columns, treating data frame like list of vectors
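
One difference between the two approaches is worth sketching: apply first converts the data frame to a matrix, so a single non-numeric column turns every value into a character string, while sapply handles each column separately. The data frame below is a made-up example, and the warnings that mean emits for non-numeric input are suppressed only to keep the output short.

```r
# Hypothetical data frame with two numeric columns and one character column
mixed <- data.frame(a = 1:5, b = 11:15, label = letters[1:5],
                    stringsAsFactors = FALSE)

# apply() coerces to a matrix first; with a character column present,
# the whole matrix becomes character
coerced <- as.matrix(mixed)
print(is.character(coerced))

# mean() of character data is NA, so every column comes back NA
apply_means <- suppressWarnings(apply(mixed, 2, mean))
print(apply_means)

# sapply() works column by column, so only the character column is NA
sapply_means <- suppressWarnings(sapply(mixed, mean))
print(sapply_means)
```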

Solution 3 - R

If your data contains NAs:

sapply(data, mean, na.rm = TRUE)      # Returns a named vector
lapply(data, mean, na.rm = TRUE)      # Returns a list

Remember that "mean" needs numeric data. If your data frame has columns of mixed classes, keep only the numeric columns first:

numdata <- data[sapply(data, is.numeric)]
sapply(numdata, mean, na.rm = TRUE)   # Returns a named vector
lapply(numdata, mean, na.rm = TRUE)   # Returns a list
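
To make the effect of na.rm concrete, here is a minimal sketch on an invented data frame (the name survey and its columns are assumptions for illustration). It filters to numeric columns as described above and compares the means with and without na.rm = TRUE.

```r
# Hypothetical data frame with one missing value and one non-numeric column
survey <- data.frame(
  age   = c(21, 35, NA, 48),
  score = c(3.5, 4.0, 2.5, 5.0),
  group = c("a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns
numeric_part <- survey[sapply(survey, is.numeric)]

# Without na.rm, any column containing an NA has an NA mean
means_default <- sapply(numeric_part, mean)
print(means_default)

# With na.rm = TRUE, missing values are dropped before averaging
means_na_rm <- sapply(numeric_part, mean, na.rm = TRUE)
print(means_na_rm)
```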

Solution 4 - R

Another way is to use the purrr package:

# Same example data as in the answer above by @A Handcart And Mohair
set.seed(1)
m <- data.frame(matrix(sample(100, 20, replace = TRUE), ncol = 4))

library(purrr)
means <- map_dbl(m, mean)   # map_dbl() returns a named numeric vector

means
#   X1   X2   X3   X4 
# 47.0 64.4 44.8 67.8 

Solution 5 - R

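You can try this, but note that it returns a single overall mean of every value in the data frame rather than one mean per column: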

mean(as.matrix(cluster1))
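
A short sketch of what this call computes, compared with colMeans, on a hypothetical 5-by-4 data frame (the values are assumptions chosen so the arithmetic is easy to follow):

```r
# Hypothetical stand-in for the asker's data frame
cluster1 <- data.frame(a = 1:5, b = 11:15, c = 21:25, d = 31:35)

# mean() over the whole matrix pools all 20 values into one number
overall_mean <- mean(as.matrix(cluster1))
print(overall_mean)

# colMeans() keeps the columns separate
column_means <- colMeans(as.matrix(cluster1))
print(column_means)
```

With equal-length columns and no missing values, the overall mean is the average of the column means, so it cannot replace them.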

Solution 6 - R

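Try this with dplyr; it also handles data containing NAs:

library(dplyr)

df <- data.frame(a1 = 1:10, a2 = 11:20)

# summarise_each() and funs() are deprecated in current dplyr;
# across() (dplyr >= 1.0.0) is the replacement
df %>% summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

#    a1   a2
# 1 5.5 15.5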

Solution 7 - R

class(mtcars)   # "data.frame"
my.mean <- unlist(lapply(mtcars, mean))
my.mean

#        mpg        cyl       disp         hp       drat         wt       qsec         vs 
#  20.090625   6.187500 230.721875 146.687500   3.596563   3.217250  17.848750   0.437500 
#         am       gear       carb 
#   0.406250   3.687500   2.812500 

Solution 8 - R

colMeans(A, na.rm = FALSE, dims = 1)

https://stat.ethz.ch/R-manual/R-devel/library/base/html/colSums.html

colMeans is in the base package, so no additional library is required.

The first answer does not load any package either, so its colMeans call is this same base function and works in R version 4.0.2.
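
The two optional arguments in that signature can be sketched on small made-up inputs (the objects A and arr below are assumptions for illustration). na.rm controls whether missing values are dropped, and dims says how many leading dimensions of an array are averaged over.

```r
# Hypothetical 3-by-2 matrix with one missing value in the first column
A <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)

means_keep_na <- colMeans(A, na.rm = FALSE, dims = 1)  # first column is NA
means_drop_na <- colMeans(A, na.rm = TRUE)             # NA is ignored
print(means_keep_na)
print(means_drop_na)

# Hypothetical 2-by-3-by-4 array to illustrate dims
arr <- array(1:24, dim = c(2, 3, 4))

over_first_dim <- colMeans(arr, dims = 1)  # averages over dimension 1: 3-by-4 result
over_first_two <- colMeans(arr, dims = 2)  # averages over dimensions 1 and 2: length 4
print(dim(over_first_dim))
print(over_first_two)
```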

Solution 9 - R

For variety: another way is to convert a vector function into one that works on data frames by using plyr::colwise():

set.seed(1)
m <- data.frame(matrix(sample(100, 20, replace = TRUE), ncol = 4))

plyr::colwise(mean)(m)


#   X1   X2   X3   X4
# 1 47 64.4 44.8 67.8

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type      Original Author         Original Content on Stackoverflow
Question          user2420472             View Question on Stackoverflow
Solution 1 - R    A5C1D2H2I1M1N2O1R2T1    View Answer on Stackoverflow
Solution 2 - R    bob                     View Answer on Stackoverflow
Solution 3 - R    Gonzalo user7334982     View Answer on Stackoverflow
Solution 4 - R    user6376316             View Answer on Stackoverflow
Solution 5 - R    weijia                  View Answer on Stackoverflow
Solution 6 - R    I Ju Cheng              View Answer on Stackoverflow
Solution 7 - R    Seyma Kalay             View Answer on Stackoverflow
Solution 8 - R    SysEng                  View Answer on Stackoverflow
Solution 9 - R    Agaz Wani               View Answer on Stackoverflow