How to find the highest value of a column in a data frame in R?


R Problem Overview


I have the following data frame which I called ozone:

   Ozone Solar.R Wind Temp Month Day
1     41     190  7.4   67     5   1
2     36     118  8.0   72     5   2
3     12     149 12.6   74     5   3
4     18     313 11.5   62     5   4
5     NA      NA 14.3   56     5   5
6     28      NA 14.9   66     5   6
7     23     299  8.6   65     5   7
8     19      99 13.8   59     5   8
9      8      19 20.1   61     5   9

I would like to extract the highest value from each column: Ozone, Solar.R, Wind, and so on.

Also, if possible, how would I sort Solar.R (or any other column of this data frame) in descending order?

I tried

max(ozone, na.rm=T)

which gives me the highest value in the dataset.

I have also tried

max(subset(ozone,Ozone))

but got the error "'subset' must be logical".

I can set an object to hold a subset of the data frame, with the following commands:

ozone <- subset(ozone, Ozone > 0)
max(ozone, na.rm = T)

but it gives the same value of 334, which is the max value of the whole data frame, not of the column.

Any help would be great, thanks.

R Solutions


Solution 1 - R

Similar to colMeans, colSums, etc, you could write a column maximum function, colMax, and a column sort function, colSort.

colMax <- function(data) sapply(data, max, na.rm = TRUE)
colSort <- function(data, ...) sapply(data, sort, ...)

I use ... in the second function in hopes of sparking your intrigue.

Get your data:

dat <- read.table(header = TRUE, text = "Ozone Solar.R Wind Temp Month Day
1     41     190  7.4   67     5   1
2     36     118  8.0   72     5   2
3     12     149 12.6   74     5   3
4     18     313 11.5   62     5   4
5     NA      NA 14.3   56     5   5
6     28      NA 14.9   66     5   6
7     23     299  8.6   65     5   7
8     19      99 13.8   59     5   8
9      8      19 20.1   61     5   9")

Use the colMax function on the sample data:

colMax(dat)
#  Ozone Solar.R    Wind    Temp   Month     Day 
#   41.0   313.0    20.1    74.0     5.0     9.0

To do the sorting on a single column,

sort(dat$Solar.R, decreasing = TRUE)
# [1] 313 299 190 149 118  99  19

and over all columns use our colSort function,

colSort(dat, decreasing = TRUE) ## compare with '...' above
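
As a small sketch of what the ... buys you (using a cut-down, two-column copy of the sample data, which is an assumption made here for brevity): sort() drops NA values by default, so columns with different numbers of NAs come back with different lengths and sapply() returns a list. Passing na.last = TRUE through ... keeps the NAs, so the result can be simplified to a matrix.

```r
# Cut-down copy of the sample data (only two columns, for brevity)
dat <- data.frame(
  Ozone   = c(41, 36, 12, 18, NA, 28, 23, 19, 8),
  Solar.R = c(190, 118, 149, 313, NA, NA, 299, 99, 19)
)

colSort <- function(data, ...) sapply(data, sort, ...)

# Default: NAs are dropped, so the sorted columns have lengths 8 and 7
# and sapply() cannot simplify them; it returns a list.
res_list <- colSort(dat, decreasing = TRUE)
is.list(res_list)

# na.last = TRUE is forwarded to sort() via '...': NAs are kept at the end,
# every column has length 9, and sapply() returns a 9 x 2 matrix.
res_mat <- colSort(dat, decreasing = TRUE, na.last = TRUE)
dim(res_mat)
```

Any other argument that sort() accepts can be forwarded in the same way.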

Solution 2 - R

To get the max of any column you want something like:

max(ozone$Ozone, na.rm = TRUE)

To get the max of all columns, you want:

apply(ozone, 2, function(x) max(x, na.rm = TRUE))

And to sort:

ozone[order(ozone$Solar.R),]

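Or to sort in the other direction (descending), keeping rows with NA in Solar.R at the end rather than moving them to the top as rev() would:

ozone[order(ozone$Solar.R, decreasing = TRUE), ]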
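
When the sort column contains NA values, the two ways of reversing an ordering place them differently. The following is a minimal sketch on a six-row excerpt of the question's data (the excerpt and the variable names desc_rev and desc_arg are chosen here for illustration):

```r
# Six-row excerpt of the question's data, including two NAs in Solar.R
ozone <- data.frame(
  Ozone   = c(41, 36, 12, 18, NA, 28),
  Solar.R = c(190, 118, 149, 313, NA, NA)
)

# order() puts NAs last; reversing the whole index moves them to the top
desc_rev <- ozone[rev(order(ozone$Solar.R)), ]
desc_rev$Solar.R

# decreasing = TRUE reverses only the non-missing values; NAs stay last
desc_arg <- ozone[order(ozone$Solar.R, decreasing = TRUE), ]
desc_arg$Solar.R
```

Which placement is preferable depends on whether the rows with missing values should be seen first or last.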

Solution 3 - R

Here's a dplyr solution:

library(dplyr)

# find max for each column
summarise_each(ozone, funs(max(., na.rm=TRUE)))

# sort by Solar.R, descending
arrange(ozone, desc(Solar.R))

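UPDATE: summarise_each() has been deprecated in favour of the scoped verbs mutate_all(), mutate_at(), mutate_if(), summarise_all(), summarise_at() and summarise_if(). As of dplyr 1.0.0 these scoped verbs are themselves superseded by across(), and funs() has been deprecated since dplyr 0.8.0, so the calls below pass a formula-style lambda instead of funs().

Here is how you could do it:

# find max for each numeric column
ozone %>%
  summarise_if(is.numeric, ~ max(., na.rm = TRUE)) %>%
  arrange(Ozone)

or

# find max for columns 1 to 6
ozone %>%
  summarise_at(vars(1:6), ~ max(., na.rm = TRUE)) %>%
  arrange(Ozone)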

Solution 4 - R

In response to finding the max value for each column, you could try using the apply() function:

> apply(ozone, MARGIN = 2, function(x) max(x, na.rm=TRUE))
  Ozone Solar.R    Wind    Temp   Month     Day 
   41.0   313.0    20.1    74.0     5.0     9.0 

Solution 5 - R

Another way would be to use ?pmax

do.call('pmax', c(as.data.frame(t(ozone)),na.rm=TRUE))
#[1]  41.0 313.0  20.1  74.0   5.0   9.0

Solution 6 - R

max(may$Ozone, na.rm = TRUE)

Without $Ozone, max() is computed over the whole data frame rather than over a single column. This is covered in the lessons of the swirl package.

I'm studying this course on Coursera too.
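
The difference can be seen on a small example (a four-row data frame made up here from values in the question; the names whole_max and ozone_max are just for illustration):

```r
# Four rows taken from the question's data
ozone <- data.frame(
  Ozone   = c(41, 36, 12, NA),
  Solar.R = c(190, 118, 149, 313)
)

# max() on the data frame looks at every cell in every column
whole_max <- max(ozone, na.rm = TRUE)

# max() on one column looks only at that column
ozone_max <- max(ozone$Ozone, na.rm = TRUE)

whole_max   # 313, which comes from Solar.R
ozone_max   # 41
```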

Solution 7 - R

The matrixStats package provides functions for column and row summaries (see the package [vignette][1]), but you have to convert your data.frame into a matrix first.

Then you run: colMaxs(as.matrix(ozone), na.rm = TRUE)

The na.rm = TRUE is needed here because the Ozone and Solar.R columns contain NA values.

[1]: https://cran.r-project.org/web/packages/matrixStats/vignettes/matrixStats-methods.html "matrixStats package Vignette"

Solution 8 - R

Assuming that your data is in a data.frame called maxinozone, you can do this to get the max of the first column (Ozone):

max(maxinozone[, 1], na.rm = TRUE)

Solution 9 - R

max(ozone$Ozone, na.rm = TRUE) should do the trick. Remember to include na.rm = TRUE, or else R will return NA whenever the column contains missing values.
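
A short sketch of how NA values affect max(), using a made-up vector rather than the full Ozone column:

```r
x <- c(41, 36, NA, 18)

# Any NA makes the result NA unless it is removed first
with_na    <- max(x)
without_na <- max(x, na.rm = TRUE)

# If every value is NA, nothing is left after removal: max() then
# returns -Inf and emits a warning (suppressed here)
all_missing <- suppressWarnings(max(c(NA_real_, NA_real_), na.rm = TRUE))

with_na       # NA
without_na    # 41
all_missing   # -Inf
```

The -Inf case is worth checking for when a column might be entirely missing.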

Solution 10 - R

Try this solution:

Oz <- subset(data, Month == 5, select = Ozone)  # select Ozone values for May (i.e. Month == 5)
summary(Oz)                                     # summary of the one-column table, including min, max, ...

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Alfonso Vergara | View Question on Stack Overflow
Solution 1 - R | Rich Scriven | View Answer on Stack Overflow
Solution 2 - R | WheresTheAnyKey | View Answer on Stack Overflow
Solution 3 - R | rrs | View Answer on Stack Overflow
Solution 4 - R | ccapizzano | View Answer on Stack Overflow
Solution 5 - R | akrun | View Answer on Stack Overflow
Solution 6 - R | markcodd | View Answer on Stack Overflow
Solution 7 - R | eddy85br | View Answer on Stack Overflow
Solution 8 - R | imharindersingh | View Answer on Stack Overflow
Solution 9 - R | WallyTaylor | View Answer on Stack Overflow
Solution 10 - R | S.ElBahloul | View Answer on Stack Overflow