Selecting only numeric columns from a data frame


R Problem Overview


Suppose you have a data.frame like this:

x <- data.frame(v1=1:20,v2=1:20,v3=1:20,v4=letters[1:20])

How would you select only those columns in x that are numeric?

R Solutions


Solution 1 - R

EDIT: updated to avoid use of ill-advised sapply.

Since a data frame is a list, we can use the list-apply functions:

nums <- unlist(lapply(x, is.numeric), use.names = FALSE)  

Then use standard subsetting:

x[ , nums]

## don't use sapply, even though it's less code
## nums <- sapply(x, is.numeric)
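
One reason sapply() is discouraged here is that its return type depends on the input. The sketch below is an illustration rather than part of the original answer: it uses vapply() as a type-stable way to build the same logical index, and adds drop = FALSE so the result stays a data frame even when only one column matches.

```r
# Example data from the question: three integer columns and one letter column
x <- data.frame(v1 = 1:20, v2 = 1:20, v3 = 1:20, v4 = letters[1:20])

# vapply() is told to expect exactly one logical value per column
nums <- vapply(x, is.numeric, logical(1))

# drop = FALSE keeps a data frame even if a single column is selected
numeric_only <- x[, nums, drop = FALSE]

# Edge case: a data frame with zero columns
empty <- data.frame()
sapply_result <- sapply(empty, is.numeric)              # an empty list
vapply_result <- vapply(empty, is.numeric, logical(1))  # an empty logical vector
```

The zero-column case is where the two differ: an empty list is awkward to use as a subsetting index, while an empty logical vector behaves as expected.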

For more idiomatic modern R, I'd now recommend:

x[ , purrr::map_lgl(x, is.numeric)]

The following is less code, depends less on R's particular quirks, is more straightforward, and is robust to use on database-backed tibbles:

dplyr::select_if(x, is.numeric)

Newer versions of dplyr (1.0.0 and later) also support the following syntax:

x %>% dplyr::select(where(is.numeric))

Solution 2 - R

The dplyr package's select_if() function is an elegant solution:

library("dplyr")
select_if(x, is.numeric)

Solution 3 - R

Filter() from the base package is a perfect fit for this use case. You simply write:

Filter(is.numeric, x)

It is also much faster than select_if():

library(microbenchmark)
microbenchmark(
    dplyr::select_if(mtcars, is.numeric),
    Filter(is.numeric, mtcars)
)

On my computer this returns a median of 60 microseconds for Filter() and 21,000 microseconds for select_if(), so Filter() is about 350 times faster.
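
To make the behaviour of Filter() concrete, here is a small sketch using the data frame from the question. The second data frame (with the made-up columns id and label) is only there to show that the result stays a data frame when a single column matches; Negate() is base R and flips the predicate.

```r
x <- data.frame(v1 = 1:20, v2 = 1:20, v3 = 1:20, v4 = letters[1:20])

# Keep the columns for which is.numeric() returns TRUE
numeric_only <- Filter(is.numeric, x)

# Negate() inverts the predicate, giving the non-numeric columns instead
non_numeric <- Filter(Negate(is.numeric), x)

# Hypothetical data with one numeric column: the result is still a data frame
single <- Filter(is.numeric, data.frame(id = 1:3, label = c("a", "b", "c")))
```

Filter() subsets with list-style indexing, so there is no need for drop = FALSE here.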

Solution 4 - R

In case you are interested only in the column names, use this (where train is your data frame):

names(dplyr::select_if(train,is.numeric))
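
If you would rather not load dplyr just to get the names, a base R sketch along these lines should give the same result. It is shown on the question's x rather than on train, which is not defined in this thread.

```r
x <- data.frame(v1 = 1:20, v2 = 1:20, v3 = 1:20, v4 = letters[1:20])

# Index the column names with a logical vector, one entry per column
numeric_names <- names(x)[vapply(x, is.numeric, logical(1))]
```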

Solution 5 - R

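This is an alternative to the other answers. Note that comparing against the class "numeric" matches only double columns; integer columns such as 1:20 have class "integer" and are skipped:

x[, sapply(x, class) == "numeric"]

To include integer columns as well, test against both classes:

x[, sapply(x, class) %in% c("numeric", "integer")]

With a data.table: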

x[, lapply(x, is.numeric) == TRUE, with = FALSE]
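
The difference between comparing class names and calling is.numeric() is easy to miss, so here is a small sketch on the question's data frame. Taking only the first class of each column (class(col)[1]) is a choice made here so the comparison also works for columns with several classes.

```r
x <- data.frame(v1 = 1:20, v2 = 1:20, v3 = 1:20, v4 = letters[1:20])

# First class of each column; v1 to v3 are "integer", not "numeric"
col_classes <- vapply(x, function(col) class(col)[1], character(1))

# Comparing against "numeric" alone matches no column in this data frame
by_class <- x[, col_classes == "numeric", drop = FALSE]

# is.numeric() is TRUE for both integer and double columns
by_predicate <- x[, vapply(x, is.numeric, logical(1)), drop = FALSE]
```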

Solution 6 - R

iris %>% dplyr::select(where(is.numeric)) #as per most recent updates

Another option with purrr is to pass the negated predicate to the discard function:

iris %>% purrr::discard(~!is.numeric(.))

If you want the names of the numeric columns, you can add names or colnames:

iris %>% purrr::discard(~!is.numeric(.)) %>% names

Solution 7 - R

library(purrr)
x <- x %>% keep(is.numeric)

Solution 8 - R

The PCAmixdata package has a function, splitmix, that splits a given data frame "YourDataframe" into its quantitative (numerical) and qualitative (categorical) data, as shown below:

install.packages("PCAmixdata")
library(PCAmixdata)
parts <- splitmix(YourDataframe)
X1 <- parts$X.quanti  # numerical columns in the dataset
X2 <- parts$X.quali   # categorical columns in the dataset
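
If installing a package for this feels heavy, the same kind of split can be approximated in base R. This is only a sketch of the idea, not how splitmix is implemented, and the data frame with its columns height, age and group is made up for illustration.

```r
# Hypothetical example data: two numeric columns and one text column
df <- data.frame(height = c(1.7, 1.8, 1.6),
                 age    = c(30L, 41L, 25L),
                 group  = c("a", "b", "a"))

# One TRUE/FALSE per column
is_num <- vapply(df, is.numeric, logical(1))

numeric_part <- df[is_num]   # numerical columns
other_part   <- df[!is_num]  # everything else
```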

Solution 9 - R

If you have many factor variables, you can use the select_if function from the dplyr package (install it first if needed). dplyr has several functions that select data satisfying a condition, and you can set the condition yourself, for example is.factor instead of is.numeric.

Use it like this:

categorical <- select_if(df, is.factor)
str(categorical)
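
One thing to watch for: since R 4.0.0, data.frame() no longer converts strings to factors by default, so text columns are character and is.factor will not pick them up. The sketch below uses base R's Filter() and a made-up data frame (size, colour, note) to show both cases.

```r
# Hypothetical data: one numeric, one factor and one character column
df <- data.frame(size   = c(1.5, 2.5, 3.5),
                 colour = factor(c("red", "blue", "red")),
                 note   = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

# Only true factor columns
categorical <- Filter(is.factor, df)
str(categorical)

# Factor or character columns
categorical_or_text <- Filter(function(col) is.factor(col) || is.character(col), df)
```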

Solution 10 - R

Another way could be as follows:

# Extract the numeric columns from the iris dataset
(iris[sapply(iris, is.numeric)])

Solution 11 - R

Numerical_variables <- which(sapply(df, is.numeric))
# then extract column names 
Names <- names(Numerical_variables)
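
It may help to see what which() returns here: a named integer vector of column positions, which can be used for subsetting directly. A small sketch, with a made-up data frame and vapply() swapped in for sapply():

```r
# Hypothetical data: numeric columns at positions 1 and 3
df <- data.frame(id = 1:3, name = c("a", "b", "c"), score = c(0.5, 0.7, 0.9))

# Positions of the numeric columns, named after the columns
numeric_positions <- which(vapply(df, is.numeric, logical(1)))

# Select by position
numeric_only <- df[numeric_positions]
```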

Solution 12 - R

This doesn't directly answer the question but can be very useful, especially if you want something like all the numeric columns except for your id column and dependent variable.

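library(magrittr)  # provides %>% and the assignment pipe %<>%

numeric_cols <- sapply(dataframe, is.numeric) %>% which %>% 
                   names %>% setdiff(., c("id_variable", "dep_var"))

# your_function is a placeholder for the transformation you want to apply
dataframe %<>% dplyr::mutate_at(numeric_cols, function(x) your_function(x))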
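
The same idea can be sketched in base R without the pipes. The data frame below and the center function are invented for illustration; center simply subtracts the column mean and stands in for your_function.

```r
# Hypothetical data with an id, a dependent variable, two predictors and a label
dataframe <- data.frame(id_variable = 1:4,
                        dep_var     = c(10, 20, 30, 40),
                        a           = c(1, 2, 3, 4),
                        b           = c(2, 4, 6, 8),
                        label       = c("w", "x", "y", "z"))

# Numeric column names, minus the ones to leave untouched
numeric_cols <- setdiff(names(Filter(is.numeric, dataframe)),
                        c("id_variable", "dep_var"))

# Placeholder transformation: subtract the column mean
center <- function(col) col - mean(col)

# Replace only the selected columns, leaving the rest as they are
dataframe[numeric_cols] <- lapply(dataframe[numeric_cols], center)
```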

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Brandon Bertelsen | View Question on Stackoverflow
Solution 1 - R | mdsumner | View Answer on Stackoverflow
Solution 2 - R | Sharon | View Answer on Stackoverflow
Solution 3 - R | Kevin Zarca | View Answer on Stackoverflow
Solution 4 - R | user3065757 | View Answer on Stackoverflow
Solution 5 - R | Enrique Pérez Herrero | View Answer on Stackoverflow
Solution 6 - R | AlexB | View Answer on Stackoverflow
Solution 7 - R | Yash Khokale | View Answer on Stackoverflow
Solution 8 - R | user1 | View Answer on Stackoverflow
Solution 9 - R | 서영재 | View Answer on Stackoverflow
Solution 10 - R | Ayushi | View Answer on Stackoverflow
Solution 11 - R | Mohamed Ali Hefnawy | View Answer on Stackoverflow
Solution 12 - R | RJMCMC | View Answer on Stackoverflow