How to round a data.frame in R that contains some character variables?

R Problem Overview


I have a data frame and want to round all of the numbers in it (ready for export). This should be straightforward, but I am having problems because some columns of the data frame are not numeric. For example, I want to round the figures to the nearest whole number in the example below:

ID = c("a","b","c","d","e")
Value1 = c("3.4","6.4","8.7","1.1","0.1")
Value2 = c("8.2","1.7","6.4","1.9","10.3")
df<-data.frame(ID,Value1,Value2)

Can anyone help me out? I can round individual columns (e.g., round(df$Value1, 2)), but I want to round a whole table that contains some non-numeric columns.

R Solutions


Solution 1 - R

I think the neatest way of doing this now is with dplyr:

library(dplyr)
df %>% 
 mutate_if(is.numeric, round)

This rounds all numeric columns in your data frame to the nearest whole number and leaves the other columns unchanged.
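In dplyr 1.0.0 and later, the scoped verbs such as mutate_if() are superseded by across(). The following is a minimal sketch of the same idea in that style; the data frame and the choice of one decimal place are made up for illustration and are not from the question.

```r
library(dplyr)

# Illustrative data (assumed): one character column, two numeric columns
df <- data.frame(
  ID = c("a", "b", "c"),
  Value1 = c(3.44, 6.46, 8.71),
  Value2 = c(8.21, 1.74, 6.38),
  stringsAsFactors = FALSE
)

# Round every numeric column to the nearest whole number
rounded0 <- df %>% mutate(across(where(is.numeric), round))

# Round to one decimal place by passing digits through a lambda
rounded1 <- df %>% mutate(across(where(is.numeric), ~ round(.x, digits = 1)))
```

The lambda form is only needed when round() takes extra arguments; the ID column passes through untouched in both cases.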

Solution 2 - R

Recognizing that this is an old question and one answer is accepted, I would like to offer another solution since the question appears as a top-ranked result on Google.

A more general solution is to create a separate function that searches for all numerical variables and rounds them to the specified number of digits:

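round_df <- function(df, digits) {
  # Logical vector marking which columns are numeric
  nums <- vapply(df, is.numeric, FUN.VALUE = logical(1))

  # Round only those columns; all others are left untouched
  df[,nums] <- round(df[,nums], digits = digits)

  df
}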

Once defined, you can use it as follows:

> round_df(df, digits=3)
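To make the call concrete, here is a self-contained sketch with made-up data. It restates the function as a slight variant that uses list-style indexing (df[nums]) with lapply(); the data values are assumptions chosen only for illustration.

```r
# Variant of round_df using list-style indexing and lapply()
round_df <- function(df, digits) {
  nums <- vapply(df, is.numeric, FUN.VALUE = logical(1))
  df[nums] <- lapply(df[nums], round, digits = digits)
  df
}

# Illustrative data (assumed, not from the question)
df <- data.frame(
  ID = c("a", "b", "c"),
  Value1 = c(3.14159, 2.71828, 1.41421),
  stringsAsFactors = FALSE
)

out <- round_df(df, digits = 3)
out
```

The ID column comes back unchanged, and Value1 is rounded to three decimal places.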

Solution 3 - R

First make sure your number columns are numeric:

ID = c("a","b","c","d","e")
Value1 = as.numeric(c("3.4","6.4","8.7","1.1","0.1"))
Value2 = as.numeric(c("8.2","1.7","6.4","1.9","10.3"))
df<-data.frame(ID,Value1,Value2, stringsAsFactors = FALSE)

Then, round only the numeric columns:

df[,-1] <-round(df[,-1],0) #the "-1" excludes column 1
df

  ID Value1 Value2
1  a      3      8
2  b      6      2
3  c      9      6
4  d      1      2
5  e      0     10

Solution 4 - R

I know this is a late reply, but I also had this same problem. After doing some searching I found this to be the most elegant solution:

data.frame(lapply(df, function(y) if(is.numeric(y)) round(y, 2) else y))

Solution originally from Jean V. Adams, Statistician, U.S. Geological Survey Great Lakes Science Center:

http://r.789695.n4.nabble.com/round-a-data-frame-containing-character-variables-td3732415.html

Solution 5 - R

Here is a one-liner that I like using. It applies the round function only to columns whose class is listed in the classes argument:

df2 <- rapply(object = df, f = round, classes = "numeric", how = "replace", digits = 0) 

Solution 6 - R

The other answers do not answer the OP's question exactly, because they assume example data that differs from what the OP provided.

If we read the question literally, we want a general solution that finds columns with digits in them (of any vector type), converts them to numeric, and then performs another numeric operation, such as rounding. We can use purrr::dmap and do it like this:

Here's the data as provided by the OP, where all cols are factors (an annoying default in R versions before 4.0.0, but we can deal with it):

ID = c("a","b","c","d","e")
Value1 = c("3.4","6.4","8.7","1.1","0.1")
Value2 = c("8.2","1.7","6.4","1.9","10.3")
df<-data.frame(ID,Value1,Value2)

str(df)
'data.frame':	5 obs. of  3 variables:
 $ ID    : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
 $ Value1: Factor w/ 5 levels "0.1","1.1","3.4",..: 3 4 5 2 1
 $ Value2: Factor w/ 5 levels "1.7","1.9","10.3",..: 5 1 4 2 3

We'll search for cols with digits in them, and make a dataframe of indices to mark the numerics:

library(dplyr)
library(purrr)

df_logical <- 
df %>% 
  dmap(function(i) grepl("[0-9]", i))

df_logical
     ID Value1 Value2
1 FALSE   TRUE   TRUE
2 FALSE   TRUE   TRUE
3 FALSE   TRUE   TRUE
4 FALSE   TRUE   TRUE
5 FALSE   TRUE   TRUE

str(df_logical)
'data.frame':	5 obs. of  3 variables:
 $ ID    : logi  FALSE FALSE FALSE FALSE FALSE
 $ Value1: logi  TRUE TRUE TRUE TRUE TRUE
 $ Value2: logi  TRUE TRUE TRUE TRUE TRUE

Then we can use these indices to select a subset of the cols in the original dataframe and convert them to numeric, and do other things also (in this case, rounding):

df_numerics <- 
map(1:ncol(df), function(i) ifelse(df_logical[,i], 
                                      as.numeric(as.character(df[,i])), 
                                      df[,i])) %>% 
  dmap(round, 0) %>% 
  setNames(names(df)) 

And we've got the desired result:

df_numerics
  ID Value1 Value2
1  1      3      8
2  2      6      2
3  3      9      6
4  4      1      2
5  5      0     10

str(df_numerics)
'data.frame':	5 obs. of  3 variables:
 $ ID    : num  1 2 3 4 5
 $ Value1: num  3 6 9 1 0
 $ Value2: num  8 2 6 2 10
 

This could be useful in the case of a dataframe with a large number of columns, and where we have many character/factor type cols full of digits that we want as numeric, but it's too tedious to do by hand.
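Since dmap() has been deprecated in purrr (as a later answer notes), here is a base R sketch of the same idea: convert a column to numeric only when every value parses as a number, then round. The helper name round_numeric_like() is hypothetical, and unlike the output above it leaves non-numeric columns such as ID as they are instead of turning them into integer codes.

```r
# Hypothetical helper: convert number-like character/factor columns, then round
round_numeric_like <- function(df, digits = 0) {
  df[] <- lapply(df, function(col) {
    if (is.numeric(col)) return(round(col, digits))
    # as.character() first, so factors yield their labels rather than level codes
    parsed <- suppressWarnings(as.numeric(as.character(col)))
    # Convert only if no value failed to parse (no new NAs were introduced)
    if (!any(is.na(parsed) & !is.na(col))) round(parsed, digits) else col
  })
  df
}

df <- data.frame(
  ID = c("a", "b", "c", "d", "e"),
  Value1 = c("3.4", "6.4", "8.7", "1.1", "0.1"),
  Value2 = c("8.2", "1.7", "6.4", "1.9", "10.3"),
  stringsAsFactors = TRUE  # reproduce the factor columns shown above
)

result <- round_numeric_like(df)
str(result)
```

Checking that parsing introduced no new NA values is stricter than the grepl("[0-9]", ...) test, which would also match mixed strings such as "a1".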

Solution 7 - R

The answers above point out a couple of stumbling blocks in the initial question that make it more complicated than just rounding multiple columns, primarily:

  1. Numbers were entered as characters, and
  2. data.frame() by default converts the character-numbers to factors (in R versions before 4.0.0)

The response by Ben details how to handle these issues, and applies purrr::dmap(). The purrr package has since been modified and the dmap function is deprecated (in favor of map_df()).
There is also a newer function, modify_if(), which can solve the problem of rounding multiple numeric columns, so I wanted to update this answer.


I'll enter the data as numbers, adding a few more digits to round to make the example more broadly applicable:

df <- data.frame(ID = c("a","b","c","d","e"), 
                 Value1 =c(3.4532897,6.41325,8.71235,1.115,0.115), 
                 Value2 = c(8.2125,1.71235,6.4135,1.915,10.3235))

Using the purrr::modify_if() function:

purrr::modify_if(df, ~is.numeric(.), ~round(., 0))

  ID Value1 Value2
1  a      3      8
2  b      6      2
3  c      9      6
4  d      1      2
5  e      0     10

Just change the digits argument of round() from 0 to the desired number of decimal places:

modify_if(df, ~is.numeric(.), ~round(., 2))
  ID Value1 Value2
1  a   3.45   8.21
2  b   6.41   1.71
3  c   8.71   6.41
4  d   1.12   1.92
5  e   0.12  10.32

See http://purrr.tidyverse.org/ for further documentation on the syntax.

This could also be done in two steps using base R apply functions, by creating an index for the columns (numVars) and then standard indexing to modify only those columns:

numVars <- sapply(df, is.numeric)
numVars
   ID Value1 Value2 
FALSE   TRUE   TRUE 

# List-style indexing (df[numVars]) also works when only one column is numeric
df[numVars] <- lapply(df[numVars], round, 0)
df
  ID Value1 Value2
1  a      3      8
2  b      6      2
3  c      9      6
4  d      1      2
5  e      0     10

Solution 8 - R

Note that some of the solutions proposed above do not preserve row names, so the row names are lost.

For example, try:

df <- data.frame(v1 = seq(1.11, 1.20, 0.01), v2 = letters[1:10])
row.names(df) = df$v2

and then, as suggested above, try:

data.frame( lapply(df, function(y) if(is.numeric(y)) round(y, 2) else y) ) 

Note that the row names are no longer there.

Akhmed's suggestion (Solution 2) keeps row names because it replaces columns in place rather than rebuilding the data frame.
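The difference can be seen side by side in the following sketch, which uses data similar to the example above. The variable names rebuilt and in_place are made up for illustration.

```r
df <- data.frame(v1 = seq(1.11, 1.20, length.out = 10), v2 = letters[1:10],
                 stringsAsFactors = FALSE)
row.names(df) <- df$v2

# Rebuilding from a list discards the row names
rebuilt <- data.frame(lapply(df, function(y) if (is.numeric(y)) round(y, 1) else y))

# Replacing columns in place keeps them
in_place <- df
nums <- vapply(in_place, is.numeric, logical(1))
in_place[nums] <- lapply(in_place[nums], round, 1)

row.names(rebuilt)   # "1" "2" ... "10"
row.names(in_place)  # "a" "b" ... "j"
```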

Solution 9 - R

Why don't you just use ID as the row names?

... and remove the quotation marks from the Value1 and Value2 data, so that they are stored as numbers.

Try this instead:

ID = c("a","b","c","d","e")
Value1 = c(3.4,6.4,8.7,1.1,0.1)
Value2 = c(8.2,1.7,6.4,1.9,10.3)

df<-data.frame(ID,Value1,Value2,row.names=1) # use column 1 (ID) as the row names

> df
  Value1 Value2
a    3.4    8.2
b    6.4    1.7
c    8.7    6.4
d    1.1    1.9
e    0.1   10.3

> str(df)
'data.frame':   5 obs. of  2 variables:
 $ Value1: num  3.4 6.4 8.7 1.1 0.1
 $ Value2: num  8.2 1.7 6.4 1.9 10.3

I am not sure what you want to do with the round, but you have some options in R:

?ceiling()
?floor()
?trunc()
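As a quick illustration of how these functions differ from round(), here is a small sketch on a made-up vector that includes a negative value, which is where floor() and trunc() diverge:

```r
x <- c(3.4, 8.7, -1.7)

r  <- round(x)    # nearest whole number:  3  9 -2
cl <- ceiling(x)  # round up:              4  9 -1
fl <- floor(x)    # round down:            3  8 -2
tr <- trunc(x)    # drop the fraction:     3  8 -1
```

Any of these can be applied to the numeric columns of the data frame above in the same way as round(), for example floor(df).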

Solution 10 - R

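Here is an alternative. This function makes it easy to specify the actual rounding function and accepts a separate digits value for each column. The function passed as func must take the number of digits as its second argument, as round and signif do: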

rounddf <- function(x, digits = rep(2, ncol(x)), func = round) {
  # Recycle a single digits value across all columns
  if (length(digits) == 1) {
    digits <- rep(digits, ncol(x))
  } else if (length(digits) != ncol(x)) {
    # Too few values: pad with the first value and warn
    digits <- c(digits, rep(digits[1], ncol(x) - length(digits)))
    warning('First value in digits repeated to match length.')
  }

  # Apply func only to columns whose class is "numeric"
  for (i in seq_len(ncol(x))) {
    if (class(x[, i])[1] == 'numeric') x[, i] <- func(x[, i], digits[i])
  }

  return(x)
}

It's posted (and sometimes updated) at https://github.com/sashahafner/jumbled

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type       Original Author
Question           KT_1
Solution 1 - R     user1165199
Solution 2 - R     akhmed
Solution 3 - R     Pierre Lapointe
Solution 4 - R     Ali
Solution 5 - R     trisaratops
Solution 6 - R     Ben
Solution 7 - R     Matt L.
Solution 8 - R     Rtist
Solution 9 - R     Gago-Silva
Solution 10 - R    sashahafner