Add a prefix to column names

Tags: R, Dataframe

R Problem Overview


According to the following help file entry, it should be possible to add a prefix to the column names:

colnames(x, do.NULL = TRUE, prefix = "col")

The following doesn't work for me. What am I doing wrong here?

m2 <- cbind(1,1:4)
colnames(m2, do.NULL = FALSE)
colnames(m2) <- c("x","Y")
colnames(m2) <- colnames(m2, prefix = "Sub_")
colnames(m2)

R Solutions


Solution 1 - R

You have misread the help file. Here's the argument to look at:

do.NULL: logical. If FALSE and names are NULL, names are created.

Notice the "and" in that description. Your names are no longer NULL, so the prefix argument is ignored.

Instead, use something like this:

> m2 <- cbind(1,1:4)
> colnames(m2) <- c("x","Y")
> colnames(m2) <- paste("Sub", colnames(m2), sep = "_")
> m2
     Sub_x Sub_Y
[1,]     1     1
[2,]     1     2
[3,]     1     3
[4,]     1     4
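
To see the "and" condition in action, here is a minimal sketch using the same matrix as in the question. It only illustrates when base R's colnames() consults prefix; the variable names generated and unchanged are made up for this example.

# cbind() of unnamed arguments yields a matrix without column names.
m2 <- cbind(1, 1:4)
is.null(colnames(m2))
#> [1] TRUE

# Names are NULL and do.NULL = FALSE, so names are generated from the prefix.
generated <- colnames(m2, do.NULL = FALSE, prefix = "Sub_")
generated
#> [1] "Sub_1" "Sub_2"

# Once real names exist, prefix is ignored and the existing names come back.
colnames(m2) <- c("x", "Y")
unchanged <- colnames(m2, do.NULL = FALSE, prefix = "Sub_")
unchanged
#> [1] "x" "Y"

In other words, prefix only feeds the auto-generated names (prefix followed by the column index); it never modifies names that are already set.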

Solution 2 - R

Here is a tidyverse approach, which can add either a prefix or a suffix to all column names. The following examples rename the columns in a dplyr pipe.

dplyr 1.0.2 and beyond
library(dplyr)
df <- data.frame(x = c(1, 2), y = c(3, 4))

## Adding prefixes
df %>% rename_with( ~ paste0("a", .x))

## Adding suffixes
df %>% rename_with( ~ paste0(.x, "a"))

If you want a separator such as an underscore, use paste() with the sep argument instead of paste0().
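
As a rough sketch of the separator variant, using the same df as above (the strings "Sub" and "old" are arbitrary placeholders chosen for this example):

library(dplyr)

df <- data.frame(x = c(1, 2), y = c(3, 4))

## Prefix joined with an underscore
prefixed <- df %>% rename_with(~ paste("Sub", .x, sep = "_"))
names(prefixed)
#> [1] "Sub_x" "Sub_y"

## Suffix joined with an underscore
suffixed <- df %>% rename_with(~ paste(.x, "old", sep = "_"))
names(suffixed)
#> [1] "x_old" "y_old"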


Before dplyr 1.0.2 update
library(dplyr)
df <- data.frame(x = c(1, 2), y = c(3, 4))
df %>% rename_all( ~ paste0("a", .x))

Adding a suffix is simpler, because rename_all() passes extra arguments on to the function:

df %>% rename_all(paste0, "a")

Solution 3 - R

The updated tidyverse method (with dplyr 1.0.2) uses rename_with() as the rename_all() function has been superseded.

library(dplyr)
iris %>% rename_with( ~ paste("Sub", .x, sep = "_"))

Solution 4 - R

The stats::setNames function also works well for this. In the benchmark below it is marginally faster than base assignment and much faster than the dplyr and data.table alternatives.


iris.dt <- data.table::as.data.table(iris)

microbenchmark::microbenchmark(
  
  base = colnames(iris) <- paste("Sub", colnames(iris), sep = "_"),
  stats = setNames(iris, paste("Sub", colnames(iris), sep = "_")), 
  dplyr = dplyr::rename_with(iris, ~ paste("Sub", .x, sep = "_")),
  datatable = data.table::setnames(iris.dt, paste("Sub", names(iris.dt), sep = "_"))
  
)
#> Unit: microseconds
#>       expr     min       lq       mean   median      uq        max neval cld
#>       base  11.094  16.2140   21.62408  19.2010  23.681     65.707   100   a
#>      stats   8.107  13.8670   17.40435  16.6405  19.841     39.254   100   a
#>      dplyr 786.772 842.8785 5236.67222 877.0130 984.959 402378.407   100   a
#>  datatable  40.961  49.9200   84.06237  62.2935  73.600    834.560   100   a

Created on 2020-10-21 by the reprex package (v0.3.0)
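
Outside of a benchmark, the setNames() call might be used as in the sketch below. The variable name prefixed is made up for this example; the point is that setNames() returns a renamed copy rather than modifying its input.

# Build the new names and attach them to a copy of iris.
prefixed <- setNames(iris, paste("Sub", names(iris), sep = "_"))
names(prefixed)
#> [1] "Sub_Sepal.Length" "Sub_Sepal.Width"  "Sub_Petal.Length" "Sub_Petal.Width"
#> [5] "Sub_Species"

# The original data frame keeps its names.
names(iris)
#> [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

Because nothing is modified in place, this form also fits naturally at the end of a pipe or inside a function that should not alter its argument.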

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Jonas Tundo | View Question on Stackoverflow
Solution 1 - R | A5C1D2H2I1M1N2O1R2T1 | View Answer on Stackoverflow
Solution 2 - R | Kim | View Answer on Stackoverflow
Solution 3 - R | Ross Ireland | View Answer on Stackoverflow
Solution 4 - R | JWilliman | View Answer on Stackoverflow