Replace characters from a column of a data frame R

RReplaceDataframe

R Problem Overview


I have a data frame

a <- runif (10)
b <- letters [1:10]
c <- c(rep ("A-B", 4), rep("A_C", 6))
data1 <- data.frame (a, b, c)
data1

And I wan to replace _ in A_C of column c for - to have a data frame like data2:

z <- c(rep ("A-B", 4), rep("A-C", 6))
data2 <- data.frame (a, b, z)
data2

Do you know how I can do that?

R Solutions


Solution 1 - R

Use gsub:

data1$c <- gsub('_', '-', data1$c)
data1

            a b   c
1  0.34597094 a A-B
2  0.92791908 b A-B
3  0.30168772 c A-B
4  0.46692738 d A-B
5  0.86853784 e A-C
6  0.11447618 f A-C
7  0.36508645 g A-C
8  0.09658292 h A-C
9  0.71661842 i A-C
10 0.20064575 j A-C

Solution 2 - R

If your variable data1$c is a factor, it's more efficient to change the labels of the factor levels than to create a new vector of characters:

levels(data1$c) <- sub("_", "-", levels(data1$c))


            a b   c
1  0.73945260 a A-B
2  0.75998815 b A-B
3  0.19576725 c A-B
4  0.85932140 d A-B
5  0.80717115 e A-C
6  0.09101492 f A-C
7  0.10183586 g A-C
8  0.97742424 h A-C
9  0.21364521 i A-C
10 0.02389782 j A-C

Solution 3 - R

You can use the stringr library:

library('stringr')

a <- runif(10)
b <- letters[1:10]
c <- c(rep('A-B', 4), rep('A_B', 6))
data <- data.frame(a, b, c)

data

#             a b   c
# 1  0.19426707 a A-B
# 2  0.12902673 b A-B
# 3  0.78324955 c A-B
# 4  0.06469028 d A-B
# 5  0.34752264 e A_C
# 6  0.55313288 f A_C
# 7  0.31264280 g A_C
# 8  0.33759921 h A_C
# 9  0.72322599 i A_C
# 10 0.25223075 j A_C

data$c <- str_replace_all(data$c, '_', '-')

data

#             a b   c
# 1  0.19426707 a A-B
# 2  0.12902673 b A-B
# 3  0.78324955 c A-B
# 4  0.06469028 d A-B
# 5  0.34752264 e A-C
# 6  0.55313288 f A-C
# 7  0.31264280 g A-C
# 8  0.33759921 h A-C
# 9  0.72322599 i A-C
# 10 0.25223075 j A-C

Note that this does change factored variables into character.

Solution 4 - R

chartr is also convenient for these types of substitutions:

chartr("_", "-", data1$c)
#  [1] "A-B" "A-B" "A-B" "A-B" "A-C" "A-C" "A-C" "A-C" "A-C" "A-C"

Thus, you can just do:

data1$c <- chartr("_", "-", data1$c)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAEMView Question on Stackoverflow
Solution 1 - RtonytonovView Answer on Stackoverflow
Solution 2 - RSven HohensteinView Answer on Stackoverflow
Solution 3 - RWernerView Answer on Stackoverflow
Solution 4 - RA5C1D2H2I1M1N2O1R2T1View Answer on Stackoverflow