Remove part of string after "."

R Problem Overview


I am working with NCBI Reference Sequence accession numbers, such as those in the variable a:

a <- c("NM_020506.1","NM_020519.1","NM_001030297.2","NM_010281.2","NM_011419.3", "NM_053155.2")  

To get information from the biomaRt package I need to remove the version suffix (.1, .2, etc.) from the accession numbers. I normally do this with this code:

b <- sub("..*", "", a)
 
# [1] "" "" "" "" "" ""

But as you can see, this isn't the correct way for this variable. Can anyone help me with this?

R Solutions


Solution 1 - R

You just need to escape the period:

a <- c("NM_020506.1","NM_020519.1","NM_001030297.2","NM_010281.2","NM_011419.3", "NM_053155.2")

gsub("\\..*","",a)
# [1] "NM_020506"    "NM_020519"    "NM_001030297" "NM_010281"    "NM_011419"    "NM_053155"
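To see why the escape matters, here is a small sketch. The vector x is just a shortened stand-in for a. An unescaped "." is a regex wildcard that matches any character, so "..*" matches the whole string starting at the first character.

```r
x <- c("NM_020506.1", "NM_001030297.2")

# Unescaped: "." matches any character, so the entire string is removed
sub("..*", "", x)
# [1] "" ""

# Escaped: "\\." matches a literal period, so only the suffix is removed
sub("\\..*", "", x)
# [1] "NM_020506"    "NM_001030297"

# A character class is an equivalent way to write a literal period
sub("[.].*", "", x)
# [1] "NM_020506"    "NM_001030297"
```

Since there is at most one match per string here, sub and gsub give the same result.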

Solution 2 - R

We can pretend they are filenames and remove extensions:

tools::file_path_sans_ext(a)
# [1] "NM_020506"    "NM_020519"    "NM_001030297" "NM_010281"    "NM_011419"    "NM_053155"
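One caveat, shown as a sketch: file_path_sans_ext strips only the last extension, so it can differ from the regex approach when a string contains more than one period. The input below is a made-up example, not a real accession number.

```r
x <- "NM_020506.1.2"  # hypothetical input with two periods

# Strips only the final ".2"
tools::file_path_sans_ext(x)
# [1] "NM_020506.1"

# Strips everything from the first period onward
sub("\\..*", "", x)
# [1] "NM_020506"
```

For accession numbers with a single version suffix, as in a, both give the same answer.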

Solution 3 - R

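You could remove a trailing period followed by digits:

sub("\\.[0-9]+$", "", a)

or, if the version suffix is always a single digit, drop the last two characters:

library(stringr)
str_sub(a, start = 1, end = -3)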
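A quick sketch of how these options behave if a version number ever reaches two digits. The ".12" suffix is a hypothetical input, and substr with nchar is used here as a base R stand-in for the str_sub call so the example needs no packages.

```r
x <- c("NM_020506.1", "NM_020506.12")  # second value is hypothetical

# A pattern that matches only one digit after the period leaves a stray digit
sub("\\.[0-9]", "", x)
# [1] "NM_020506"  "NM_0205062"

# Dropping a fixed two characters leaves the period behind
substr(x, 1, nchar(x) - 2)
# [1] "NM_020506"  "NM_020506."

# Matching one or more digits up to the end of the string handles both
sub("\\.[0-9]+$", "", x)
# [1] "NM_020506" "NM_020506"
```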

Solution 4 - R

If every string had the same fixed length, substr from base R alone would be enough. Since the lengths vary here, we can get the position of the . with regexpr and use that in substr:

substr(a, 1, regexpr("\\.", a)-1)
#[1] "NM_020506"    "NM_020519"    "NM_001030297" "NM_010281"    "NM_011419"    "NM_053155"   
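Note that regexpr returns -1 when there is no match, so a string without a period would come back empty. A minimal sketch of guarding against that, assuming such strings should be kept unchanged (the second value in x is a hypothetical input without a version suffix):

```r
x <- c("NM_020506.1", "NM_020519")  # second value has no period

# fixed = TRUE treats "." as a literal character, so no escaping is needed
pos <- regexpr(".", x, fixed = TRUE)

# Without a guard, the unmatched string becomes ""
substr(x, 1, pos - 1)
# [1] "NM_020506" ""

# Only truncate the strings where a period was actually found
has_dot <- pos > 0
out <- x
out[has_dot] <- substr(x[has_dot], 1, pos[has_dot] - 1)
out
# [1] "NM_020506" "NM_020519"
```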

Solution 5 - R

We can use a lookahead regex to extract the part of each string before the ".":

library(stringr)

str_extract(a, ".*(?=\\.)")
[1] "NM_020506"    "NM_020519"    "NM_001030297" "NM_010281"   
[5] "NM_011419"    "NM_053155"   

Solution 6 - R

Another option is to use str_split from stringr:

library(stringr)
str_split(a, "\\.", simplify = TRUE)[, 1]
[1] "NM_020506"    "NM_020519"    "NM_001030297" "NM_010281"    "NM_011419"    "NM_053155"   
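If you would rather avoid the stringr dependency, the same split-and-take-first idea can be sketched in base R with strsplit. This is an alternative, not what the answer above uses, and x is a shortened stand-in for a.

```r
x <- c("NM_020506.1", "NM_001030297.2")

# strsplit returns a list with one character vector per input string;
# fixed = TRUE splits on a literal period rather than a regex
parts <- strsplit(x, ".", fixed = TRUE)

# Take the first piece of each element
vapply(parts, `[`, character(1), 1)
# [1] "NM_020506"    "NM_001030297"
```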

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Lisann | View Question on Stackoverflow
Solution 1 - R | Hansi | View Answer on Stackoverflow
Solution 2 - R | zx8754 | View Answer on Stackoverflow
Solution 3 - R | johannes | View Answer on Stackoverflow
Solution 4 - R | akrun | View Answer on Stackoverflow
Solution 5 - R | benson23 | View Answer on Stackoverflow
Solution 6 - R | user438383 | View Answer on Stackoverflow