Extract year from date

R

R Problem Overview


How can I remove the first elements from a variable, especially if this variable has a special characters. For example, I have the following column:

Date
01/01/2009
01/01/2010
01/01/2011
01/01/2012

I need to have a new column like the following:

Date
2009
2010
2011
2012

R Solutions


Solution 1 - R

As discussed in the comments, this can be achieved by converting the entry into Date format and extracting the year, for instance like this:

format(as.Date(df1$Date, format="%d/%m/%Y"),"%Y")

Solution 2 - R

Solution 3 - R

When you convert your variable to Date:

date <-  as.Date('10/30/2018','%m/%d/%Y')

you can then cut out the elements you want and make new variables, like year:

year <- as.numeric(format(date,'%Y'))

or month:

month <- as.numeric(format(date,'%m'))

Solution 4 - R

if all your dates are the same width, you can put the dates in a vector and use substring

Date
a <- c("01/01/2009", "01/01/2010" , "01/01/2011")
substring(a,7,10) #This takes string and only keeps the characters beginning in position 7 to position 10

output

[1] "2009" "2010" "2011"

Solution 5 - R

This is more advice than a specific answer, but my suggestion is to convert dates to date variables immediately, rather than keeping them as strings. This way you can use date (and time) functions on them, rather than trying to use very troublesome workarounds.

As pointed out, the lubridate package has nice extraction functions.

For some projects, I have found that piecing dates out from the start is helpful: create year, month, day (of month) and day (of week) variables to start with. This can simplify summaries, tables and graphs, because the extraction code is separate from the summary/table/graph code, and because if you need to change it, you don't have to roll out those changes in multiple spots.

Solution 6 - R

If you are using the date package, this can be done fairly easily.

library(date)
Date <- c("01/01/2009", "01/01/2010", "01/01/2011", "01/01/2012")
Date <- as.date(Date)
Date
# [1] 1Jan2009 1Jan2010 1Jan2011 1Jan2012
date.mdy(Date)$year
# [1] 2009 2010 2011 2012

## be aware that these are now integers and thus different methods may be invoked:
str(date.mdy(Date)$year)
# int [1:4] 2009 2010 2011 2012
summary(Date)
#     First      Last   
# "1Jan2009" "1Jan2012" 
summary(date.mdy(Date)$year)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#    2009    2010    2010    2010    2011    2012 

Solution 7 - R

For some time now, you can also only rely on the data.table package and its IDate class plus associated functions (Check ?as.IDate()). So, no need to additionally install lubridate.

require(data.table)

a <- c("01/01/2009", "01/01/2010" , "01/01/2011")
year(as.IDate(a, '%d/%m/%Y')) # all data.table functions

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionhbtf.1046View Question on Stackoverflow
Solution 1 - RRHertelView Answer on Stackoverflow
Solution 2 - RAjay OhriView Answer on Stackoverflow
Solution 3 - RinvictusView Answer on Stackoverflow
Solution 4 - RAlexanderView Answer on Stackoverflow
Solution 5 - RBarry DeCiccoView Answer on Stackoverflow
Solution 6 - Rgung - Reinstate MonicaView Answer on Stackoverflow
Solution 7 - RandscharView Answer on Stackoverflow