R tick data : merging date and time into a single object

RStatisticsTime SeriesFinance

R Problem Overview


I'm currently working in tick data with R and I would like to merge date and time into a single object as I need to get a precise time object to compute some statistics on my data. Here is how my data looks like:

               date       time      price flag    exchange
2   XXH10   2010-02-02   08:00:03   2787 1824        E
3   XXH10   2010-02-02   08:00:04   2786    3        E
4   XXH10   2010-02-02   08:00:04   2787    6        E
5   XXH10   2010-02-02   08:00:04   2787    1        E
6   XXH10   2010-02-02   08:00:04   2787    1        E

Basically, I would like to merge the columns "date" and "time" into a single one.

R Solutions


Solution 1 - R

Create a datetime object with as.POSIXct:

as.POSIXct(paste(x$date, x$time), format="%Y-%m-%d %H:%M:%S")
[1] "2010-02-02 08:00:03 GMT" "2010-02-02 08:00:04 GMT" "2010-02-02 08:00:04 GMT"
[4] "2010-02-02 08:00:04 GMT" "2010-02-02 08:00:04 GMT"

Solution 2 - R

Of course, more elegant solution (arguably) is possible with extra package. When working with dates it's lubridate package:

library(lubridate)

with(x, ymd(date) + hms(time))

should produce POSIXlt vector.

UPDATE:

There is another solution using general purpose date and time conversion package anytime (based on C++ library Boost date_time):

 library(anytime)

 with(x, anytime(paste(date, time)))

Indeed, comparing anytime with both base R and lubridate (deservedly considered rather slow - see https://stackoverflow.com/q/46046714/59470) C++ (anytime) wins:

 x = read.csv(text = 'date,time
2010-02-02,08:00:03
2010-02-02,08:00:04
2010-02-02,08:00:04
2010-02-03,08:00:04
2010-02-04,08:00:05
2010-02-04,08:00:05
2010-02-04,08:00:06
2010-02-04,08:00:07
2010-02-04,08:00:08
2010-02-04,08:00:14')

 microbenchmark::microbenchmark(
   base = with(x, as.POSIXct(paste(date, time), format="%Y-%m-%d %H:%M:%S")),
   anytime = with(x, anytime::anytime(paste(date, time))),
   lubri = with(x, lubridate::ymd(date) + lubridate::hms(time)),
   times = 1000L
)

> Unit: microseconds > expr min lq mean median uq max neval > base 71.163 91.2555 104.38747 104.785 112.1185 256.997 1000 > anytime 40.508 52.5385 63.46973 61.843 68.5730 221.076 1000 > lubri 1596.490 1850.4400 2235.34254 1909.588 2033.096 110751.622 1000

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionmarino89View Question on Stackoverflow
Solution 1 - RAndrieView Answer on Stackoverflow
Solution 2 - RtopchefView Answer on Stackoverflow