R tick data : merging date and time into a single object
RStatisticsTime SeriesFinanceR Problem Overview
I'm currently working in tick data with R and I would like to merge date and time into a single object as I need to get a precise time object to compute some statistics on my data. Here is how my data looks like:
date time price flag exchange
2 XXH10 2010-02-02 08:00:03 2787 1824 E
3 XXH10 2010-02-02 08:00:04 2786 3 E
4 XXH10 2010-02-02 08:00:04 2787 6 E
5 XXH10 2010-02-02 08:00:04 2787 1 E
6 XXH10 2010-02-02 08:00:04 2787 1 E
Basically, I would like to merge the columns "date" and "time" into a single one.
R Solutions
Solution 1 - R
Create a datetime
object with as.POSIXct
:
as.POSIXct(paste(x$date, x$time), format="%Y-%m-%d %H:%M:%S")
[1] "2010-02-02 08:00:03 GMT" "2010-02-02 08:00:04 GMT" "2010-02-02 08:00:04 GMT"
[4] "2010-02-02 08:00:04 GMT" "2010-02-02 08:00:04 GMT"
Solution 2 - R
Of course, more elegant solution (arguably) is possible with extra package. When working with dates it's lubridate package:
library(lubridate)
with(x, ymd(date) + hms(time))
should produce POSIXlt vector.
UPDATE:
There is another solution using general purpose date and time conversion package anytime
(based on C++ library Boost date_time):
library(anytime)
with(x, anytime(paste(date, time)))
Indeed, comparing anytime
with both base R and lubridate
(deservedly considered rather slow - see https://stackoverflow.com/q/46046714/59470) C++ (anytime
) wins:
x = read.csv(text = 'date,time
2010-02-02,08:00:03
2010-02-02,08:00:04
2010-02-02,08:00:04
2010-02-03,08:00:04
2010-02-04,08:00:05
2010-02-04,08:00:05
2010-02-04,08:00:06
2010-02-04,08:00:07
2010-02-04,08:00:08
2010-02-04,08:00:14')
microbenchmark::microbenchmark(
base = with(x, as.POSIXct(paste(date, time), format="%Y-%m-%d %H:%M:%S")),
anytime = with(x, anytime::anytime(paste(date, time))),
lubri = with(x, lubridate::ymd(date) + lubridate::hms(time)),
times = 1000L
)
> Unit: microseconds > expr min lq mean median uq max neval > base 71.163 91.2555 104.38747 104.785 112.1185 256.997 1000 > anytime 40.508 52.5385 63.46973 61.843 68.5730 221.076 1000 > lubri 1596.490 1850.4400 2235.34254 1909.588 2033.096 110751.622 1000