How do I properly set the Datetimeindex for a Pandas datetime object in a dataframe?

PythonDatetimePandas

Python Problem Overview


I have a pandas dataframe:

	lat	        lng	        alt	days	          date	      time
0	40.003834	116.321462	211	39745.175405	  2008-10-24  04:12:35
1	40.003783	116.321431	201	39745.175463  2008-10-24	  04:12:40
2	40.003690	116.321429	203	39745.175521	  2008-10-24	  04:12:45
3	40.003589	116.321427	194	39745.175579	  2008-10-24	  04:12:50
4	40.003522	116.321412	190	39745.175637	  2008-10-24	  04:12:55
5	40.003509	116.321484	188	39745.175694	  2008-10-24	  04:13:00

For which I am trying to convert the df['date'] and df['time'] columns into a datetime. I can do:

df['Datetime'] = pd.to_datetime(df['date']+df['time'])
df = df.set_index(['Datetime'])
del df['date']
del df['time']

And I get:

	                lat	        lng	        alt	days
Datetime							
2008-10-2404:12:35	40.003834	116.321462	211	39745.175405	
2008-10-2404:12:40	40.003783	116.321431	201	39745.175463
2008-10-2404:12:45	40.003690	116.321429	203	39745.175521	
2008-10-2404:12:50	40.003589	116.321427	194	39745.175579	
2008-10-2404:12:55	40.003522	116.321412	190	39745.175637

But then if I try:

df.between_time(time(1),time(22,59,59))['lng'].std()

I get an error - 'TypeError: Index must be DatetimeIndex'

So, I've also tried setting the DatetimeIndex:

df['Datetime'] = pd.to_datetime(df['date']+df['time'])
#df = df.set_index(['Datetime'])
df = df.set_index(pd.DatetimeIndex(df['Datetime']))
del df['date']
del df['time']

And this throws an error also - 'DateParseError: unknown string format'

How do I create the datetime column and DatetimeIndex correctly so that df.between_time() works right?

Python Solutions


Solution 1 - Python

To simplify Kirubaharan's answer a bit:

df['Datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'])
df = df.set_index('Datetime')

And to get rid of unwanted columns (as OP did but did not specify per se in the question):

df = df.drop(['date','time'], axis=1)

Solution 2 - Python

You are not creating datetime index properly,

format = '%Y-%m-%d %H:%M:%S'
df['Datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'], format=format)
df = df.set_index(pd.DatetimeIndex(df['Datetime']))

Solution 3 - Python

This worked best for me:

format = '%Y-%m-%d%H:%M:%S'
df['Datetime'] = pd.to_datetime(df['date'] + df['time'].astype("string"), format=format)

In some cases Python treats df['date'] as column of integers.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser3654387View Question on Stackoverflow
Solution 1 - PythonKracitView Answer on Stackoverflow
Solution 2 - PythonKirubaharan JView Answer on Stackoverflow
Solution 3 - PythonUnreal QwView Answer on Stackoverflow