Pandas date_range to generate monthly data at beginning of the month

PythonPandasDate Range

Python Problem Overview


I'm trying to generate a date range of monthly data where the day is always at the beginning of the month:

pd.date_range(start='1/1/1980', end='11/1/1991', freq='M')

This generates 1/31/1980, 2/29/1980, and so on. Instead, I just want 1/1/1980, 2/1/1980,...

I've seen other question ask about generating data that is always on a specific day of the month, with answers saying it wasn't possible, but beginning of month surely must be possible!

Python Solutions


Solution 1 - Python

You can do this by changing the freq argument from 'M' to 'MS':

d = pandas.date_range(start='1/1/1980', end='11/1/1990', freq='MS')    
print(d)

This should now print:

DatetimeIndex(['1980-01-01', '1980-02-01', '1980-03-01', '1980-04-01',               '1980-05-01', '1980-06-01', '1980-07-01', '1980-08-01',               '1980-09-01', '1980-10-01',                ...               '1990-02-01', '1990-03-01', '1990-04-01', '1990-05-01',               '1990-06-01', '1990-07-01', '1990-08-01', '1990-09-01',               '1990-10-01', '1990-11-01'],
              dtype='datetime64[ns]', length=131, freq='MS', tz=None)

Look into the offset aliases part of the documentation. There it states that 'M' is for the end of the month (month end frequency) while 'MS' for the beginning (month start frequency).

Solution 2 - Python

It is worth noting that the 'MS' option of pandas.date_range() suggested by Dimitris makes the range start at the beginning of the next month, which may not be expected :

start = "2020-03-08"
end = "2021-03-08"
pd.date_range(start, end, freq='MS')

results in

DatetimeIndex(['2020-04-01', '2020-05-01', '2020-06-01', '2020-07-01',
           '2020-08-01', '2020-09-01', '2020-10-01', '2020-11-01',
           '2020-12-01', '2021-01-01', '2021-02-01', '2021-03-01'],
          dtype='datetime64[ns]', freq='MS')

A workaround is to work only with the year and month of the start date :

pd.date_range(start[:7], end, freq='MS')

will then give

DatetimeIndex(['2020-03-01', '2020-04-01', '2020-05-01', '2020-06-01',
           '2020-07-01', '2020-08-01', '2020-09-01', '2020-10-01',
           '2020-11-01', '2020-12-01', '2021-01-01', '2021-02-01',
           '2021-03-01'],
          dtype='datetime64[ns]', freq='MS')

If you wish to keep the same starting day for each month, you can then add the offset with pd.DateOffset() :

pd.date_range(start[:7], end, freq='MS') + pd.DateOffset(days=7)

will give

DatetimeIndex(['2020-03-08', '2020-04-08', '2020-05-08', '2020-06-08',           '2020-07-08', '2020-08-08', '2020-09-08', '2020-10-08',           '2020-11-08', '2020-12-08', '2021-01-08', '2021-02-08',           '2021-03-08'],
          dtype='datetime64[ns]', freq=None)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionBunny_RossView Question on Stackoverflow
Solution 1 - PythonDimitris Fasarakis HilliardView Answer on Stackoverflow
Solution 2 - PythonSkippy le Grand GourouView Answer on Stackoverflow