Efficient date range overlap calculation in python?

PythonDateDate Range

Python Problem Overview


I have two date ranges where each range is determined by a start and end date (obviously, datetime.date() instances). The two ranges can overlap or not. I need the number of days of the overlap. Of course I can pre-fill two sets with all dates within both ranges and the perform a set intersection but this is possibly inefficient...is there a better way apart from another solution using a long if-elif section covering all cases ?

Python Solutions


Solution 1 - Python

  • Determine the latest of the two start dates and the earliest of the two end dates.
  • Compute the timedelta by subtracting them.
  • If the delta is positive, that is the number of days of overlap.

Here is an example calculation:

>>> from datetime import datetime
>>> from collections import namedtuple
>>> Range = namedtuple('Range', ['start', 'end'])

>>> r1 = Range(start=datetime(2012, 1, 15), end=datetime(2012, 5, 10))
>>> r2 = Range(start=datetime(2012, 3, 20), end=datetime(2012, 9, 15))
>>> latest_start = max(r1.start, r2.start)
>>> earliest_end = min(r1.end, r2.end)
>>> delta = (earliest_end - latest_start).days + 1
>>> overlap = max(0, delta)
>>> overlap
52

Solution 2 - Python

Function calls are more expensive than arithmetic operations.

The fastest way of doing this involves 2 subtractions and 1 min():

min(r1.end - r2.start, r2.end - r1.start).days + 1

compared with the next best which needs 1 subtraction, 1 min() and a max():

(min(r1.end, r2.end) - max(r1.start, r2.start)).days + 1

Of course with both expressions you still need to check for a positive overlap.

Solution 3 - Python

I implemented a TimeRange class as you can see below.

The get_overlapped_range first negates all the non overlapped options by a simple condition, and then calculate the overlapped range by considering all the possible options.

To get the amount of days you'll need to take the TimeRange value that was returned from get_overlapped_range and divide the duration by 606024.

class TimeRange(object):
    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.duration = self.end - self.start

    def is_overlapped(self, time_range):
        if max(self.start, time_range.start) < min(self.end, time_range.end):
            return True
        else:
            return False

    def get_overlapped_range(self, time_range):
        if not self.is_overlapped(time_range):
            return

        if time_range.start >= self.start:
            if self.end >= time_range.end:
                return TimeRange(time_range.start, time_range.end)
            else:
                return TimeRange(time_range.start, self.end)
        elif time_range.start < self.start:
            if time_range.end >= self.end:
                return TimeRange(self.start, self.end)
            else:
                return TimeRange(self.start, time_range.end)

    def __repr__(self):
        return '{0} ------> {1}'.format(*[time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(d))
                                          for d in [self.start, self.end]])

Solution 4 - Python

You can use the datetimerange package: https://pypi.org/project/DateTimeRange/

from datetimerange import DateTimeRange
time_range1 = DateTimeRange("2015-01-01T00:00:00+0900", "2015-01-04T00:20:00+0900") 
time_range2 = DateTimeRange("2015-01-01T00:00:10+0900", "2015-01-04T00:20:00+0900")
tem3 = time_range1.intersection(time_range2)
if tem3.NOT_A_TIME_STR == 'NaT':  # No overlap
    S_Time = 0
else: # Output the overlap seconds
    S_Time = tem3.timedelta.total_seconds()

"2015-01-01T00:00:00+0900" inside the DateTimeRange() can also be datetime format, like Timestamp('2017-08-30 20:36:25').

Solution 5 - Python

Pseudocode:

 1 + max( -1, min( a.dateEnd, b.dateEnd) - max( a.dateStart, b.dateStart) )

Solution 6 - Python

Building on the solution of @Raymond Hettinger, since python 3.6 you can now use NamedTuple from the typing module.

from datetime import datetime
from typing import NamedTuple

class Range(NamedTuple):
    start: datetime
    end: datetime
>>> r1 = Range(start=datetime(2012, 1, 15), end=datetime(2012, 5, 10))
>>> r2 = Range(start=datetime(2012, 3, 20), end=datetime(2012, 9, 15))
>>> latest_start = max(r1.start, r2.start)
>>> earliest_end = min(r1.end, r2.end)
>>> delta = (earliest_end - latest_start).days + 1
>>> overlap = max(0, delta)
>>> overlap

Solution 7 - Python

def get_overlap(r1,r2):
    latest_start=max(r1[0],r2[0])
    earliest_end=min(r1[1],r2[1])
    delta=(earliest_end-latest_start).days
    if delta>0:
        return delta+1
    else:
        return 0

Solution 8 - Python

Ok my solution is a bit wonky because my df uses all series - but lets say you have the following columns, 2 of which are fixed which is your "Fiscal Year". PoP is "Period of performance" which is your variable data:

df['PoP_Start']
df['PoP_End']
df['FY19_Start'] = '10/1/2018'
df['FY19_End'] = '09/30/2019'

Assume all of the data is in datetime format ie -

df['FY19_Start'] = pd.to_datetime(df['FY19_Start'])
df['FY19_End'] = pd.to_datetime(df['FY19_End'])

Try the following equations to find the number of days overlap:

min1 = np.minimum(df['POP_End'], df['FY19_End'])
max2 = np.maximum(df['POP_Start'], df['FY19_Start'])

df['Overlap_2019'] = (min1 - max2) / np.timedelta64(1, 'D')
df['Overlap_2019'] = np.maximum(df['Overlap_2019']+1,0)

Solution 9 - Python

Another solution would be sorting a source array by ascending first and then looping through and comparing the dates like so:

date_ranges = sorted(
    date_ranges,
    key=lambda item: item['start_date'],
)
for i in range(len(date_ranges)-1):
    if date_ranges[i]['end_date'] > date_ranges[i+1]['start_date']:
        raise Exception('Overlap'})

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAndreas JungView Question on Stackoverflow
Solution 1 - PythonRaymond HettingerView Answer on Stackoverflow
Solution 2 - PythonJohn MachinView Answer on Stackoverflow
Solution 3 - PythonElad SoferView Answer on Stackoverflow
Solution 4 - PythonSonghua HuView Answer on Stackoverflow
Solution 5 - PythonypercubeᵀᴹView Answer on Stackoverflow
Solution 6 - PythonloicgasserView Answer on Stackoverflow
Solution 7 - Pythonandros1337View Answer on Stackoverflow
Solution 8 - PythonArthur D. HowlandView Answer on Stackoverflow
Solution 9 - PythonkozloneView Answer on Stackoverflow