Python time interval overlap duration

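My question is similar to "Efficient date range overlap calculation in python?". However, I need to compute the overlap using full timestamps rather than days and, more importantly, I cannot specify a particular date for the overlap, only hours.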

import pandas as pd
import numpy as np

df = pd.DataFrame({'first_ts': {0: np.datetime64('2020-01-25 07:30:25.435000'),
  1: np.datetime64('2020-01-25 07:25:00')},
 'last_ts': {0: np.datetime64('2020-01-25 07:30:25.718000'),
  1: np.datetime64('2020-01-25 07:25:00')}})
df['start_hour'] = 7
df['start_minute'] = 0
df['end_hour'] = 8
df['end_minute'] = 0
display(df)

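How can I calculate the overlap duration (in milliseconds) between the interval (first_ts, last_ts) and the second interval? Potentially, I need to construct a timestamp on each day, with the interval defined by the hours, and then compute the overlap.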

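The idea is to create new Series for the start and end datetimes, taking the dates from the datetime columns, then use numpy.minimum and numpy.maximum, subtract, convert the timedeltas with Series.dt.total_seconds, and multiply by 1000:
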
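# Build the window start/end as strings: the date comes from the timestamp
# columns, the time from the hour/minute columns.
s = (df['first_ts'].dt.strftime('%Y-%m-%d ') +
     df['start_hour'].astype(str) + ':' +
     df['start_minute'].astype(str))
e = (df['last_ts'].dt.strftime('%Y-%m-%d ') +
     df['end_hour'].astype(str) + ':' +
     df['end_minute'].astype(str))

# Parse the strings into datetimes.
s = pd.to_datetime(s, format='%Y-%m-%d %H:%M')
e = pd.to_datetime(e, format='%Y-%m-%d %H:%M')

# Overlap = earliest end minus latest start, in milliseconds.
# clip(lower=0) keeps non-overlapping rows at 0 instead of a negative value.
df['inter'] = ((np.minimum(e, df['last_ts']) -
                np.maximum(s, df['first_ts'])).dt.total_seconds() * 1000).clip(lower=0)
print(df)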
                 first_ts                 last_ts  start_hour  start_minute  \
0 2020-01-25 07:30:25.435 2020-01-25 07:30:25.718           7             0   
1 2020-01-25 07:25:00.000 2020-01-25 07:25:00.000           7             0   

   end_hour  end_minute  inter  
0         8           0  283.0  
1         8           0    0.0  
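
A variant of the same idea that skips the string round trip, as a minimal sketch. It assumes the window falls on the calendar day of first_ts, builds the window bounds with dt.normalize() plus pd.to_timedelta, and floors negative differences at zero so that rows with no overlap report 0. The helper name overlap_ms and the sample rows are hypothetical, not part of the original answer.

```python
import numpy as np
import pandas as pd

def overlap_ms(df):
    """Overlap of (first_ts, last_ts) with the daily window, in milliseconds.

    Assumption: the window is placed on the calendar day of first_ts.
    """
    day = df['first_ts'].dt.normalize()  # midnight of the day of first_ts
    s = (day + pd.to_timedelta(df['start_hour'], unit='h')
             + pd.to_timedelta(df['start_minute'], unit='min'))
    e = (day + pd.to_timedelta(df['end_hour'], unit='h')
             + pd.to_timedelta(df['end_minute'], unit='min'))
    # Earliest end minus latest start; negative means no overlap at all.
    delta = np.minimum(e, df['last_ts']) - np.maximum(s, df['first_ts'])
    return (delta / pd.Timedelta(milliseconds=1)).clip(lower=0)

# Hypothetical sample: inside the window, straddling its start, outside it.
sample = pd.DataFrame({
    'first_ts': pd.to_datetime(['2020-01-25 07:30:25.435',
                                '2020-01-25 06:59:59.500',
                                '2020-01-25 09:00:00.000']),
    'last_ts': pd.to_datetime(['2020-01-25 07:30:25.718',
                               '2020-01-25 07:00:00.250',
                               '2020-01-25 09:00:01.000']),
    'start_hour': 7, 'start_minute': 0, 'end_hour': 8, 'end_minute': 0,
})
sample['inter'] = overlap_ms(sample)
print(sample[['first_ts', 'last_ts', 'inter']])
```

Dividing by a one-millisecond Timedelta gives the result in milliseconds directly, which avoids the separate total_seconds() and multiplication steps.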

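Another idea is to use only np.minimum, comparing the two interval lengths. Note that this matches the overlap only when one interval lies entirely inside the other, as in the sample data: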

df['inter'] = (np.minimum(df['last_ts'] - df['first_ts'], e - s).dt.total_seconds() * 1000)
print (df)
                 first_ts                 last_ts  start_hour  start_minute  \
0 2020-01-25 07:30:25.435 2020-01-25 07:30:25.718           7             0   
1 2020-01-25 07:25:00.000 2020-01-25 07:25:00.000           7             0   

   end_hour  end_minute  inter  
0         8           0  283.0  
1         8           0    0.0
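
To see where the two approaches can diverge, here is a small sketch with a hypothetical row that starts just before the 07:00 to 08:00 window and ends inside it. The row and the names demo, by_endpoints and by_durations are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical row: starts 500 ms before the window opens, ends 250 ms after.
demo = pd.DataFrame({
    'first_ts': pd.to_datetime(['2020-01-25 06:59:59.500']),
    'last_ts': pd.to_datetime(['2020-01-25 07:00:00.250']),
})
s = pd.Series(pd.to_datetime(['2020-01-25 07:00']))  # window start
e = pd.Series(pd.to_datetime(['2020-01-25 08:00']))  # window end

one_ms = pd.Timedelta(milliseconds=1)

# Endpoint-based overlap: earliest end minus latest start, floored at zero.
by_endpoints = ((np.minimum(e, demo['last_ts']) -
                 np.maximum(s, demo['first_ts'])) / one_ms).clip(lower=0)

# Duration-based shortcut: the shorter of the two interval lengths.
by_durations = np.minimum(demo['last_ts'] - demo['first_ts'], e - s) / one_ms

print(by_endpoints.iloc[0], by_durations.iloc[0])
```

Only the 250 ms after 07:00 lie inside the window, so the endpoint-based version reports 250, while the duration-based shortcut returns the full 750 ms length of the row.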