Python time interval overlap duration
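My question is similar to "Efficient date range overlap calculation in python?"; however, I need to calculate the overlap using full timestamps rather than days and, more importantly, I cannot specify particular dates for the overlap window, only hours.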
import pandas as pd
import numpy as np
df = pd.DataFrame({'first_ts': {0: np.datetime64('2020-01-25 07:30:25.435000'),
1: np.datetime64('2020-01-25 07:25:00')},
'last_ts': {0: np.datetime64('2020-01-25 07:30:25.718000'),
1: np.datetime64('2020-01-25 07:25:00')}})
df['start_hour'] = 7
df['start_minute'] = 0
df['end_hour'] = 8
df['end_minute'] = 0
display(df)
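How can I calculate the duration (in milliseconds) of the overlap between the interval (first_ts, last_ts) and the second interval, the one given by start_hour:start_minute to end_hour:end_minute?
Potentially, I would need to construct timestamps for each day from the hour-based window and then calculate the overlap.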
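The idea is to create new Series for the start and end datetimes of the window, taking the dates from the datetime columns, then use numpy.minimum and numpy.maximum, subtract, convert the timedeltas with Series.dt.total_seconds and multiply by 1000: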
s = (df['first_ts'].dt.strftime('%Y-%m-%d ') +
df['start_hour'].astype(str) + ':' +
df['start_minute'].astype(str))
e = (df['last_ts'].dt.strftime('%Y-%m-%d ') +
df['end_hour'].astype(str) + ':' +
df['end_minute'].astype(str))
s = pd.to_datetime(s, format='%Y-%m-%d %H:%M')
e = pd.to_datetime(e, format='%Y-%m-%d %H:%M')
df['inter'] = ((np.minimum(e, df['last_ts']) -
np.maximum(s, df['first_ts'])).dt.total_seconds() * 1000)
print (df)
first_ts last_ts start_hour start_minute \
0 2020-01-25 07:30:25.435 2020-01-25 07:30:25.718 7 0
1 2020-01-25 07:25:00.000 2020-01-25 07:25:00.000 7 0
end_hour end_minute inter
0 8 0 283.0
1 8 0 0.0
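The subtraction above can go negative when an interval lies completely outside the window. The following is a minimal sketch of one way to guard against that, not part of the original answer: the helper name overlap_ms is hypothetical, the window is built with Timedelta offsets rather than string formatting, and it assumes each interval starts and ends on the same calendar day.

```python
import numpy as np
import pandas as pd


def overlap_ms(df):
    """Overlap of (first_ts, last_ts) with the daily window, in milliseconds.

    Hypothetical helper; assumes the interval does not cross midnight.
    """
    # Midnight of the day on which each interval starts.
    day = df['first_ts'].dt.normalize()
    # Window bounds as midnight + hours + minutes.
    start = (day + pd.to_timedelta(df['start_hour'], unit='h')
             + pd.to_timedelta(df['start_minute'], unit='min'))
    end = (day + pd.to_timedelta(df['end_hour'], unit='h')
           + pd.to_timedelta(df['end_minute'], unit='min'))
    # Latest start and earliest end bound the intersection.
    overlap = np.minimum(end, df['last_ts']) - np.maximum(start, df['first_ts'])
    # A negative difference means the intervals do not intersect, so clip to 0.
    return overlap.dt.total_seconds().clip(lower=0) * 1000


df = pd.DataFrame({
    'first_ts': [pd.Timestamp('2020-01-25 07:30:25.435'),
                 pd.Timestamp('2020-01-25 09:00:00'),    # entirely after the window
                 pd.Timestamp('2020-01-25 06:59:59')],   # straddles the window start
    'last_ts': [pd.Timestamp('2020-01-25 07:30:25.718'),
                pd.Timestamp('2020-01-25 09:00:05'),
                pd.Timestamp('2020-01-25 07:00:01')],
    'start_hour': 7, 'start_minute': 0, 'end_hour': 8, 'end_minute': 0,
})
df['inter'] = overlap_ms(df)
print(df[['first_ts', 'last_ts', 'inter']])
```

Without the clip, the second row would come out as a negative number of milliseconds instead of zero.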
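Another idea is to use only np.minimum, comparing the two durations instead of the endpoints. This equals the overlap only when (first_ts, last_ts) lies entirely inside the window; otherwise it returns the shorter of the two durations: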
df['inter'] = (np.minimum(df['last_ts'] - df['first_ts'], e - s).dt.total_seconds() * 1000)
print (df)
first_ts last_ts start_hour start_minute \
0 2020-01-25 07:30:25.435 2020-01-25 07:30:25.718 7 0
1 2020-01-25 07:25:00.000 2020-01-25 07:25:00.000 7 0
end_hour end_minute inter
0 8 0 283.0
1 8 0 0.0
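To see where the duration-only shortcut and the endpoint-based calculation diverge, here is a small check on an interval that straddles the start of the window. The sample timestamps are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Interval from 06:59:59 to 07:00:01, window from 07:00 to 08:00.
first_ts = pd.Series([pd.Timestamp('2020-01-25 06:59:59')])
last_ts = pd.Series([pd.Timestamp('2020-01-25 07:00:01')])
s = pd.Series([pd.Timestamp('2020-01-25 07:00:00')])
e = pd.Series([pd.Timestamp('2020-01-25 08:00:00')])

# Shortcut: the shorter of the two durations.
shortcut = np.minimum(last_ts - first_ts, e - s).dt.total_seconds() * 1000
# Endpoint-based: earliest end minus latest start.
full = (np.minimum(e, last_ts) - np.maximum(s, first_ts)).dt.total_seconds() * 1000

print(shortcut.iloc[0], full.iloc[0])
```

Here the shortcut reports 2000.0 ms (the full length of the interval), while only 1000.0 ms of it actually falls inside the window, so the shortcut is safe only when every interval is known to lie within its window.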