如何从 pandas 数据帧中的开始时间和结束时间获取时隙
How to get the timeslots from the start time and end time in the pandas dataframe
我有一个 pandas 数据框,其中包含 start_time、end_time 和预订时长。
请在下面找到示例数据框
id
Start_time
End_time
Duration
1
2013-11-20 09:00:00
2013-11-20 09:30:00
0 days 0:30:00
2
2013-11-20 09:00:00
2013-11-20 12:10:00
0 days 3:10:00
3
2013-11-20 10:00:00
2013-11-20 11:00:00
0 days 1:00:00
4
2013-11-20 08:00:00
2013-11-20 09:40:00
0 days 1:40:00
我正在尝试从上面的数据帧中获取时隙
预期输出:
id
Start_time
End_time
Duration
Timeslots
1
2013-11-20 09:00:00
2013-11-20 09:30:00
0 days 0:30:00
9 - 10
2
2013-11-20 09:00:00
2013-11-20 12:10:00
0 days 3:10:00
9-10
2
2013-11-20 09:00:00
2013-11-20 12:10:00
0 days 3:10:00
10-11
2
2013-11-20 09:00:00
2013-11-20 12:10:00
0 days 3:10:00
11-12
3
2013-11-20 10:00:00
2013-11-20 11:00:00
0 days 1:00:00
10 - 11
4
2013-11-20 08:00:00
2013-11-20 09:40:00
0 days 1:40:00
8-9
4
2013-11-20 08:00:00
2013-11-20 09:40:00
0 days 1:40:00
9-10
到目前为止我尝试了什么
我可以从 start_time 和 end_time 获取插槽,但我缺少预期的输出
id
Start_time
End_time
Duration
TimeSlot
1
2013-11-20 09:00:00
2013-11-20 09:30:00
0 days 0:30:00
9-9:30
2
2013-11-20 09:00:00
2013-11-20 12:10:00
0 days 3:10:00
9-12:10
3
2013-11-20 10:00:00
2013-11-20 11:00:00
0 days 1:00:00
10-11
4
2013-11-20 08:00:00
2013-11-20 09:40:00
0 days 1:40:00
8 - 9:40
谁能给点提示
尝试:
def get_slots(row):
dti = pd.date_range(row['Start_time'].floor('H'),
row['End_time'].ceil('H'), freq='H')
return [f"{s.hour:02}-{e.hour:02}" for s, e in zip(dti, dti[1:])]
out = df.assign(Timeslots=df.apply(get_slots, axis=1)).explode('Timeslots')
print(out)
# Output:
id Start_time End_time Duration Timeslots
0 1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 00:30:00 09-10
1 2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00 09-10
1 2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00 10-11
1 2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00 11-12
1 2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00 12-13
2 3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 01:00:00 10-11
3 4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 01:40:00 08-09
3 4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 01:40:00 09-10
可重现的设置:
import pandas as pd
from pandas import Timestamp, Timedelta
data = {
'id': [1, 2, 3, 4],
'Start_time': [Timestamp('2013-11-20 09:00:00'), Timestamp('2013-11-20 09:00:00'),
Timestamp('2013-11-20 10:00:00'), Timestamp('2013-11-20 08:00:00')],
'End_time': [Timestamp('2013-11-20 09:30:00'), Timestamp('2013-11-20 12:10:00'),
Timestamp('2013-11-20 11:00:00'), Timestamp('2013-11-20 09:40:00')],
'Duration': [Timedelta('0 days 00:30:00'), Timedelta('0 days 03:10:00'),
Timedelta('0 days 01:00:00'), Timedelta('0 days 01:40:00')]
}
df = pd.DataFrame(data)
我有一个 pandas 数据框,其中包含 start_time、end_time 和预订时长。
请在下面找到示例数据框
id | Start_time | End_time | Duration |
---|---|---|---|
1 | 2013-11-20 09:00:00 | 2013-11-20 09:30:00 | 0 days 0:30:00 |
2 | 2013-11-20 09:00:00 | 2013-11-20 12:10:00 | 0 days 3:10:00 |
3 | 2013-11-20 10:00:00 | 2013-11-20 11:00:00 | 0 days 1:00:00 |
4 | 2013-11-20 08:00:00 | 2013-11-20 09:40:00 | 0 days 1:40:00 |
我正在尝试从上面的数据帧中获取时隙
预期输出:
id | Start_time | End_time | Duration | Timeslots |
---|---|---|---|---|
1 | 2013-11-20 09:00:00 | 2013-11-20 09:30:00 | 0 days 0:30:00 | 9 - 10 |
2 | 2013-11-20 09:00:00 | 2013-11-20 12:10:00 | 0 days 3:10:00 | 9-10 |
2 | 2013-11-20 09:00:00 | 2013-11-20 12:10:00 | 0 days 3:10:00 | 10-11 |
2 | 2013-11-20 09:00:00 | 2013-11-20 12:10:00 | 0 days 3:10:00 | 11-12 |
3 | 2013-11-20 10:00:00 | 2013-11-20 11:00:00 | 0 days 1:00:00 | 10 - 11 |
4 | 2013-11-20 08:00:00 | 2013-11-20 09:40:00 | 0 days 1:40:00 | 8-9 |
4 | 2013-11-20 08:00:00 | 2013-11-20 09:40:00 | 0 days 1:40:00 | 9-10 |
到目前为止我尝试了什么
我可以从 start_time 和 end_time 获取插槽,但我缺少预期的输出
id | Start_time | End_time | Duration | TimeSlot |
---|---|---|---|---|
1 | 2013-11-20 09:00:00 | 2013-11-20 09:30:00 | 0 days 0:30:00 | 9-9:30 |
2 | 2013-11-20 09:00:00 | 2013-11-20 12:10:00 | 0 days 3:10:00 | 9-12:10 |
3 | 2013-11-20 10:00:00 | 2013-11-20 11:00:00 | 0 days 1:00:00 | 10-11 |
4 | 2013-11-20 08:00:00 | 2013-11-20 09:40:00 | 0 days 1:40:00 | 8 - 9:40 |
谁能给点提示
尝试:
def get_slots(row):
dti = pd.date_range(row['Start_time'].floor('H'),
row['End_time'].ceil('H'), freq='H')
return [f"{s.hour:02}-{e.hour:02}" for s, e in zip(dti, dti[1:])]
out = df.assign(Timeslots=df.apply(get_slots, axis=1)).explode('Timeslots')
print(out)
# Output:
id Start_time End_time Duration Timeslots
0 1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 00:30:00 09-10
1 2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00 09-10
1 2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00 10-11
1 2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00 11-12
1 2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00 12-13
2 3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 01:00:00 10-11
3 4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 01:40:00 08-09
3 4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 01:40:00 09-10
可重现的设置:
import pandas as pd
from pandas import Timestamp, Timedelta
data = {
'id': [1, 2, 3, 4],
'Start_time': [Timestamp('2013-11-20 09:00:00'), Timestamp('2013-11-20 09:00:00'),
Timestamp('2013-11-20 10:00:00'), Timestamp('2013-11-20 08:00:00')],
'End_time': [Timestamp('2013-11-20 09:30:00'), Timestamp('2013-11-20 12:10:00'),
Timestamp('2013-11-20 11:00:00'), Timestamp('2013-11-20 09:40:00')],
'Duration': [Timedelta('0 days 00:30:00'), Timedelta('0 days 03:10:00'),
Timedelta('0 days 01:00:00'), Timedelta('0 days 01:40:00')]
}
df = pd.DataFrame(data)