Finding the midpoint between values in a pandas datetime column and making a start and end time period column based on the midpoint
Here is my code setup:
import pandas as pd
df = {'Datetime': ['2020-12-01 00:00:00', '2020-12-01 01:00:00','2020-12-01 02:00:00',
'2020-12-01 03:00:00', '2020-12-01 04:00:00' , '2020-12-01 05:00:00' ,
'2020-12-01 06:00:00' , '2020-12-01 09:00:00' , '2020-12-01 12:00:00' ,
'2020-12-01 18:00:00' , '2020-12-02 00:00:00'
]
}
df = pd.DataFrame(df , columns = ['Datetime'])
df["Datetime"] = pd.to_datetime(df['Datetime'])
df
This produces a dataframe of the following form:
Datetime
0 2020-12-01 00:00:00
1 2020-12-01 01:00:00
2 2020-12-01 02:00:00
3 2020-12-01 03:00:00
4 2020-12-01 04:00:00
5 2020-12-01 05:00:00
6 2020-12-01 06:00:00
7 2020-12-01 09:00:00
8 2020-12-01 12:00:00
9 2020-12-01 18:00:00
10 2020-12-02 00:00:00
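What I want to do is find the midpoints between the values and create two new columns in the dataframe, "Start Time" and "End Time". "Start Time" is the midpoint between that time and the previous time (if one exists). "End Time" is the midpoint between that time and the next time (if one exists). If no such neighbor exists, the current time is used.
This is what I would like the code to produce: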
Datetime Start Time End Time
0 2020-12-01 00:00:00 2020-12-01 00:00:00 2020-12-01 00:30:00
1 2020-12-01 01:00:00 2020-12-01 00:30:00 2020-12-01 01:30:00
2 2020-12-01 02:00:00 2020-12-01 01:30:00 2020-12-01 02:30:00
3 2020-12-01 03:00:00 2020-12-01 02:30:00 2020-12-01 03:30:00
4 2020-12-01 04:00:00 2020-12-01 03:30:00 2020-12-01 04:30:00
5 2020-12-01 05:00:00 2020-12-01 04:30:00 2020-12-01 05:30:00
6 2020-12-01 06:00:00 2020-12-01 05:30:00 2020-12-01 07:30:00
7 2020-12-01 09:00:00 2020-12-01 07:30:00 2020-12-01 10:30:00
8 2020-12-01 12:00:00 2020-12-01 10:30:00 2020-12-01 15:00:00
9 2020-12-01 18:00:00 2020-12-01 15:00:00 2020-12-01 21:00:00
10 2020-12-02 00:00:00 2020-12-01 21:00:00 2020-12-02 00:00:00
Any help with this would be greatly appreciated.
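You can use shift to compute the midpoint: take the time difference between consecutive rows and divide it by 2 to get Start Time. Then shift Start Time by one row with shift(-1) to get End Time: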
df['Start Time'] = (df['Datetime'] + (df['Datetime'].shift(1) - df['Datetime']) / 2).fillna(df['Datetime'])
df['End Time'] = (df['Start Time'].shift(-1)).fillna(df['Datetime'])
df
Out[1]:
Datetime Start Time End Time
0 2020-12-01 00:00:00 2020-12-01 00:00:00 2020-12-01 00:30:00
1 2020-12-01 01:00:00 2020-12-01 00:30:00 2020-12-01 01:30:00
2 2020-12-01 02:00:00 2020-12-01 01:30:00 2020-12-01 02:30:00
3 2020-12-01 03:00:00 2020-12-01 02:30:00 2020-12-01 03:30:00
4 2020-12-01 04:00:00 2020-12-01 03:30:00 2020-12-01 04:30:00
5 2020-12-01 05:00:00 2020-12-01 04:30:00 2020-12-01 05:30:00
6 2020-12-01 06:00:00 2020-12-01 05:30:00 2020-12-01 07:30:00
7 2020-12-01 09:00:00 2020-12-01 07:30:00 2020-12-01 10:30:00
8 2020-12-01 12:00:00 2020-12-01 10:30:00 2020-12-01 15:00:00
9 2020-12-01 18:00:00 2020-12-01 15:00:00 2020-12-01 21:00:00
10 2020-12-02 00:00:00 2020-12-01 21:00:00 2020-12-02 00:00:00
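For reuse, the same idea can be wrapped in a small function. This is only a sketch: the helper name add_midpoint_bounds and its col parameter are made up for illustration, it uses diff() in place of the explicit shift(1) subtraction, and it assumes the column is already datetime64 and sorted in ascending order.

```python
import pandas as pd


def add_midpoint_bounds(frame, col="Datetime"):
    """Return a copy of `frame` with 'Start Time' and 'End Time' columns.

    Hypothetical helper; assumes `col` is datetime64 and sorted ascending.
    """
    out = frame.copy()
    ts = out[col]
    # Half the gap to the previous row; NaT on the first row.
    half_gap = ts.diff() / 2
    # Midpoint with the previous row; the first row falls back to its own timestamp.
    out["Start Time"] = (ts - half_gap).fillna(ts)
    # A row's end is the next row's start; the last row falls back to its own timestamp.
    out["End Time"] = out["Start Time"].shift(-1).fillna(ts)
    return out


sample = pd.DataFrame({"Datetime": pd.to_datetime([
    "2020-12-01 06:00:00", "2020-12-01 09:00:00", "2020-12-01 12:00:00"])})
result = add_midpoint_bounds(sample)
print(result)
```

Because End Time is derived from the shifted Start Time, each row's end always equals the next row's start, so the periods are contiguous with no gaps or overlaps.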