Creating a new DataFrame with regular datetimes from a df with irregular datetimes
I have a DataFrame like this:
df
dateEntry dataReceived
0 2021-12-22 15:00:34.359293 0
1 2021-12-22 15:00:56.052554 1
2 2021-12-22 15:02:12.408687 0
3 2021-12-22 15:02:18.764644 1
4 2021-12-22 15:03:26.959721 0
5 2021-12-22 15:03:38.039307 1
6 2021-12-22 15:05:59.347346 0
7 2021-12-22 15:06:22.955319 1
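dateEntry is of type datetime64[ns].
dataReceived always alternates between 0 and 1. A label of 0 means the person did not move until the timestamp of the next row. In the first row, for example, the person did not move for 56 - 34 = 22 seconds.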
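As a quick check of that arithmetic, here is a minimal sketch that rebuilds the sample data and computes how long each label stays in effect. The column name `duration_s` is made up for illustration and is not part of the original data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "dateEntry": pd.to_datetime([
        "2021-12-22 15:00:34.359293",
        "2021-12-22 15:00:56.052554",
        "2021-12-22 15:02:12.408687",
        "2021-12-22 15:02:18.764644",
        "2021-12-22 15:03:26.959721",
        "2021-12-22 15:03:38.039307",
        "2021-12-22 15:05:59.347346",
        "2021-12-22 15:06:22.955319",
    ]),
    "dataReceived": [0, 1, 0, 1, 0, 1, 0, 1],
})

# Time until the next row = how long the current label stays in effect.
# The last row has no successor, so its duration is NaN.
# "duration_s" is a hypothetical column name used only in this sketch.
df["duration_s"] = (df["dateEntry"].shift(-1) - df["dateEntry"]).dt.total_seconds()

print(df[["dataReceived", "duration_s"]])
```

For the first row this gives roughly 21.7 seconds, which is the "22 seconds" above once the fractional seconds are taken into account.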
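I want to create another DataFrame with a regular time step, for example one that starts at 2021-12-22 15:00:40 with a time step of 15 seconds.
To assign values in the new DataFrame, I think each new datetime should take the value of the lower bound of the interval it falls in: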
Expected output
df_new
dateEntry dataReceived
0 2021-12-22 15:00:40 0
1 2021-12-22 15:00:55 0
2 2021-12-22 15:01:10 1
3 2021-12-22 15:01:25 1
4 2021-12-22 15:01:40 1
...
2021-12-22 15:05:55 1
2021-12-22 15:06:10 0
How can I get this?
IIUC, you need resample:
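import pandas as pd

# Make sure dateEntry is a datetime column (harmless if it already is)
df['dateEntry'] = pd.to_datetime(df['dateEntry'])
# Put the data on a 15 s grid anchored at 15:00:40 and carry the last
# observed value forward onto each grid point
df2 = (df.set_index('dateEntry')
         .resample('15s', origin='2021-12-22 15:00:40', closed='right')
         .ffill()
         .reset_index()
      )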
Output:
dateEntry dataReceived
0 2021-12-22 15:00:40 0
1 2021-12-22 15:00:55 0
2 2021-12-22 15:01:10 1
3 2021-12-22 15:01:25 1
4 2021-12-22 15:01:40 1
5 2021-12-22 15:01:55 1
...
20 2021-12-22 15:05:40 1
21 2021-12-22 15:05:55 1
22 2021-12-22 15:06:10 0
23 2021-12-22 15:06:25 1
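To make the "lower bound of the interval" rule explicit, the same kind of result can be sketched without resample by building the regular grid by hand and forward-filling onto it. This is an alternative illustration, not the method from the answer above. The names `grid` and `df_new` are chosen here, and the grid is assumed to stop at the last observation, so it ends at 15:06:10 rather than 15:06:25.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "dateEntry": pd.to_datetime([
        "2021-12-22 15:00:34.359293",
        "2021-12-22 15:00:56.052554",
        "2021-12-22 15:02:12.408687",
        "2021-12-22 15:02:18.764644",
        "2021-12-22 15:03:26.959721",
        "2021-12-22 15:03:38.039307",
        "2021-12-22 15:05:59.347346",
        "2021-12-22 15:06:22.955319",
    ]),
    "dataReceived": [0, 1, 0, 1, 0, 1, 0, 1],
})

# Regular 15 s grid starting at the requested origin and ending at or
# before the last observation (assumption made for this sketch).
grid = pd.date_range(
    start="2021-12-22 15:00:40",
    end=df["dateEntry"].max(),
    freq="15s",
)

# For each grid point, take the value of the most recent row at or before it.
df_new = (
    df.set_index("dateEntry")["dataReceived"]
      .reindex(grid, method="ffill")
      .rename_axis("dateEntry")
      .reset_index()
)

print(df_new)
```

One thing worth noting with either approach: on this sample, the 0 intervals starting at 15:02:12 and 15:03:26 are shorter than 15 seconds and contain no grid point, so they do not appear in the regular output at all.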