How to interpolate and then start again on a new day?
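Following on from my previous question, how would I do the same thing but stop at the end of each day? I tried groupby, but that seemed to drop a lot of data.
This is the data I start with: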
import numpy as np
import pandas as pd
time = np.array([pd.to_datetime("2022-01-01 00:00:00"), pd.to_datetime("2022-01-01 00:00:01"), pd.to_datetime("2022-01-01 00:00:03"), pd.to_datetime("2022-01-01 00:00:04"), pd.to_datetime("2022-01-02 00:00:07"), pd.to_datetime("2022-01-02 00:00:09"), pd.to_datetime("2022-01-02 00:00:10")])
lat = [58.1, 58.4, 58.5, 58.9, 52, 52.2, 52.5]
lng = [1.34, 1.44, 1.46, 1.48, 1.35, 1.37, 1.39]
df = pd.DataFrame({"time": time, "lat": lat, "lng": lng})
time lat lng
2022-01-01 00:00:00 58.1 1.34
2022-01-01 00:00:01 58.4 1.44
2022-01-01 00:00:03 58.5 1.46
2022-01-01 00:00:04 58.9 1.48
2022-01-02 00:00:07 52.0 1.35
2022-01-02 00:00:09 52.2 1.37
2022-01-02 00:00:10 52.5 1.39
The expected output is:
time lat lng
2022-01-01 00:00:00 58.1 1.34
2022-01-01 00:00:01 58.4 1.44
2022-01-01 00:00:02 58.45 1.45
2022-01-01 00:00:03 58.5 1.46
2022-01-01 00:00:04 58.9 1.48
2022-01-02 00:00:07 52.0 1.35
2022-01-02 00:00:08 52.1 1.36
2022-01-02 00:00:09 52.2 1.37
2022-01-02 00:00:10 52.5 1.39
Using this:
df = df.set_index('time').asfreq(freq='S').interpolate()
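works well when all of my data comes from the same day. How can I make it reset on the next day?
You can groupby the day and use a custom function with apply to run the interpolation logic on each day separately: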
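def func(x):
    # Resample one day's rows to 1-second frequency and fill the gaps linearly
    return x.set_index('time').asfreq(freq='S').interpolate().reset_index()
# Group by day of month so interpolation never crosses a day boundary
df.groupby(df['time'].dt.day).apply(func).reset_index(drop=True)
Result: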
time lat lng
0 2022-01-01 00:00:00 58.10 1.34
1 2022-01-01 00:00:01 58.40 1.44
2 2022-01-01 00:00:02 58.45 1.45
3 2022-01-01 00:00:03 58.50 1.46
4 2022-01-01 00:00:04 58.90 1.48
5 2022-01-02 00:00:07 52.00 1.35
6 2022-01-02 00:00:08 52.10 1.36
7 2022-01-02 00:00:09 52.20 1.37
8 2022-01-02 00:00:10 52.50 1.39
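One caveat: `dt.day` is the day of the month, so rows from, say, 1 January and 1 February would land in the same group. If the data can span more than one month, a variant that groups on the calendar date may be safer. The following is only a sketch under that assumption; the helper name `interpolate_per_day` is made up for illustration, and it uses the lowercase `"s"` frequency alias, which newer pandas versions prefer over `'S'`.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2022-01-01 00:00:00", "2022-01-01 00:00:01",
        "2022-01-01 00:00:03", "2022-01-01 00:00:04",
        "2022-01-02 00:00:07", "2022-01-02 00:00:09",
        "2022-01-02 00:00:10",
    ]),
    "lat": [58.1, 58.4, 58.5, 58.9, 52, 52.2, 52.5],
    "lng": [1.34, 1.44, 1.46, 1.48, 1.35, 1.37, 1.39],
})

def interpolate_per_day(frame, freq="s"):
    """Hypothetical helper: resample and interpolate each calendar day on its own."""
    pieces = []
    # dt.normalize() floors each timestamp to midnight, so the group key is
    # the full date (year, month and day), not just the day of the month.
    for _, day in frame.groupby(frame["time"].dt.normalize()):
        filled = day.set_index("time").asfreq(freq).interpolate()
        pieces.append(filled.reset_index())
    # Stitch the per-day results back together with a fresh 0..n-1 index
    return pd.concat(pieces, ignore_index=True)

out = interpolate_per_day(df)
print(out)
```

On the sample data this should give the same nine rows as the result above. The explicit loop is used instead of `apply` only to keep the grouping column handling obvious; `apply(func)` with the `dt.normalize()` key would be an equally reasonable choice.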