How to append a new dataset to existing dataset based on index timeseries condition in Python

Question

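I am really new to Python. Could anyone help me with how to append a new dataset to an existing dataset based on an index time-series condition? I need to add each row from df2 to df1 based on time, with a tolerance of < 5 min.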

Here is a sample of my data:

df1

Time	A
01/9/2021 06:50	1
01/9/2021 06:55	2
01/9/2021 07:00	3
01/9/2021 07:05	6
01/9/2021 07:10	3
01/9/2021 07:15	2
01/9/2021 07:20	1
01/9/2021 07:25	2

df2

Time	B
01/9/2021 06:51	0.6
01/9/2021 06:55	0.2
01/9/2021 07:12	0.3
01/9/2021 07:16	0.6

Expected result: each row of df2 whose time matches a df1 time within the tolerance (say 4 minutes) is added to that row of df1.

df3

Time	A	B
01/9/2021 06:50	1	0.6
01/9/2021 06:55	2	0.2
01/9/2021 07:00	3	NAN
01/9/2021 07:05	6	NAN
01/9/2021 07:10	3	0.3
01/9/2021 07:15	2	0.6
01/9/2021 07:20	1	NAN
01/9/2021 07:25	2	NAN

Thank you very much for your help.

Answer 1

One way, using pandas.to_datetime and pd.Series.dt.round:

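import pandas as pd

# Parse the timestamps; snap df2's times to the nearest 5-minute mark so they line up with df1
df1["Time"] = pd.to_datetime(df1["Time"])
df2["Time"] = pd.to_datetime(df2["Time"]).dt.round("5min")

# A left merge keeps every row of df1; rows without a match get NaN in B
new_df = df1.merge(df2, on="Time", how="left")
print(new_df)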

Output:

                 Time  A    B
0 2021-01-09 06:50:00  1  0.6
1 2021-01-09 06:55:00  2  0.2
2 2021-01-09 07:00:00  3  NaN
3 2021-01-09 07:05:00  6  NaN
4 2021-01-09 07:10:00  3  0.3
5 2021-01-09 07:15:00  2  0.6
6 2021-01-09 07:20:00  1  NaN
7 2021-01-09 07:25:00  2  NaN
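
Rounding works here because the df1 times sit exactly on a 5-minute grid, and it effectively matches within 2.5 minutes rather than a tolerance you choose. If an explicit tolerance is needed, pandas.merge_asof may be an alternative. The sketch below is not part of the original answer. It rebuilds the sample data from the question, assumes the dates are month-first (as in the output above), and uses a helper column called nearest_time, a name made up for illustration. For each df2 row it looks up the nearest df1 time within 4 minutes, then left-merges B onto df1.

```python
import pandas as pd

# Sample data from the question (dates assumed month-first: 01/9/2021 = 9 Jan 2021)
df1 = pd.DataFrame({
    "Time": ["01/9/2021 06:50", "01/9/2021 06:55", "01/9/2021 07:00", "01/9/2021 07:05",
             "01/9/2021 07:10", "01/9/2021 07:15", "01/9/2021 07:20", "01/9/2021 07:25"],
    "A": [1, 2, 3, 6, 3, 2, 1, 2],
})
df2 = pd.DataFrame({
    "Time": ["01/9/2021 06:51", "01/9/2021 06:55", "01/9/2021 07:12", "01/9/2021 07:16"],
    "B": [0.6, 0.2, 0.3, 0.6],
})

fmt = "%m/%d/%Y %H:%M"
df1["Time"] = pd.to_datetime(df1["Time"], format=fmt)
df2["Time"] = pd.to_datetime(df2["Time"], format=fmt)

# Copy df1's timestamps into a helper column (hypothetical name) so that the
# matched df1 time survives the as-of join, which keeps only the left "Time".
lookup = df1[["Time"]].assign(nearest_time=df1["Time"]).sort_values("Time")

# For each df2 row, find the nearest df1 time no more than 4 minutes away.
# Rows with no df1 time in range get NaT in nearest_time.
matched = pd.merge_asof(
    df2.sort_values("Time"),
    lookup,
    on="Time",
    direction="nearest",
    tolerance=pd.Timedelta("4min"),
)

# Keep only the matched rows and re-key B by the df1 time it was assigned to.
b_by_time = (
    matched.dropna(subset=["nearest_time"])[["nearest_time", "B"]]
    .rename(columns={"nearest_time": "Time"})
)

# Left merge keeps every row of df1; unmatched rows get NaN in B.
df3 = df1.merge(b_by_time, on="Time", how="left")
print(df3)
```

Note that if two df2 rows land on the same df1 time, the final left merge repeats that df1 row, so you may want to de-duplicate b_by_time first (for example, keep the closest reading).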
