Subtract two dataframes with different numbers of rows, taking the Date column into account

I am working with the following code:

import pandas as pd

df1 = pd.DataFrame()
df1['Date'] = ["29/07/2021", "29/07/2021", "29/07/2021", "29/07/2021", "30/07/2021", "30/07/2021", "30/07/2021", "30/07/2021", "31/07/2021", "31/07/2021", "01/08/2021", "01/08/2021", "02/08/2021"]
df1['Time'] = ["06:48:00", "06:52:00", "06:56:00", "06:59:00", "07:14:00", "07:24:00", "07:40:00", "07:45:00", "08:42:00", "08:45:00", "08:52:00", "08:55:00", "09:07:00"]
df1['Column1'] = [0.0001, 0.002, 0.004, 0.5, 0.005, 0.0006, 0.08, 0.07, 0.003, 0.02, 0.0002, 0.0045, 0.0034]

df2 = pd.DataFrame()
df2['Date'] = ["29/07/2021", "30/07/2021", "31/07/2021", "01/08/2021", "02/08/2021"]
df2['Column1'] = [0.0056, 0.0594, 0.959, 0.0034, 0.00065]

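For each day, I want to subtract the df2['Column1'] value from the df1['Column1'] values and store the result in a new dataframe.

For example, for the first date (29/07/2021), the dataframe would look like this: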

Date       New_Column
29/07/2021 0.0001-0.0056
29/07/2021 0.002-0.0056
29/07/2021 0.004-0.0056
29/07/2021 0.5-0.0056

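And so on: for each date, the single value associated with that date in df2 is subtracted from each of the values associated with the same date in df1.

Thanks.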

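If df2 has no duplicate dates, we can build a mapping from date to value (df2.set_index('Date')['Column1'] is a Series indexed by date) and map it onto the Date column of df1. That yields one df2 value per row of df1, so we can simply subtract the corresponding values: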

df1['New Column'] = df1['Column1'] - df1['Date'].map(df2.set_index('Date')['Column1'])

Output:

          Date      Time  Column1  New Column
0   29/07/2021  06:48:00   0.0001    -0.00550
1   29/07/2021  06:52:00   0.0020    -0.00360
2   29/07/2021  06:56:00   0.0040    -0.00160
3   29/07/2021  06:59:00   0.5000     0.49440
4   30/07/2021  07:14:00   0.0050    -0.05440
5   30/07/2021  07:24:00   0.0006    -0.05880
6   30/07/2021  07:40:00   0.0800     0.02060
7   30/07/2021  07:45:00   0.0700     0.01060
8   31/07/2021  08:42:00   0.0030    -0.95600
9   31/07/2021  08:45:00   0.0200    -0.93900
10  01/08/2021  08:52:00   0.0002    -0.00320
11  01/08/2021  08:55:00   0.0045     0.00110
12  02/08/2021  09:07:00   0.0034     0.00275
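
To make the mapping step easier to follow, here is a minimal sketch that splits the one-liner into its parts and checks the "no duplicate dates" assumption first. It uses a small subset of the question's data; the names lookup and per_row, and the extra 03/08/2021 row (a date that does not exist in df2), are hypothetical and only there for illustration.

```python
import pandas as pd

# Subset of the question's data. The last row (03/08/2021) is a made-up
# date that is missing from df2, added only to show what happens then.
df1 = pd.DataFrame({
    "Date": ["29/07/2021", "29/07/2021", "30/07/2021", "03/08/2021"],
    "Column1": [0.0001, 0.002, 0.005, 0.01],
})
df2 = pd.DataFrame({
    "Date": ["29/07/2021", "30/07/2021"],
    "Column1": [0.0056, 0.0594],
})

# The approach relies on each date appearing only once in df2.
if df2["Date"].duplicated().any():
    raise ValueError("df2 has duplicate dates; aggregate them first")

# Series indexed by Date: this is the lookup table passed to map().
lookup = df2.set_index("Date")["Column1"]

# One df2 value per df1 row, aligned on df1's index.
# Dates that are not in df2 come back as NaN.
per_row = df1["Date"].map(lookup)

df1["New Column"] = df1["Column1"] - per_row
print(df1)
```

Rows whose date is missing from df2 end up as NaN in New Column, so it may be worth checking df1['New Column'].isna() afterwards if the two dataframes are not guaranteed to cover the same dates.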