Subtract two dataframes with different numbers of rows, taking the Date column into consideration
I am writing the following code:
import pandas as pd

df1 = pd.DataFrame()
df1['Date'] = ["29/07/2021", "29/07/2021", "29/07/2021", "29/07/2021", "30/07/2021", "30/07/2021", "30/07/2021", "30/07/2021", "31/07/2021", "31/07/2021", "01/08/2021", "01/08/2021", "02/08/2021"]
df1['Time'] = ["06:48:00", "06:52:00", "06:56:00", "06:59:00", "07:14:00", "07:24:00", "07:40:00", "07:45:00", "08:42:00", "08:45:00", "08:52:00", "08:55:00", "09:07:00"]
df1['Column1'] = [0.0001, 0.002, 0.004, 0.5, 0.005, 0.0006, 0.08, 0.07, 0.003, 0.02, 0.0002, 0.0045, 0.0034]
df2 = pd.DataFrame()
df2['Date'] = ["29/07/2021", "30/07/2021", "31/07/2021", "01/08/2021", "02/08/2021"]
df2['Column1'] = [0.0056, 0.0594, 0.959, 0.0034, 0.00065]
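I want to subtract the df2['Column1'] value from the df1['Column1'] values for each day, and store the result in a new dataframe.
For example, for the first date (29/07/2021), we would have a dataframe like this: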
Date New_Column
29/07/2021 0.0001-0.0056
29/07/2021 0.002-0.0056
29/07/2021 0.004-0.0056
29/07/2021 0.5-0.0056
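And so on: for each date, the single value associated with that date in df2 is subtracted from each of the values associated with that date in df1.
Thank you.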
If there are no duplicate dates in df2, we can build a mapping from date to value and map it onto the Date column of df1. Then we simply subtract the corresponding values:
df1['New Column'] = df1['Column1'] - df1['Date'].map(df2.set_index('Date')['Column1'])
Output:
Date Time Column1 New Column
0 29/07/2021 06:48:00 0.0001 -0.00550
1 29/07/2021 06:52:00 0.0020 -0.00360
2 29/07/2021 06:56:00 0.0040 -0.00160
3 29/07/2021 06:59:00 0.5000 0.49440
4 30/07/2021 07:14:00 0.0050 -0.05440
5 30/07/2021 07:24:00 0.0006 -0.05880
6 30/07/2021 07:40:00 0.0800 0.02060
7 30/07/2021 07:45:00 0.0700 0.01060
8 31/07/2021 08:42:00 0.0030 -0.95600
9 31/07/2021 08:45:00 0.0200 -0.93900
10 01/08/2021 08:52:00 0.0002 -0.00320
11 01/08/2021 08:55:00 0.0045 0.00110
12 02/08/2021 09:07:00 0.0034 0.00275
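The map-based approach relies on the dates in df2 being unique. As an alternative, here is a minimal sketch using a left merge, which lets pandas check that assumption explicitly. The shortened sample data, the "_df2" suffix, and the names merged and result are illustrative choices, not part of the original question.

```python
import pandas as pd

# Shortened sample data, for illustration only
df1 = pd.DataFrame({
    "Date": ["29/07/2021", "29/07/2021", "30/07/2021"],
    "Column1": [0.0001, 0.002, 0.005],
})
df2 = pd.DataFrame({
    "Date": ["29/07/2021", "30/07/2021"],
    "Column1": [0.0056, 0.0594],
})

# Left merge keeps every row of df1 and attaches the matching df2 value.
# validate="many_to_one" makes pandas raise an error if df2 has duplicate dates.
merged = df1.merge(
    df2,
    on="Date",
    how="left",
    suffixes=("", "_df2"),  # df2's column becomes "Column1_df2"
    validate="many_to_one",
)

# Subtract the per-date df2 value from each df1 row, then drop the helper column
merged["New Column"] = merged["Column1"] - merged["Column1_df2"]
result = merged.drop(columns="Column1_df2")
print(result)
```

Any date in df1 that has no match in df2 would get NaN in New Column, which is the same behaviour as the map-based version.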