Pandas:将每月数据的第 1 个数据帧除以具有每月平均值的第 2 个数据帧

Pandas: Divide 1 dataframe of monthly data by 2nd dataframe with monthly averages

我在数据框 df1 中有月度数据,我需要除以第二个数据框 df2 来计算可变性。我的 df1 如下所示。我试图找到答案,但看不出如何将每个月度值除以此处给出的独特的月度长期 (2019-2021) 平均值。

index plant_name  year      month   wind_speed_ms
324  CAETITE III  2019      1       8.188548
325  CAETITE III  2019      2       5.506577
326  CAETITE III  2019      3       6.449032
327  CAETITE III  2019      4       6.077875
328  CAETITE III  2019      5       6.932110
329  CAETITE III  2019      6       8.905403
330  CAETITE III  2019      7       8.180067
331  CAETITE III  2019      8       9.252164
332  CAETITE III  2019      9       8.850250
333  CAETITE III  2019     10       8.013858
334  CAETITE III  2019     11       6.940972
335  CAETITE III  2019     12       7.244973
336  CAETITE III  2020      1       5.690524
337  CAETITE III  2020      2       5.782543
338  CAETITE III  2020      3       6.387433
339  CAETITE III  2020      4       7.097597
340  CAETITE III  2020      5       7.142957
341  CAETITE III  2020      6       8.653625
342  CAETITE III  2020      7       8.509987
343  CAETITE III  2020      8       9.164261
344  CAETITE III  2020      9       9.219514
345  CAETITE III  2020     10       7.110269
346  CAETITE III  2020     11       6.884597
347  CAETITE III  2020     12       6.144812
348  CAETITE III  2021      1       7.082836
349  CAETITE III  2021      2       5.706280
350  CAETITE III  2021      3       7.029341
351  CAETITE III  2021      4       6.676764
352  CAETITE III  2021      5       7.318804
353  CAETITE III  2021      6       7.403028
354  CAETITE III  2021      7       7.811183
355  CAETITE III  2021      8       9.243642

我的第二个数据框 - df2 - 看起来像这样,这是月平均值或分母:

df2:

     plant_name  month  wind_speed_ms
0   CAETITE III      1       6.764807
1   CAETITE III      2       6.624946
2   CAETITE III      3       6.661645
3   CAETITE III      4       7.127637
4   CAETITE III      5       7.361365
5   CAETITE III      6       8.416312
6   CAETITE III      7       8.746169
7   CAETITE III      8       8.915520
8   CAETITE III      9       8.824938
9   CAETITE III     10       8.037235
10  CAETITE III     11       6.810558
11  CAETITE III     12       6.435099

答案需要如下所示:

index   plant_name  year    month   var
0   CAETITE III 2019    1   21.046
1   CAETITE III 2019    2   -16.881
2   CAETITE III 2019    3   -3.192
3   CAETITE III 2019    4   -14.728
4   CAETITE III 2019    5   -5.831
5   CAETITE III 2019    6   5.811
6   CAETITE III 2019    7   -6.473
7   CAETITE III 2019    8   3.776
8   CAETITE III 2019    9   0.287
9   CAETITE III 2019    10  -0.291
10  CAETITE III 2019    11  1.915
11  CAETITE III 2019    12  12.585
12  CAETITE III 2020    1   -15.880
13  CAETITE III 2020    2   -12.716
14  CAETITE III 2020    3   -4.116
15  CAETITE III 2020    4   -0.421
16  CAETITE III 2020    5   -2.967
17  CAETITE III 2020    6   2.820
18  CAETITE III 2020    7   -2.700
19  CAETITE III 2020    8   2.790
20  CAETITE III 2020    9   4.471
21  CAETITE III 2020    10  -11.533
22  CAETITE III 2020    11  1.087
23  CAETITE III 2020    12  -4.511
24  CAETITE III 2021    1   4.701
25  CAETITE III 2021    2   -13.867
26  CAETITE III 2021    3   5.520
27  CAETITE III 2021    4   -6.326
28  CAETITE III 2021    5   -0.578
29  CAETITE III 2021    6   -12.040
30  CAETITE III 2021    7   -10.690
31  CAETITE III 2021    8   3.680

谢谢,

您可以匹配df1df2month,使用.map()得到df2的平均风速。然后,将df1的风速除以刚才映射的平均风速减1再乘以100,得到var需要的值,如下:

df1['var'] = (df1['wind_speed_ms'] / df1['month'].map(df2.set_index('month')['wind_speed_ms']) - 1) * 100

结果:

print(df1)

     plant_name  year  month  wind_speed_ms        var
0   CAETITE III  2019      1       8.188548  21.046291
1   CAETITE III  2019      2       5.506577 -16.881179
2   CAETITE III  2019      3       6.449032  -3.191599
3   CAETITE III  2019      4       6.077875 -14.728051
4   CAETITE III  2019      5       6.932110  -5.831188
5   CAETITE III  2019      6       8.905403   5.811227
6   CAETITE III  2019      7       8.180067  -6.472571
7   CAETITE III  2019      8       9.252164   3.775932
8   CAETITE III  2019      9       8.850250   0.286824
9   CAETITE III  2019     10       8.013858  -0.290859
10  CAETITE III  2019     11       6.940972   1.914880
11  CAETITE III  2019     12       7.244973  12.585261
12  CAETITE III  2020      1       5.690524 -15.880468
13  CAETITE III  2020      2       5.782543 -12.715621
14  CAETITE III  2020      3       6.387433  -4.116281
15  CAETITE III  2020      4       7.097597  -0.421458
16  CAETITE III  2020      5       7.142957  -2.966950
17  CAETITE III  2020      6       8.653625   2.819679
18  CAETITE III  2020      7       8.509987  -2.700405
19  CAETITE III  2020      8       9.164261   2.789977
20  CAETITE III  2020      9       9.219514   4.471148
21  CAETITE III  2020     10       7.110269 -11.533394
22  CAETITE III  2020     11       6.884597   1.087121
23  CAETITE III  2020     12       6.144812  -4.510995
24  CAETITE III  2021      1       7.082836   4.701228
25  CAETITE III  2021      2       5.706280 -13.866770
26  CAETITE III  2021      3       7.029341   5.519598
27  CAETITE III  2021      4       6.676764  -6.325701
28  CAETITE III  2021      5       7.318804  -0.578167
29  CAETITE III  2021      6       7.403028 -12.039525
30  CAETITE III  2021      7       7.811183 -10.690235
31  CAETITE III  2021      8       9.243642   3.680346