获取 pandas 数据框中多列的表面加权平均值

Get surface weighted average of multiple columns in pandas dataframe

我想获取数据框中各列的表面加权平均值。我有两个表面列和两个 U 值列。我想创建一个额外的列 'U_av'(表面加权平均 U 值)和 U_av = (A1*U1 + A2*U2) / (A1+A2)。如果其中一列出现 NaN,则应返回 NaN。

初始df:

   ID  A1   A2    U1  U2
0  14   2  1.0  10.0  11
1  16   2  2.0  12.0  12
2  18   2  3.0  24.0  13
3  20   2  NaN   8.0  14
4  22   4  5.0  84.0  15
5  24   4  6.0  84.0  16

期望输出:

   ID  A1   A2    U1  U2   U_av
0  14   2  1.0  10.0  11  10.33
1  16   2  2.0  12.0  12     12
2  18   2  3.0  24.0  13   17.4
3  20   2  NaN   8.0  14    NaN
4  22   4  5.0  84.0  15  45.66
5  24   4  6.0  84.0  16   43.2

代码:

import numpy as np

import pandas as pd



df = pd.DataFrame({"ID": [14,16,18,20,22,24],
                   "A1": [2,2,2,2,4,4],
  
                   "U1": [10,12,24,8,84,84],
   
                   "A2": [1,2,3,np.nan,5,6],
   
                   "U2": [11,12,13,14,15,16]})



print(df)

#the mean of two columns U1 and U2 and dropping NaN is easy (U1+U2/2 in this case) 
#but what to do for the surface-weighted mean (U_av = (A1*U1 + A2*U2) / (A1+A2))?
df.loc[:,'Umean'] = df[['U1','U2']].dropna().mean(axis=1)



EDIT:
adding to the solutions below:
df["U_av"] = (df.A1.mul(df.U1) + df.A2.mul(df.U2)).div(df[['A1','A2']].sum(axis=1))


希望我没猜错:

df['U_av'] = (df['A1']*df['U1'] + df['A2']*df['U2']) / (df['A1']+df['A2'])
df

    ID  A1  U1  A2  U2  U_av
0   14  2   10  1.0 11  10.333333
1   16  2   12  2.0 12  12.000000
2   18  2   24  3.0 13  17.400000
3   20  2   8   NaN 14  NaN
4   22  4   84  5.0 15  45.666667
5   24  4   84  6.0 16  43.200000

试试这个代码:

numerator = df.A1.mul(df.U1) + (df.A2.mul(df.U2))
denominator = df.A1.add(df.A2)
df["U_av"] = numerator.div(denominator)

df

    ID  A1  A2  U1  U2  U_av
0   14  2   1.0 10.0    11  10.333333
1   16  2   2.0 12.0    12  12.000000
2   18  2   3.0 24.0    13  17.400000
3   20  2   NaN 8.0     14  NaN
4   22  4   5.0 84.0    15  45.666667
5   24  4   6.0 84.0    16  43.200000