pandas: how to select the max of two columns if a column is greater than x, otherwise select the mean?

Question

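I have a df that looks like this, and I want to add an adj_mean column: if either of the two columns (Avg or rolling_mean) is 0, it should take the max of the two; otherwise it should take the mean of the two columns.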

 ID  Avg  rolling_mean  adj_mean (goal to have this column)
 0   5    0             5
 1   6    6.3           6.15
 2   5    8             6.5
 3   4    0             4

I was able to get the max of the columns with this code:

 df["adj_mean"]=df[["Avg", "rolling_mean"]].max(axis=1)

But I'm not sure how to take the mean instead when both values are greater than zero.

Thanks a lot!

Answer 1

One approach is to treat 0 as NaN and then simply compute the mean:

 import numpy as np
 df['adj_mean'] = df.replace({0: np.nan})[["Avg", "rolling_mean"]].mean(axis=1)

Out[1]: 
   rolling_mean  Avg  adj_mean
0           0.0    5      5.00
1           6.3    6      6.15
2           8.0    5      6.50
3           0.0    4      4.00
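
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the `cols` name is just a convenience I introduced. Unlike the one-liner above, it applies the replacement only to the two value columns, which is my own choice so that an ID of 0 is left untouched.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": [0, 1, 2, 3],
    "Avg": [5, 6, 5, 4],
    "rolling_mean": [0, 6.3, 8, 0],
})

cols = ["Avg", "rolling_mean"]  # hypothetical helper name, not from the original post

# Turn zeros into NaN in the two value columns only, then take the row-wise mean.
# mean() skips NaN by default, so a row with a single zero falls back to the other value.
df["adj_mean"] = df[cols].replace(0, np.nan).mean(axis=1)

print(df)
```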

By default, df.mean() skips null values. According to the docs:

skipna : bool, default True
    Exclude NA/null values when computing the result.
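
The rule from the question can also be spelled out explicitly with `np.where`, which may be easier to adapt if the condition changes later. This is only a sketch and not part of the original answer. The extra fifth row, where both columns are 0, is an assumption I added to show one difference: the NaN-based approach returns NaN for that row (there is nothing left to average), while the explicit version returns 0.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Avg": [5, 6, 5, 4, 0],
    # The last row is an added both-zero case; it is not in the question's data.
    "rolling_mean": [0, 6.3, 8, 0, 0],
})

cols = ["Avg", "rolling_mean"]  # hypothetical helper name

# True for rows where at least one of the two columns is exactly 0.
either_zero = (df[cols] == 0).any(axis=1)

# Explicit rule: max when either column is 0, plain mean otherwise.
df["adj_mean"] = np.where(
    either_zero,
    df[cols].max(axis=1),
    df[cols].mean(axis=1),
)

# NaN-based approach from the answer, kept alongside for comparison.
df["adj_mean_skipna"] = df[cols].replace(0, np.nan).mean(axis=1)

print(df)
```

The two columns agree on the four rows from the question and differ only on the added both-zero row.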

python

dataset

np

pandas

data-science