Replace selected values of one column with the median of another column, subject to a condition

Question

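I hope you know the famous Titanic problem. This is what I have done so far by following this tutorial. Now I want to replace the NaN values in the Age column with the median of a subset of that same column, where the subset consists of the rows that share a given "Title" value.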

For example, for the rows where Title == "Mr", I want the missing Age values to be filled with the median Age of the "Mr" rows.

I tried this:

for val in data["Title"].unique():
    median_age = data.loc[data.Title == val, "Age"].median()
    data.loc[data.Title == val, "Age"].fillna(median_age, inplace=True)

But Age still shows NaN afterwards. How should I do this?

Answer 1

Use combine_first to fill the NaN values. My dataset has no Title column, so this groups by Sex instead, but the approach is the same:

df['Age'] = df['Age'].combine_first(df.groupby('Sex')['Age'].transform('median'))
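As a sketch of how this might look with a Title column, here is the same idea applied to a small made-up frame (the toy data and the names group_median, filled and looped are assumptions for illustration, not part of the original dataset). It also shows one way the loop from the question could be made to work: data.loc[mask, "Age"] returns a new Series, so calling fillna(..., inplace=True) on it changes only that temporary object, and the result has to be assigned back into the frame.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the Titanic frame
data = pd.DataFrame({
    "Title": ["Mr", "Mr", "Mr", "Miss", "Miss", "Miss"],
    "Age": [20.0, 40.0, np.nan, 10.0, np.nan, 30.0],
})

# Option 1: compute the per-Title median, aligned to every row,
# then use it only where Age is missing
group_median = data.groupby("Title")["Age"].transform("median")
filled = data["Age"].combine_first(group_median)

# Option 2: keep the loop, but assign the median back through .loc
# instead of calling fillna(inplace=True) on a temporary Series
looped = data.copy()
for val in looped["Title"].unique():
    in_group = looped["Title"] == val
    median_age = looped.loc[in_group, "Age"].median()
    looped.loc[in_group & looped["Age"].isna(), "Age"] = median_age

print(filled.tolist())
print(looped["Age"].tolist())
```

Both options should give the same filled column here. The groupby version avoids the explicit loop, which is usually the more idiomatic choice in pandas; assigning filled back with data["Age"] = filled would update the original frame.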


pandas

python-3.7