I am trying to replace NaN values with mean values

I have to replace the NaN values in s_months and incidents with their respective means, working in a Jupyter notebook.
Input data:

    Types   c_years     o_periods   s_months    incidents
0   1       1           1           127.0       0.0
1   1       1           2           63.0        0.0
2   1       2           1           1095.0      3.0
3   1       2           2           1095.0      4.0
4   1       3           1           1512.0      6.0
5   1       3           2           3353.0      18.0
6   1       4           1           NaN         NaN
7   1       4           2           2244.0      11.0
14  2       4           1           NaN         NaN

I tried the code below, but it does not seem to work. I also tried different variations, such as replacing the transform.

df.fillna['s_months'] = df.fillna(df.grouby(['types' , 'o_periods']['s_months','incidents']).tranform('mean'),inplace = True)

Mean values of s_months and incidents per (Types, o_periods) group:

                 s_months  incidents
Types o_periods                     
1     1               911          3
      2              1688          8
2     1             26851         36
      2             14440         36
3     1               914          2
      2               862          1
4     1               296          0
      2               889          3
5     1               663          4
      2              1046          6

Starting from your DataFrame:

>>> import pandas as pd
>>> from io import StringIO

>>> df = pd.read_csv(StringIO("""
Types,c_years,o_periods,s_months,incidents
0,1,1,1,127.0,0.0
1,1,1,2,63.0,0.0
2,1,2,1,1095.0,3.0
3,1,2,2,1095.0,4.0
4,1,3,1,1512.0,6.0
5,1,3,2,3353.0,18.0
6,1,4,1,NaN,NaN
7,1,4,2,2244.0,11.0
14,2,4,1,NaN,NaN"""), sep=',')
>>> df
    Types   c_years     o_periods   s_months    incidents
0   1       1           1           127.0       0.0
1   1       1           2           63.0        0.0
2   1       2           1           1095.0      3.0
3   1       2           2           1095.0      4.0
4   1       3           1           1512.0      6.0
5   1       3           2           3353.0      18.0
6   1       4           1           NaN         NaN
7   1       4           2           2244.0      11.0
14  2       4           1           NaN         NaN
>>> df[['c_years', 's_months', 'incidents']] = df.groupby(['Types', 'o_periods']).transform(lambda x: x.fillna(x.mean()))
>>> df
    Types   c_years     o_periods   s_months    incidents
0   1             1     1           127.000000      0.0
1   1             1     2           63.000000       0.0
2   1             2     1           1095.000000     3.0
3   1             2     2           1095.000000     4.0
4   1             3     1           1512.000000     6.0
5   1             3     2           3353.000000     18.0
6   1             4     1           911.333333      3.0
7   1             4     2           2244.000000     11.0
14  2             4     1           NaN             NaN

The last NaN remains because that row belongs to the last group, which contains no values in the s_months and incidents columns, so there is no mean to fill with.
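
If the rows in all-NaN groups also need a value, one option is a second pass that falls back to the column-wide mean. The following is a minimal sketch under that assumption (the fallback rule is not part of the question, and `cols`, `group_means` and `filled` are names chosen here for illustration):

```python
import numpy as np
import pandas as pd

# Same data as in the question, including the original row labels.
df = pd.DataFrame(
    {
        "Types": [1, 1, 1, 1, 1, 1, 1, 1, 2],
        "c_years": [1, 1, 2, 2, 3, 3, 4, 4, 4],
        "o_periods": [1, 2, 1, 2, 1, 2, 1, 2, 1],
        "s_months": [127.0, 63.0, 1095.0, 1095.0, 1512.0, 3353.0, np.nan, 2244.0, np.nan],
        "incidents": [0.0, 0.0, 3.0, 4.0, 6.0, 18.0, np.nan, 11.0, np.nan],
    },
    index=[0, 1, 2, 3, 4, 5, 6, 7, 14],
)

cols = ["s_months", "incidents"]

# First pass: fill each NaN with the mean of its (Types, o_periods) group.
group_means = df.groupby(["Types", "o_periods"])[cols].transform("mean")
filled = df[cols].fillna(group_means)

# Second pass (assumption): groups without any observed value fall back
# to the column-wide mean of the original, unfilled data.
df[cols] = filled.fillna(df[cols].mean())

print(df)
```

Whether a column-wide fallback is appropriate depends on the data; leaving those rows as NaN, as in the output above, is also a valid choice.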

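Try this: df['s_months'] = df['s_months'].fillna(df['s_months'].mean())
df['s_months'].mean() computes the mean while ignoring NaN values. Note that this fills with the overall column mean, not the per-group mean.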
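
As a small illustration of the NaN handling mentioned above, here is a sketch on a made-up four-value Series (the values are taken from the question, but the Series itself and the variable names are only for demonstration):

```python
import numpy as np
import pandas as pd

s = pd.Series([127.0, 63.0, np.nan, 2244.0], name="s_months")

# mean() skips NaN by default (skipna=True), so only the three
# observed values contribute: (127 + 63 + 2244) / 3.
overall_mean = s.mean()

# With skipna=False, a single NaN makes the whole result NaN.
strict_mean = s.mean(skipna=False)

# fillna returns a new Series, so assign the result back to keep it.
s = s.fillna(overall_mean)

print(overall_mean, strict_mean)
print(s)
```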

Your code is close. You can modify it as follows to make it work:

df[['s_months','incidents']] = df[['s_months','incidents']].fillna(df.groupby(['Types' , 'o_periods'])[['s_months','incidents']].transform('mean'))

Input data:

    Types  c_years  o_periods  s_months  incidents
0       1        1          1     127.0        0.0
1       1        1          2      63.0        0.0
2       1        2          1    1095.0        3.0
3       1        2          2    1095.0        4.0
4       1        3          1    1512.0        6.0
5       1        3          2    3353.0       18.0
6       1        4          1       NaN        NaN
7       1        4          2    2244.0       11.0
14      2        4          1       NaN        NaN

Output:

    Types  c_years  o_periods     s_months  incidents
0       1        1          1   127.000000        0.0
1       1        1          2    63.000000        0.0
2       1        2          1  1095.000000        3.0
3       1        2          2  1095.000000        4.0
4       1        3          1  1512.000000        6.0
5       1        3          2  3353.000000       18.0
6       1        4          1   911.333333        3.0
7       1        4          2  2244.000000       11.0
14      2        4          1          NaN        NaN
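
The reason this works is that transform('mean') returns values labelled with the original row index, so fillna can match each NaN with the mean of its own group. A minimal sketch of that alignment, using a reduced four-row subset of the question's data (the subset and the names `per_group` and `per_row` are chosen here for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Types": [1, 1, 1, 2],
        "o_periods": [1, 1, 1, 1],
        "s_months": [127.0, 1095.0, np.nan, np.nan],
    },
    index=[0, 2, 6, 14],
)

# A plain aggregation collapses each group to one row, indexed by the group keys.
per_group = df.groupby(["Types", "o_periods"])["s_months"].mean()

# transform broadcasts each group mean back onto the original row labels.
per_row = df.groupby(["Types", "o_periods"])["s_months"].transform("mean")

# per_row shares df's index, so fillna pairs every NaN with its own group's mean.
# The group (2, 1) has no observed values, so its mean is NaN and row 14 stays NaN.
df["s_months"] = df["s_months"].fillna(per_row)

print(per_group)
print(df)
```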