Preserve group columns/index after applying fillna/ffill/bfill in pandas

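I have the data below. Newer pandas versions do not keep the grouping columns in the result after a grouped fillna/ffill/bfill. Is there a way to get the filled data with the grouping columns still present?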

import io
import pandas as pd

data = """one;two;three
1;1;10
1;1;nan
1;1;nan
1;2;nan
1;2;20
1;2;nan
1;3;nan
1;3;nan"""

df = pd.read_csv(io.StringIO(data), sep=";")
print(df)
   one  two  three
0    1    1   10.0
1    1    1    NaN
2    1    1    NaN
3    1    2    NaN
4    1    2   20.0
5    1    2    NaN
6    1    3    NaN
7    1    3    NaN

print(df.groupby(['one','two']).ffill())
   three
0   10.0
1   10.0
2   10.0
3    NaN
4   20.0
5   20.0
6    NaN
7    NaN

Is this what you expect?

df['three']= df.groupby(['one','two'])['three'].ffill()
print(df)

# Output:
   one  two  three
0    1    1   10.0
1    1    1   10.0
2    1    1   10.0
3    1    2    NaN
4    1    2   20.0
5    1    2   20.0
6    1    3    NaN
7    1    3    NaN
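
If there are several value columns, the same assign-back idea can be generalized. Below is a minimal sketch, assuming every non-key column should be forward-filled within its group. The names `keys`, `value_cols` and `filled` are only illustrative and do not come from the question.

```python
import io

import pandas as pd

data = """one;two;three
1;1;10
1;1;nan
1;1;nan
1;2;nan
1;2;20
1;2;nan
1;3;nan
1;3;nan"""

df = pd.read_csv(io.StringIO(data), sep=";")

keys = ["one", "two"]                                   # grouping columns
value_cols = [c for c in df.columns if c not in keys]   # here just ["three"]

# Work on a copy so the original frame stays untouched.
filled = df.copy()

# Forward-fill inside each group and write the result back over the value
# columns; the key columns are never touched, so they stay in place.
filled[value_cols] = df.groupby(keys)[value_cols].ffill()
print(filled)
```

Working on a copy is a design choice for the sketch; assigning straight back into `df`, as in the answer above, works the same way.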

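Yes. Set the grouping columns as the index first and then group on them; the keys are then preserved as index levels of the result, as shown below: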

df = pd.read_csv(io.StringIO(data), sep=";")
df.set_index(['one','two'], inplace=True)
df.groupby(['one','two']).ffill()
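
A self-contained sketch of this approach follows. It assumes the end goal is to have the keys back as ordinary columns, so it adds a `reset_index()` step that is not part of the answer above. It also groups with `level=` to make it explicit that the keys are index levels. The names `indexed`, `filled` and `result` are illustrative.

```python
import io

import pandas as pd

data = """one;two;three
1;1;10
1;1;nan
1;1;nan
1;2;nan
1;2;20
1;2;nan
1;3;nan
1;3;nan"""

df = pd.read_csv(io.StringIO(data), sep=";")
keys = ["one", "two"]

# Move the grouping columns into the index (returns a new frame).
indexed = df.set_index(keys)

# Group on the index levels; ffill keeps the original index, so the
# keys survive as the MultiIndex of the result.
filled = indexed.groupby(level=keys).ffill()

# Turn the index levels back into regular columns.
result = filled.reset_index()
print(result)
```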

In recent pandas versions, if we want to keep the groupby columns, we need to use apply here:

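# group_keys=False keeps the original flat index (needed on pandas 2.x to get the output below)
out = df.groupby(['one','two'], group_keys=False).apply(lambda x: x.ffill())
print(out)

# Output: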
   one  two  three
0    1    1   10.0
1    1    1   10.0
2    1    1   10.0
3    1    2    NaN
4    1    2   20.0
5    1    2   20.0
6    1    3    NaN
7    1    3    NaN
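
How `groupby(...).apply` treats the grouping columns and `group_keys` has changed between pandas releases, so the output above may look different depending on the installed version. One possible alternative that avoids `apply` is to fill first and then join the key columns back on the row index. This is only a sketch, not the method from the answer above; `keys`, `filled` and `result` are illustrative names.

```python
import io

import pandas as pd

data = """one;two;three
1;1;10
1;1;nan
1;1;nan
1;2;nan
1;2;20
1;2;nan
1;3;nan
1;3;nan"""

df = pd.read_csv(io.StringIO(data), sep=";")
keys = ["one", "two"]

# Grouped ffill returns only the non-key columns, indexed like df.
filled = df.groupby(keys).ffill()

# Re-attach the key columns by joining on the shared row index.
result = df[keys].join(filled)
print(result)
```

Since both frames share the original row index, the join is a plain one-to-one alignment and the row order is unchanged.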