How to retain blanks in group by in pandas?

Question

I need to group my DataFrame in pandas, but when I do, null values are converted to zero, and I want to keep them as null. I'm not sure how to do that in pandas.

Input:

Id  Country  Product  sales  qty  price
1   Germany  shoes    32      1   NaN
1   Germany  shoes    32      1    2
2   England  Shoes    22      1   NaN
2   England  Shoes    22      1   NaN
3   Austria  Shoes    0       3   NaN
3   Austria  Shoes    NaN     NaN NaN

Expected output:

Id  Country  Product  sales  qty  price
1   Germany  shoes    64      2   2
2   England  Shoes    44      2   NaN
3   Austria  Shoes    0       3   NaN

Answer 1

Use the parameter min_count=1 in sum:

df = df.groupby(['Id','Country','Product'], as_index=False).sum(min_count=1)
print(df)
   Id  Country Product  sales  qty  price
0   1  Germany   shoes   64.0  2.0    2.0
1   2  England   Shoes   44.0  2.0    NaN
2   3  Austria   Shoes    0.0  3.0    NaN
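
To make the difference visible, here is a minimal sketch that rebuilds the question's table by hand and compares the default sum with sum(min_count=1). The variable names (keys, default_sum, kept_nan) are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# The question's input table, rebuilt by hand (assumed to match the post).
df = pd.DataFrame({
    "Id": [1, 1, 2, 2, 3, 3],
    "Country": ["Germany", "Germany", "England", "England", "Austria", "Austria"],
    "Product": ["shoes", "shoes", "Shoes", "Shoes", "Shoes", "Shoes"],
    "sales": [32, 32, 22, 22, 0, np.nan],
    "qty": [1, 1, 1, 1, 3, np.nan],
    "price": [np.nan, 2, np.nan, np.nan, np.nan, np.nan],
})

keys = ["Id", "Country", "Product"]

# Default behaviour: a group whose values are all NaN sums to 0.
default_sum = df.groupby(keys, as_index=False).sum()

# min_count=1: at least one non-NaN value is required, otherwise the result is NaN.
kept_nan = df.groupby(keys, as_index=False).sum(min_count=1)

print(default_sum)
print(kept_nan)
```

min_count is the number of non-NA values a group must contain for the sum to be computed. With the default of 0, an all-NaN group yields 0, which is why the price column loses its blanks.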

Answer 2

You can use mask with isna + groupby + all:

out = (df.groupby(['Id','Country','Product']).sum()
       .mask(df[['sales','qty','price']].isna()
             .groupby([df['Id'], df['Country'], df['Product']]).all())
       .reset_index())

The same idea, written differently:

cols = ['Id','Country','Product']
g = df.groupby(cols)
out = (g.sum()
       .mask(g.apply(lambda x: x.drop(columns=cols).isna().all()))
       .reset_index())

Output:

   Id  Country Product  sales  qty  price
0   1  Germany   shoes   64.0  2.0    2.0
1   2  England   Shoes   44.0  2.0    NaN
2   3  Austria   Shoes    0.0  3.0    NaN
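
As a rough check, the sketch below splits the mask approach into named steps and compares it with sum(min_count=1) on the question's data. The DataFrame is rebuilt by hand, and the names value_cols, sums, all_missing, masked and expected are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# The question's input table, rebuilt by hand (assumed to match the post).
df = pd.DataFrame({
    "Id": [1, 1, 2, 2, 3, 3],
    "Country": ["Germany", "Germany", "England", "England", "Austria", "Austria"],
    "Product": ["shoes", "shoes", "Shoes", "Shoes", "Shoes", "Shoes"],
    "sales": [32, 32, 22, 22, 0, np.nan],
    "qty": [1, 1, 1, 1, 3, np.nan],
    "price": [np.nan, 2, np.nan, np.nan, np.nan, np.nan],
})

keys = ["Id", "Country", "Product"]
value_cols = ["sales", "qty", "price"]

# Step 1: ordinary group sums (all-NaN groups become 0 here).
sums = df.groupby(keys)[value_cols].sum()

# Step 2: per group and column, True where every value is NaN.
all_missing = df[value_cols].isna().groupby([df[k] for k in keys]).all()

# Step 3: put NaN back wherever the whole group was missing.
masked = sums.mask(all_missing).reset_index()

# Reference result using min_count=1.
expected = df.groupby(keys, as_index=False).sum(min_count=1)

print(masked)
```

On this data both approaches give the same frame. The mask version is more verbose, but it makes the rule explicit: a cell is blanked only when every value in its group is missing.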

