Why does pandas show deleted rows in a groupby statement?

Question

I have a simple pandas DataFrame:

       A     B
0   test  fast
1  train  slow
2   test  fast
3  train  slow

Now I drop the rows where column A == 'test':

df2.drop(df2[df2['A'] == 'test'].index, inplace=True)

and get the result:

       A     B
1  train  slow
3  train  slow

Now I run a groupby statement:

df2.groupby('A').B.count()

and get the result:

A
test     0
train    2

Why does the output still show 'test' even though I already deleted those rows? How can I avoid this?

Thanks, Simon

Answer 1

In your example, the dtype of A is category; see below.

After converting A to category, I get the same result as you:

df.A=df.A.astype('category')
df1=df.drop(df[df['A'] == 'test'].index)
df1.groupby('A').B.count()

A
test     0
train    2
Name: B, dtype: int64
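The empty group can be checked directly. A categorical column keeps its full list of declared categories after rows are dropped, and groupby reports every declared category when `observed=False` (the default in older pandas versions). The sketch below rebuilds the question's frame by hand; the variable names are illustrative only, and `observed=False` is passed explicitly so the behavior does not depend on the installed version's default.

```python
import pandas as pd

# Rebuild the frame from the question (values assumed from the printed output).
df = pd.DataFrame({"A": ["test", "train", "test", "train"],
                   "B": ["fast", "slow", "fast", "slow"]})
df["A"] = df["A"].astype("category")

# Drop the rows where A == 'test'.
df1 = df.drop(df[df["A"] == "test"].index)

# Values actually present in the column after the drop.
remaining_values = list(df1["A"].unique())

# Categories still declared on the dtype; dropping rows does not touch these.
declared_categories = list(df1["A"].cat.categories)

# With observed=False, groupby emits one group per declared category.
counts = df1.groupby("A", observed=False)["B"].count()
print(remaining_values, declared_categories)
print(counts)
```

So the rows are gone, but 'test' is still a declared category, which is why it shows up with a count of 0.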

To get the output you want, just convert the original df.A to a string:

df.A=df.A.astype('str')
df1=df.drop(df[df['A'] == 'test'].index)
df1.groupby('A').B.count()

Out[201]: 
A
train    2
Name: B, dtype: int64
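If the column should stay categorical, two other options may be worth trying instead of casting to string: pass `observed=True` to `groupby`, or drop the unused categories with `Series.cat.remove_unused_categories()`. This is a minimal sketch on a hand-built copy of the question's frame, not the only way to do it; the variable names are illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (values assumed from the printed output).
df = pd.DataFrame({"A": ["test", "train", "test", "train"],
                   "B": ["fast", "slow", "fast", "slow"]})
df["A"] = df["A"].astype("category")
df1 = df.drop(df[df["A"] == "test"].index)

# Option 1: only report categories that actually occur in the data.
counts_observed = df1.groupby("A", observed=True)["B"].count()

# Option 2: remove the now-unused 'test' category from the dtype itself,
# so even observed=False no longer produces an empty group.
df1 = df1.assign(A=df1["A"].cat.remove_unused_categories())
counts_trimmed = df1.groupby("A", observed=False)["B"].count()

print(counts_observed)
print(counts_trimmed)
```

Option 1 leaves the dtype untouched and only changes what groupby reports; option 2 changes the column so later operations no longer see 'test' either.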
