Select groups having more than x members

Question

Is there a way in pandas to select, from a grouped dataframe, only the groups that have more than x members?

Something like:

grouped = df.groupby(['a', 'b'])
dupes = [g[['a', 'b', 'c', 'd']] for _, g in grouped if len(g) > 1]

I can't find a solution in the docs or on SO.

Answer 1

Use filter, which keeps the rows of every group for which the function returns True:

grouped.filter(lambda x: len(x) > 1)

Example:

In [64]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a':[0,0,1,2],'b':np.arange(4)})
df

Out[64]:
   a  b
0  0  0
1  0  1
2  1  2
3  2  3

In [65]:
df.groupby('a').filter(lambda x: len(x)>1)

Out[65]:
   a  b
0  0  0
1  0  1
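
To generalize this to an arbitrary threshold x, something like the sketch below might help. The helper name groups_larger_than and its parameters are my own, not part of pandas. It also shows a transform('size') variant that builds a boolean mask instead of calling a Python lambda per group; that is a different technique from filter and is usually faster on frames with many groups, though I have not benchmarked it here.

```python
import numpy as np
import pandas as pd


def groups_larger_than(df, keys, x):
    """Hypothetical helper: keep rows whose group (by keys) has more than x rows."""
    return df.groupby(keys).filter(lambda g: len(g) > x)


def groups_larger_than_mask(df, keys, x):
    """Same result via a boolean mask built from per-row group sizes."""
    # transform('size') returns, for every row, the size of the group it belongs to
    sizes = df.groupby(keys)[keys[0]].transform("size")
    return df[sizes > x]


df = pd.DataFrame({"a": [0, 0, 1, 2], "b": np.arange(4)})

via_filter = groups_larger_than(df, ["a"], 1)
via_mask = groups_larger_than_mask(df, ["a"], 1)

# Both keep only the rows where a == 0 (index 0 and 1)
print(via_filter)
print(via_filter.equals(via_mask))
```

Both versions return the matching rows in their original order with the original index, so the result can be grouped again if you need the groups themselves rather than the rows.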
