pandas select top N values for each multi index group

Question

I have a DataFrame:

import pandas as pd

data = {'fruit': ['pear','pear','pear','banana', 'banana', 'banana', 'cherry', 'pear','cherry','pear','banana', 'banana', 'banana','banana', 'cherry', 'cherry','banana', 'cherry', 'cherry', 'cherry', 'cherry'],
'country': ['france','france', 'france', 'albania', 'albania', 'albania','france', 'france','france','france', 'albania', 'albania','france','france', 'france', 'france','france', 'france', 'france', 'france', 'armenia'],
'id': ['01','01','01','01','01','01','02','02','03','03','011', '011', '011','011', '6', '6','6', '5', '5', '5','5'],
'month1': ['january','november','january','january','january','january','january', 'november','march','march', 'november', 'march', 'january','january', 'march', 'january','november', 'march', 'march', 'november','july'],
'month': ['january','november','january','january','january','january','january', 'november','march','march', 'november', 'march', 'january','january', 'march', 'january','november', 'march', 'march', 'november','july']        
}
df = pd.DataFrame(data, columns = ['fruit','country', 'id','month1', 'month'])

I built a pivot table with df.pivot_table(values='month', index=['fruit','country'], columns='month1', aggfunc='count').reset_index(), which gives me the counts for each multi-index group (fruit and country).

I need to get the top 3 values for each group, but it could be the top N for any N. Can anyone help with this?

Output DataFrame

Answer 1

Check whether this format works for you:

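N = 3  # number of largest counts to keep per group
# Count rows for each (fruit, country, month) combination
df = df.groupby(["fruit", "country", "month"]).count()["month1"].rename("count")
# Keep the N largest counts within each (fruit, country) group
df = df.groupby(["fruit", "country"]).nlargest(N)
# nlargest prepends the group keys to the index; drop the duplicated levels
df.index = df.index.droplevel([0,1])
df = df.reset_index()
df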
>>
     fruit  country     month  count
0   banana  albania   january       3
1   banana  albania     march       1
2   banana  albania  november       1
3   banana   france   january       2
4   banana   france  november       1
5   cherry  armenia      july       1
6   cherry   france     march       4
7   cherry   france   january       2
8   cherry   france  november       1
9     pear   france   january       2
10    pear   france  november       2
11    pear   france     march       1
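
If dropping the duplicated index levels feels awkward, the same idea can probably be written as count, sort, then take the first N rows per group. The following is only a rough sketch: the helper name top_n_per_group and the small toy frame are made up for illustration and are not part of the original answer. Ties are broken alphabetically by the value column, which is an assumption of this sketch.

```python
import pandas as pd


def top_n_per_group(frame, group_cols, value_col, n=3):
    """Return the n most frequent value_col entries within each group.

    Hypothetical helper: group_cols is a list of column names.
    """
    # One row per (group..., value) combination with its row count
    counts = (
        frame.groupby(group_cols + [value_col])
        .size()
        .reset_index(name="count")
    )
    # Sort groups ascending, counts descending, then value_col to break ties
    counts = counts.sort_values(
        group_cols + ["count", value_col],
        ascending=[True] * len(group_cols) + [False, True],
    )
    # head(n) keeps the first n rows of every group in the sorted order
    return counts.groupby(group_cols).head(n).reset_index(drop=True)


# Tiny made-up frame, only to show the call
toy = pd.DataFrame({
    "fruit":   ["pear", "pear", "pear", "pear", "kiwi", "kiwi"],
    "country": ["france"] * 4 + ["italy"] * 2,
    "month":   ["january", "march", "january", "may", "july", "july"],
})
result = top_n_per_group(toy, ["fruit", "country"], "month", n=2)
print(result)
```

With the DataFrame from the question, the call would be top_n_per_group(df, ["fruit", "country"], "month", n=3), run on the original df before it is overwritten by the steps above.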
