RANK based on SORT by multiple columns
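I am trying to group by ID_1 and rank the rows within each group, ordering by ID_2 descending and TotalRevenue ascending.
How can I use both ascending and descending order in the rank function at the same time?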
import pandas as pd
df = pd.DataFrame({
'ID_1':[1,1,1,2,2,2,3,3],
'ID_2':[100,100,35,30,30,20,50,50],
'TotalRevenue':[9000,2000,750,1000,600,500,500,300]})
df['RANK']= df.groupby(['ID_1'])['ID_2','TotalRevenue'].rank(method='dense',ascending=False).astype(int)
Desired output:
ID_1 ID_2 TotalRevenue Rank
1 100 9000 2
1 100 2000 1
1 35 750 3
2 30 1000 2
2 30 600 1
2 20 500 3
3 50 500 2
3 50 300 1
You can use sort_values with a per-column ascending list, then assign groupby cumcount + 1 within each ID_1 group:
out = df.sort_values(['ID_2','TotalRevenue'],ascending=[False,True])
out['Rank'] = out.groupby("ID_1").cumcount()+1
print(out.sort_index())
ID_1 ID_2 TotalRevenue Rank
0 1 100 9000 2
1 1 100 2000 1
2 1 35 750 3
3 2 30 1000 2
4 2 30 600 1
5 2 20 500 3
6 3 50 500 2
7 3 50 300 1
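
One caveat with cumcount: it numbers rows by position, so two rows in the same group with identical ID_2 and TotalRevenue would still get different ranks. If tied rows should share a rank (which method='dense' in the question seems to ask for), something like the sketch below might work. The helper name add_dense_rank and the DenseRank column are made up for illustration and are not part of pandas.

```python
import pandas as pd

def add_dense_rank(frame, group_col, keys, ascending):
    """Hypothetical helper: dense rank within group_col, mixed sort order."""
    cols = [group_col] + keys
    # Sort so that rows of one group are contiguous and in the desired order.
    ordered = frame.sort_values(cols, ascending=[True] + ascending)
    # True for the first occurrence of each (group, keys) combination.
    is_new = ~ordered.duplicated(cols)
    out = frame.copy()
    # Running count of distinct key combinations per group = dense rank.
    # Assignment aligns on the index, so the original row order is kept.
    out['DenseRank'] = is_new.astype(int).groupby(ordered[group_col]).cumsum()
    return out

df = pd.DataFrame({
    'ID_1': [1, 1, 1, 2, 2, 2, 3, 3],
    'ID_2': [100, 100, 35, 30, 30, 20, 50, 50],
    'TotalRevenue': [9000, 2000, 750, 1000, 600, 500, 500, 300],
})

ranked = add_dense_rank(df, 'ID_1', ['ID_2', 'TotalRevenue'], [False, True])
print(ranked)
```

On the sample data there are no ties, so DenseRank matches the Rank column above. The two approaches only differ when a group contains repeated (ID_2, TotalRevenue) pairs.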