How to include a sort criterion in a pivot_table function?

Question

Below is my code, which uses the pivot_table function on the DataFrame df.

import numpy as np
import pandas as pd

df = pd.DataFrame({'State' : ['B','B','A','A','C', 'C'],
           'Age' : ['1 to 5', '6 to 10', '1 to 5', '6 to 10', '1 to 5', '6 to 10'],
           'Vaccinated' : [80, 20, 30, 60, 10, 15],
           'Population': [100, 100, 100, 100, 100, 100],
           'Percentage' : [0.80, 0.20, 0.30, 0.60, 0.10,0.15]})

# Aggregate the three value columns by (State, Age)
df1 = pd.pivot_table(df,values=["Vaccinated", "Population","Percentage"],index=["State", "Age"], aggfunc=np.sum)

Result of the code above:

                   Percentage  Population  Vaccinated
State Age                                        
A     1 to 5         0.30         100          30
      6 to 10        0.60         100          60
B     1 to 5         0.80         100          80
      6 to 10        0.20         100          20
C     1 to 5         0.10         100          10
      6 to 10        0.15         100          15

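However, I want to sort my records so that State B is at the top, followed by A, then C. The rationale is that State B is 100% vaccinated (80%+20%), State A is 90% (60%+30%), and State C is 25%. I tried adding a sort several times, but I ran into errors.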

Could I get some advice on how to add a sort criterion during or after pivot_table, so that I get the following result?

               Percentage  Population  Vaccinated
State Age                                        
B     1 to 5         0.80         100          80
      6 to 10        0.20         100          20
A     1 to 5         0.30         100          30
      6 to 10        0.60         100          60
C     1 to 5         0.10         100          10
      6 to 10        0.15         100          15

Answer 1

One approach is to build a helper column holding each group's sum, sort df1 by it, and then drop it:

df1 = df1.assign(Sum=df1.groupby(level=0).Vaccinated.transform('sum')).\
    sort_values(by='Sum', ascending=False).drop(columns=['Sum'])
print(df1)

Prints:

               Percentage  Population  Vaccinated
State Age                                        
B     1 to 5         0.80         100          80
      6 to 10        0.20         100          20
A     1 to 5         0.30         100          30
      6 to 10        0.60         100          60
C     1 to 5         0.10         100          10
      6 to 10        0.15         100          15
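
For reference, here is a self-contained sketch of the helper-column approach that can be run end to end. The variable names pivot, state_total and sorted_pivot are illustrative, not from the answer. It also passes kind='stable' to sort_values, on the assumption that the rows within each state should keep their original Age order when several rows share the same helper value; the default sort algorithm does not guarantee that.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['B', 'B', 'A', 'A', 'C', 'C'],
    'Age': ['1 to 5', '6 to 10', '1 to 5', '6 to 10', '1 to 5', '6 to 10'],
    'Vaccinated': [80, 20, 30, 60, 10, 15],
    'Population': [100, 100, 100, 100, 100, 100],
    'Percentage': [0.80, 0.20, 0.30, 0.60, 0.10, 0.15],
})

pivot = pd.pivot_table(
    df,
    values=['Vaccinated', 'Population', 'Percentage'],
    index=['State', 'Age'],
    aggfunc='sum',
)

# Per-state total of Vaccinated, repeated on every row of that state.
state_total = pivot.groupby(level='State')['Vaccinated'].transform('sum')

sorted_pivot = (
    pivot.assign(Sum=state_total)                     # temporary helper column
         .sort_values(by='Sum', ascending=False,
                      kind='stable')                  # keep Age order within ties
         .drop(columns='Sum')                         # remove the helper again
)
print(sorted_pivot)
```

The helper column only exists inside the method chain, so pivot itself is left unchanged.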

Answer 2

We can use groupby sum at the State level to get the total Vaccinated per State, then sort_values to determine the order the states should be in, and then reindex to reorder the rows according to the group totals:

df1 = df1.reindex(
    index=df1.groupby(level='State')['Vaccinated'].sum()
        .sort_values(ascending=False).index,
    level='State'
)

df1:

               Percentage  Population  Vaccinated
State Age                                        
B     1 to 5         0.80         100          80
      6 to 10        0.20         100          20
A     1 to 5         0.30         100          30
      6 to 10        0.60         100          60
C     1 to 5         0.10         100          10
      6 to 10        0.15         100          15
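
If the same reordering is needed in several places, the reindex approach could be wrapped in a small helper. The sketch below is one possible way to do that; sort_by_group_total is a hypothetical name introduced here for illustration, not a pandas function, and it assumes the pivot has a named index level to group on.

```python
import pandas as pd


def sort_by_group_total(pivot, level, column, ascending=False):
    """Reorder the groups of one index level by each group's total of `column`.

    Hypothetical helper for illustration; rows inside a group keep their order.
    """
    totals = pivot.groupby(level=level)[column].sum()
    order = totals.sort_values(ascending=ascending).index
    return pivot.reindex(index=order, level=level)


df = pd.DataFrame({
    'State': ['B', 'B', 'A', 'A', 'C', 'C'],
    'Age': ['1 to 5', '6 to 10', '1 to 5', '6 to 10', '1 to 5', '6 to 10'],
    'Vaccinated': [80, 20, 30, 60, 10, 15],
    'Population': [100, 100, 100, 100, 100, 100],
    'Percentage': [0.80, 0.20, 0.30, 0.60, 0.10, 0.15],
})

pivot = pd.pivot_table(
    df,
    values=['Vaccinated', 'Population', 'Percentage'],
    index=['State', 'Age'],
    aggfunc='sum',
)

result = sort_by_group_total(pivot, level='State', column='Vaccinated')
print(result)
```

Because only the State level is reindexed, no helper column is added and the Age rows under each state stay in place.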

python

pivot-table

pandas