Get the count and percentage by grouping values in Pandas

I have the following dataframe in pandas:

Score   Risk
30      High Risk
50      Medium Risk
70      Medium Risk
40      Medium Risk
80      Low Risk
35      High Risk
65      Medium Risk
90      Low Risk

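I want to get the count for each value of the Risk column, each group's percentage of the total, and a Total row, as shown below: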

Expected output
Risk Category   Count   Percentage
High Risk       2       25.00
Medium Risk     4       50.00
Low Risk        2       25.00
Total           8       100.00

Can anyone explain how I can achieve the expected output?

You can use GroupBy.size to get the counts, compute the percentages from them, join both with concat, add the Total row and, if necessary, convert the index to a column at the end:

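# Number of rows for each Risk value
s = df.groupby('Risk')['Score'].size()
# Counts and percentages side by side
df = pd.concat([s, s / s.sum() * 100], axis=1, keys=('count','Percentage'))
# Append the Total row (column sums)
df.loc['Total'] = df.sum().astype(int)
print(df)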
             count  Percentage
Risk                          
High Risk        2        25.0
Low Risk         2        25.0
Medium Risk      4        50.0
Total            8       100.0


df = df.rename_axis('Risk Category').reset_index()
print (df)
  Risk Category  count  Percentage
0     High Risk      2        25.0
1      Low Risk      2        25.0
2   Medium Risk      4        50.0
3         Total      8       100.0
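
Note that groupby sorts the group keys by default, so the rows come out in alphabetical order rather than the High / Medium / Low order shown in the expected output. A minimal self-contained sketch of the same approach is given here, assuming the sample data from the question; the names `counts` and `out` are illustrative only. Passing sort=False keeps the groups in order of first appearance, which for this data happens to match the expected order.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Score': [30, 50, 70, 40, 80, 35, 65, 90],
    'Risk': ['High Risk', 'Medium Risk', 'Medium Risk', 'Medium Risk',
             'Low Risk', 'High Risk', 'Medium Risk', 'Low Risk'],
})

# sort=False keeps the groups in order of first appearance
counts = df.groupby('Risk', sort=False).size()

# Counts and percentages side by side
out = pd.concat([counts, counts / counts.sum() * 100],
                axis=1, keys=['Count', 'Percentage'])

# Append the Total row, then make sure Count is an integer column again
out.loc['Total'] = out.sum()
out['Count'] = out['Count'].astype(int)

# Turn the index into a regular 'Risk Category' column
out = out.rename_axis('Risk Category').reset_index()
print(out)
```

If a fixed two-decimal display is wanted, the Percentage column could be rounded or formatted at print time; the values themselves are unchanged.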

You can also use pivot_table for a fairly clean answer, since it can create the margin totals for you automatically.

summary = (
    # margins expects a boolean; margins_name labels the totals row
    df.pivot_table(
        index='Risk', aggfunc='count', margins=True, margins_name='Total'
    )
    # Each count as a share of the Total row
    .assign(Percentage=lambda t: t['Score'] / t.loc['Total', 'Score'] * 100)
    .rename_axis('Risk Category')
    .reset_index()
)

print(summary)
  Risk Category  Score  Percentage
0     High Risk      2        25.0
1      Low Risk      2        25.0
2   Medium Risk      4        50.0
3         Total      8       100.0
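
In this result the count column keeps the name Score, because pivot_table counted the non-null Score values per group. A rough self-contained variant is sketched here, assuming the question's sample data; it names values='Score' explicitly and renames the column to Count, which is a presentational choice and not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Score': [30, 50, 70, 40, 80, 35, 65, 90],
    'Risk': ['High Risk', 'Medium Risk', 'Medium Risk', 'Medium Risk',
             'Low Risk', 'High Risk', 'Medium Risk', 'Low Risk'],
})

summary = (
    # Count Score values per Risk group; margins=True appends a 'Total' row
    df.pivot_table(index='Risk', values='Score', aggfunc='count',
                   margins=True, margins_name='Total')
    # Rename the counted column to match the expected output header
    .rename(columns={'Score': 'Count'})
    # Each count as a share of the Total row
    .assign(Percentage=lambda t: t['Count'] / t.loc['Total', 'Count'] * 100)
    .rename_axis('Risk Category')
    .reset_index()
)
print(summary)
```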