How to calculate percentage for row counts in groupby in Python and generate bar plot
I want to calculate the percentage for each id and generate a bar plot.
Here is a sample of my data:
id LGA Status
1 Banyule Referred
2 Hepburn Referred
3 Kingston Not Referred
4 Darebin Not Referred
5 Darebin Managed Externally
6 Darebin Managed Externally
7 Mansfield Managed Externally
8 Casey Referred
9 Mitchell Referred
10 Mitchell Not Referred
11 Moreland Referred
12 Whittlesea Not Referred
13 Glen Eira Not Referred
14 Dandenong Referred
15 Hume Not Referred
16 Hume Managed Externally
17 Campaspe Not Referred
18 Melbourne Not Referred
19 Melbourne Not Referred
I computed the counts for the "LGA" and "Status" columns and generated bar plots.
Sample code:
df['Status'].value_counts().plot(kind='bar')
df['LGA'].value_counts().plot(kind='bar')
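If I want to plot percentages for the same columns and generate a separate bar plot for each, I am not sure how to do this elegantly.
Expected output (I produced the following using Excel):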
Status % of Grand Total
Not Referred 58.42%
Referred 23.68%
Managed Externally 17.89%
Grand Total 100.00%
Expected bar plot: [image]
Any help would be appreciated.
I believe this is what you are looking for:
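import matplotlib.pyplot as plt
# Row count per Status, largest first (computed once and reused)
counts = df.groupby('Status').size().sort_values(ascending=False)
temp_df = counts / counts.sum() * 100  # share of all rows, in percent
ax = temp_df.plot(kind='bar')
ax.bar_label(ax.containers[0])  # annotate each bar with its value
plt.show()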
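As a possible alternative, `value_counts(normalize=True)` should give the same shares without dividing by hand. Below is a minimal sketch, assuming the `Status` values from the 19 sample rows in the question; the helper name `percent_of_total` is made up for illustration. Since this is only the sample, the numbers will not match the Excel table, which was presumably computed on the full data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Status column of the 19 sample rows from the question (ids 1 to 19).
df = pd.DataFrame({"Status": [
    "Referred", "Referred", "Not Referred", "Not Referred",
    "Managed Externally", "Managed Externally", "Managed Externally",
    "Referred", "Referred", "Not Referred", "Referred", "Not Referred",
    "Not Referred", "Referred", "Not Referred", "Managed Externally",
    "Not Referred", "Not Referred", "Not Referred",
]})


def percent_of_total(frame, column):
    """Hypothetical helper: share of rows per value, in percent, largest first."""
    return frame[column].value_counts(normalize=True).mul(100).round(2)


status_pct = percent_of_total(df, "Status")

ax = status_pct.plot(kind="bar")
ax.set_ylabel("% of Grand Total")
# Label each bar as a percentage with two decimals.
ax.bar_label(ax.containers[0], fmt="%.2f%%")
plt.tight_layout()
```

Call `plt.show()` (or `plt.savefig(...)`) afterwards to see the figure. The same helper can be reused for the other column on the full data, e.g. `percent_of_total(df, "LGA")`.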