Creating percentage stacked bar chart using groupby
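I am looking at home ownership within the different levels of loan status, and I would like to display this using a percentage stacked bar chart.
I have been able to create a frequency stacked bar chart with this code: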
df_trunc1=df[['loan_status','home_ownership','id']]
sub_df1=df_trunc1.groupby(['loan_status','home_ownership'])['id'].count()
sub_df1.unstack().plot(kind='bar',stacked=True,rot=1,figsize=(8,8),title="Home ownership across Loan Types")
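This gives me this picture: [image 1]
But I can't figure out how to convert the chart to percentages. For example, within the Default group, I want to see what percentage have a mortgage, what percentage own, and so on.
Here is my groupby table for context: [image 2]
Thanks!!
I think you need to convert to percentages yourself: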
import pandas as pd

d = {('Default', 'MORTGAGE'): 498, ('Default', 'OWN'): 110, ('Default', 'RENT'): 611, ('Fully Paid', 'MORTGAGE'): 3100, ('Fully Paid', 'NONE'): 1, ('Fully Paid', 'OTHER'): 5, ('Fully Paid', 'OWN'): 558, ('Fully Paid', 'RENT'): 2568, ('Late (16-30 days)', 'MORTGAGE'): 1101, ('Late (16-30 days)', 'OWN'): 260, ('Late (16-30 days)', 'RENT'): 996, ('Late (31-120 days)', 'MORTGAGE'): 994, ('Late (31-120 days)', 'OWN'): 243, ('Late (31-120 days)', 'RENT'): 1081}
# Rebuild the groupby result: counts indexed by (loan_status, home_ownership).
sub_df1 = pd.DataFrame(list(d.values()), columns=['count'], index=pd.MultiIndex.from_tuples(list(d.keys())))

# Plot 1: one bar per home ownership type, split by loan status.
sub_df2 = sub_df1.unstack()
sub_df2.columns = sub_df2.columns.droplevel()  # Drop `count` label.
sub_df2 = sub_df2.div(sub_df2.sum())  # Each home_ownership column now sums to 1.
sub_df2.T.plot(kind='bar', stacked=True, rot=1, figsize=(8, 8),
               title="Home ownership across Loan Types")

# Plot 2: one bar per loan status, split by home ownership type.
sub_df3 = sub_df1.unstack().T
sub_df3.index = sub_df3.index.droplevel()  # Drop `count` label.
sub_df3 = sub_df3.div(sub_df3.sum())  # Each loan_status column now sums to 1.
sub_df3.T.plot(kind='bar', stacked=True, rot=1, figsize=(8, 8),
               title="Home ownership across Loan Types")
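As a possible shortcut for the second plot, the row-wise percentages can probably be computed without transposing, by dividing each row by its own total. The following is only a sketch: it reuses a subset of the counts above, and the names `subset`, `counts`, `table` and `pct` are introduced here purely for illustration.

```python
import pandas as pd

# Two of the loan status groups from the counts above (subset for brevity).
subset = {
    ("Default", "MORTGAGE"): 498,
    ("Default", "OWN"): 110,
    ("Default", "RENT"): 611,
    ("Fully Paid", "MORTGAGE"): 3100,
    ("Fully Paid", "NONE"): 1,
    ("Fully Paid", "OTHER"): 5,
    ("Fully Paid", "OWN"): 558,
    ("Fully Paid", "RENT"): 2568,
}

counts = pd.Series(
    list(subset.values()),
    index=pd.MultiIndex.from_tuples(
        list(subset.keys()), names=["loan_status", "home_ownership"]
    ),
)

# Rows: loan_status, columns: home_ownership; missing combinations become 0.
table = counts.unstack(fill_value=0)

# Divide every row by its row total (axis=0 aligns the totals with the rows),
# then scale to percentages so each loan status sums to 100.
pct = table.div(table.sum(axis=1), axis=0) * 100
```

Calling `pct.plot(kind='bar', stacked=True)` on the result should then draw one bar per loan status, each reaching 100.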
I calculated the percentages by transposing the dataframe twice. Doing it step by step shows the logic more clearly.
# transpose so that each loan_status becomes a column
to_plot = sub_df1.unstack()
to_plot_transpose = to_plot.transpose()
# calc %: divide each loan_status column by its total
to_plot_transpose_pct = to_plot_transpose.div(to_plot_transpose.sum())
# transpose back so that each loan_status is a row (one bar each)
to_plot_pct = to_plot_transpose_pct.transpose()
# plot
to_plot_pct.plot(kind='bar', stacked=True, rot=1, figsize=(8, 8),
                 title="Home ownership across Loan Types")
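If the raw rows are still available, another option might be to skip the groupby and let `pd.crosstab` normalize each row directly. This is a sketch under assumptions: the tiny `loans` frame below is made-up illustration data, not the asker's dataset, and only the column names `loan_status` and `home_ownership` come from the question.

```python
import pandas as pd

# Hypothetical raw data: one row per loan (values invented for illustration).
loans = pd.DataFrame(
    {
        "loan_status": ["Default", "Default", "Default",
                        "Fully Paid", "Fully Paid", "Fully Paid", "Fully Paid"],
        "home_ownership": ["RENT", "RENT", "OWN",
                           "MORTGAGE", "MORTGAGE", "RENT", "OWN"],
    }
)

# normalize="index" divides each row by its row total, so every
# loan_status row holds the share of each home_ownership type.
shares = pd.crosstab(loans["loan_status"], loans["home_ownership"],
                     normalize="index")
```

Here `shares` holds fractions between 0 and 1 per loan status; `shares.plot(kind='bar', stacked=True)` should give the percentage stacked chart.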