Bar chart showing proportions: fill not working

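I'm using plotnine to draw some plots. When I try to display a bar chart of proportions instead of counts, the fill parameter stops having any effect. I noticed that removing the group=1 argument makes fill work again. Without group=1, however, the proportions are not computed correctly.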

Here is my function:

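from mizani.formatters import percent_format
from plotnine import (ggplot, aes, geom_bar, theme, facet_wrap,
                      scale_fill_manual, scale_y_continuous)
import pandas as pd

def plot_churn(df_):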
   color_dict = {
       'Stayed': 'green',
       'Churned': 'red'
   }

   myplot = ggplot(data=df_, mapping=aes(x='Flag_Churned', fill='Flag_Churned'))
   myplot += geom_bar(mapping=aes(y="stat(prop)", group=1))
   myplot += theme(subplots_adjust={'right': 0.71})
   myplot += facet_wrap('Flag_Treat')
   myplot += scale_fill_manual(color_dict)
   myplot += scale_y_continuous(labels=percent_format())
   print(myplot)

For example, with the following pandas DataFrame:

data = {'Churn': [0,0,0,1,1,0,1,1], 'Flag_Treat': ['treated','treated','treated','treated','not treated','not treated','not treated','not treated'],
    'Flag_Churned': ['Stayed', 'Stayed', 'Stayed', 'Churned', 'Churned', 'Stayed', 'Churned', 'Churned']}
df = pd.DataFrame(data=data)

The bars in the resulting plot are not filled by 'Flag_Churned'.

What am I doing wrong?

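The issue is that stat(prop) computes the proportions within each group, separately for every facet. Setting the group aesthetic to 1 gives you the correct proportions, but it overrides the grouping by fill, so the bars lose their fill. Coming from R, I know how to do this computation on the fly there. However, the simpler approach, and the one recommended most of the time in R, is to aggregate the data before passing it to ggplot and to use geom_col instead of geom_bar: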

from mizani.formatters import percent_format
from plotnine import (ggplot, aes, geom_col, theme, facet_wrap,
                      scale_fill_manual, scale_y_continuous)
import pandas as pd

data = {'Churn': [0,0,0,1,1,0,1,1], 'Flag_Treat': ['treated','treated','treated','treated','not treated','not treated','not treated','not treated'],
    'Flag_Churned': ['Stayed', 'Stayed', 'Stayed', 'Churned', 'Churned', 'Stayed', 'Churned', 'Churned']}
df = pd.DataFrame(data=data)

# Inspect the raw counts per category and treatment group
print(df.groupby(['Flag_Churned', 'Flag_Treat']).size())

def plot_churn(df_):
   color_dict = {
       'Stayed': 'green',
       'Churned': 'red'
   }
                                                 
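   # Count rows per facet (Flag_Treat) and bar (Flag_Churned)
   df_ = df_.groupby(['Flag_Treat', 'Flag_Churned']).size().reset_index(name='n')
   # Convert counts to proportions within each facet
   df_['prop'] = df_['n'] / df_.groupby('Flag_Treat')['n'].transform('sum')

   myplot = ggplot(data=df_, mapping=aes(x='Flag_Churned', y='prop', fill='Flag_Churned'))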
   myplot += geom_col()
   myplot += theme(subplots_adjust={'right': 0.71})
   myplot += facet_wrap('Flag_Treat')
   myplot += scale_fill_manual(color_dict)
   myplot += scale_y_continuous(labels=percent_format())
   print(myplot)

plot_churn(df)
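
To see why group=1 and fill pull in different directions, it can help to reproduce the proportion calculation by hand. The sketch below is only a pandas approximation of what the count stat does (count divided by the total count of the group the bar belongs to), not plotnine's own code. The column names count, prop_group_1 and prop_default are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Flag_Treat': ['treated'] * 4 + ['not treated'] * 4,
    'Flag_Churned': ['Stayed', 'Stayed', 'Stayed', 'Churned',
                     'Churned', 'Stayed', 'Churned', 'Churned'],
})

# One row per facet (Flag_Treat) and bar (Flag_Churned) with its row count.
counts = (
    df.groupby(['Flag_Treat', 'Flag_Churned'])
    .size()
    .reset_index(name='count')
)

# With group=1, all bars in a facet belong to one group, so each bar is
# divided by the facet total. This is the proportion we actually want.
facet_total = counts.groupby('Flag_Treat')['count'].transform('sum')
counts['prop_group_1'] = counts['count'] / facet_total

# Without an explicit group, the discrete x/fill mapping puts every bar in
# its own group, so each bar is divided by its own count.
own_total = counts.groupby(['Flag_Treat', 'Flag_Churned'])['count'].transform('sum')
counts['prop_default'] = counts['count'] / own_total

print(counts)
```

In this approximation prop_default is 1 for every bar, which matches the symptom of proportions being wrong once group=1 is removed, while prop_group_1 sums to 1 within each facet. Aggregating up front sidesteps the conflict because geom_col plots the values as given and leaves the grouping free for fill.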