Mosaic plot with percentage and count values as labels in a pandas DataFrame

Question

I have a pandas DataFrame like this:

     LEVEL_1      LEVEL_2    Freq  Percentage
0       HIGH          HIGH   8842      17.684
1    AVERAGE           LOW   2802       5.604
2        LOW           LOW  22198      44.396
3    AVERAGE       AVERAGE   6804      13.608
4        LOW       AVERAGE   2030       4.060
5       HIGH       AVERAGE   3666       7.332
6    AVERAGE          HIGH   2887       5.774
7        LOW          HIGH    771       1.542

I can get a mosaic plot of LEVEL_1 against LEVEL_2 with:

 from statsmodels.graphics.mosaicplot import mosaic
 mosaic(df, ['LEVEL_1','LEVEL_2'])

I just want to place Freq and Percentage at the center of each mosaic tile. How can I do that?

Answer 1

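Here is a start. Note that I had to add a row of zeros to the DataFrame for the missing HIGH/LOW combination so that every tile can be labeled. You can make the labels look nicer with string formatting in the lambda function. You will also need to reorder the headers.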

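import io

import pandas as pd
from statsmodels.graphics.mosaicplot import mosaic

# Rebuild the table from the question as whitespace-separated text.
d = io.StringIO("""     LEVEL_1      LEVEL_2    Freq  Percentage
       HIGH          HIGH   8842      17.684
    AVERAGE           LOW   2802       5.604
        LOW           LOW  22198      44.396
    AVERAGE       AVERAGE   6804      13.608
        LOW       AVERAGE   2030       4.060
       HIGH       AVERAGE   3666       7.332
    AVERAGE          HIGH   2887       5.774
        LOW          HIGH    771       1.542""")
# sep=r"\s+" replaces delim_whitespace=True, which is deprecated in recent pandas.
df = pd.read_csv(d, sep=r"\s+")

# The HIGH/LOW combination is missing, so add it with zeros.
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
zero_row = pd.DataFrame([{'LEVEL_1': 'HIGH', 'LEVEL_2': 'LOW', 'Freq': 0, 'Percentage': 0.0}])
df = pd.concat([df, zero_row], ignore_index=True)
df = df.sort_values(['LEVEL_1', 'LEVEL_2'])
df = df.set_index(['LEVEL_1', 'LEVEL_2'])
print(df)

# Each tile key k is a (LEVEL_1, LEVEL_2) tuple; label the tile with that row's values.
mosaic(df['Freq'], labelizer=lambda k: df.loc[k].values);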
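
As a rough sketch of the "string formatting" suggestion above, the lambda could be swapped for a small named function that puts the count and the percentage on separate lines. The names make_label and plot_labeled_mosaic are made up for this illustration and are not part of statsmodels, and the one-decimal percentage format is just one possible choice.

```python
import pandas as pd

# Data copied from the question, plus the zero row for the missing HIGH/LOW pair.
rows = [
    ("HIGH", "HIGH", 8842, 17.684),
    ("AVERAGE", "LOW", 2802, 5.604),
    ("LOW", "LOW", 22198, 44.396),
    ("AVERAGE", "AVERAGE", 6804, 13.608),
    ("LOW", "AVERAGE", 2030, 4.060),
    ("HIGH", "AVERAGE", 3666, 7.332),
    ("AVERAGE", "HIGH", 2887, 5.774),
    ("LOW", "HIGH", 771, 1.542),
    ("HIGH", "LOW", 0, 0.0),
]
df = pd.DataFrame(rows, columns=["LEVEL_1", "LEVEL_2", "Freq", "Percentage"])
df = df.sort_values(["LEVEL_1", "LEVEL_2"]).set_index(["LEVEL_1", "LEVEL_2"])


def make_label(key):
    """Build the tile text for one (LEVEL_1, LEVEL_2) key (hypothetical helper)."""
    freq = int(df.loc[key, "Freq"])
    pct = float(df.loc[key, "Percentage"])
    # Count on the first line, percentage rounded to one decimal on the second.
    return f"{freq}\n{pct:.1f}%"


def plot_labeled_mosaic():
    """Draw the mosaic with formatted labels (hypothetical wrapper)."""
    # Imported here so make_label can be tried without statsmodels installed.
    from statsmodels.graphics.mosaicplot import mosaic
    return mosaic(df["Freq"], labelizer=make_label)
```

Calling plot_labeled_mosaic() and then matplotlib's plt.show() should draw the figure with these labels. The zero-count HIGH/LOW tile still gets a label, so you may prefer to return an empty string when the count is 0.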
