How to plot a stacked bar with annotations for multiple groups

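In the histogram, a gap appears between two of the bars. Does anyone know why?

I also get this error:

ValueError: The number of FixedLocator locations (11), usually from a call to set_ticks, does not match the number of ticklabels (10).

The csv file has only 2 columns: one with the country name and the other with the type of medal won. Each row is one medal, with its type and country.

The link to the file is: https://github.com/jpiedehierroa/files/blob/main/Libro1.csv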

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from pathlib import Path
my_csv = Path("C:/Usersjosep/Desktop/Libro1.csv")
df = pd.read_csv("Libro1.csv", sep=',')

# or load from github repo link
url = 'https://raw.githubusercontent.com/jpiedehierroa/files/main/Libro1.csv'
df = pd.read_csv(url)    

# Prepare data
x_var = 'countries'
groupby_var = 'type'
df_agg = df.loc[:,[x_var, groupby_var]].groupby(groupby_var)
vals = [df[x_var].values.tolist() for i, df in df_agg]

# Draw
plt.figure(figsize=(10,10), dpi= 100)
colors= ("#CD7F32","silver","gold")
n, bins, patches = plt.hist(vals, df[x_var].unique().__len__(), stacked=True, density=False, color=colors[:len(vals)])

# Decoration
plt.legend(["bronze", "silver","gold"], loc="upper right")
plt.title(f"Histogram of medals achieved by ${x_var}$ colored by ${groupby_var}$ in Tokyo 2020", fontsize=18)
plt.text(2,80,"138")
plt.xlabel(x_var)
plt.ylabel("amount of medals by type")
plt.ylim(0, 130)
plt.xticks(ticks=bins, labels=np.unique(df[x_var]).tolist(), rotation=90, horizontalalignment='left')
plt.show()

Test data

countries,type
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,gold
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,silver
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
USA,bronze
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,gold
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,silver
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
China,bronze
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,gold
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,silver
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
Japan,bronze
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,gold
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,silver
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
GB,bronze
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,gold
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,silver
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
ROC,bronze
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,gold
Australia,silver
Australia,silver
Australia,silver
Australia,silver
Australia,silver
Australia,silver
Australia,silver
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Australia,bronze
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,gold
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,silver
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
Netherlands,bronze
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,gold
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,silver
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
France,bronze
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,gold
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,silver
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Germany,bronze
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,gold
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,silver
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
Italy,bronze
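  • This is easier to implement as a stacked bar plot, so reshape the dataframe with pandas.crosstab and plot using pandas.DataFrame.plot with kind='bar' and stacked=True.
    • This should not be implemented with plt.hist, because that is more convoluted and the pandas plot method is easier to use directly.
    • A histogram is also more appropriate when the x values are a continuous range of numbers rather than discrete categorical values.
  • ct.iloc[:, :-1] selects every column except the last one, 'tot', to be plotted as bars.
  • Use matplotlib.pyplot.bar_label to add annotations.
    • ax.bar_label(ax.containers[2], padding=3) uses label_type='edge' by default, which annotates the edge with the cumulative sum ('center' annotates with the patch value), as shown in the plot.
      • The [2] in ax.containers[2] selects only the top containers to annotate with the cumulative sum. The containers are indexed from 0, starting at the bottom.
    • See this answer for more details and examples.
    • This answer shows how to annotate the old way, without .bar_label. It is not recommended.
    • This answer shows how to customize labels to prevent annotations for values under a given size.
  • Tested in python 3.10, pandas 1.3.5, matplotlib 3.5.1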
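
On the error in the question: a histogram with N bins is described by N + 1 bin edges, and plt.hist returns those edges as bins. Passing them as ticks together with the 10 country labels is consistent with the reported 11 locations against 10 labels. A minimal sketch with numpy.histogram, where the integer codes are made-up stand-ins for the country names:

```python
import numpy as np

# Hypothetical stand-in data: 10 categories encoded as the integers 0..9,
# three observations each
codes = np.repeat(np.arange(10), 3)

# Ask for one bin per category
counts, edges = np.histogram(codes, bins=10)

print(len(counts))  # number of bins
print(len(edges))   # number of bin edges, one more than the number of bins
```

With a bar plot there are no bin edges at all: each category gets exactly one tick, so the label count always matches.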

Load and shape the DataFrame

import pandas as pd

# load from github repo link
url = 'https://raw.githubusercontent.com/jpiedehierroa/files/main/Libro1.csv'
df = pd.read_csv(url) 

# reshape the dataframe
ct = pd.crosstab(df.countries, df.type)

# total medals per country, which is necessary to sort the bars
ct['tot'] = ct.sum(axis=1)

# sort
ct = ct.sort_values(by='tot', ascending=False)

# display(ct)
type         bronze  gold  silver  tot
countries                             
USA              33    39      41  113
China            18    38      32   88
ROC              23    20      28   71
GB               22    22      21   65
Japan            17    27      14   58
Australia        22    17       7   46
Italy            20    10      10   40
Germany          16    10      11   37
Netherlands      14    10      12   36
France           11    10      12   33
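
To see what the reshaping does without downloading the file, here is a small sketch on made-up rows in the same two-column layout. The names df_demo and ct_demo and the rows themselves are illustrative only, not taken from Libro1.csv.

```python
import pandas as pd

# Made-up rows in the same layout as the csv: one row per medal
df_demo = pd.DataFrame({
    'countries': ['USA', 'USA', 'USA', 'China', 'China', 'Japan'],
    'type': ['gold', 'gold', 'silver', 'gold', 'bronze', 'bronze'],
})

# One row per country, one column per medal type, cells hold the counts
ct_demo = pd.crosstab(df_demo['countries'], df_demo['type'])

# The medal columns come out in sorted order, not podium order
print(list(ct_demo.columns))

# Row totals, then sort so the country with the most medals comes first
ct_demo['tot'] = ct_demo.sum(axis=1)
ct_demo = ct_demo.sort_values(by='tot', ascending=False)
print(ct_demo)
```

Because the columns are sorted alphabetically, anything that depends on column position (colors, stacking order, which container is on top) should be checked against that order.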

Plot

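# map each medal column to its color by name; pd.crosstab sorts the columns
# alphabetically (bronze, gold, silver), so zipping them with a color tuple
# in podium order would swap the gold and silver colors
cd = {'bronze': '#CD7F32', 'silver': 'silver', 'gold': 'gold'}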

# plot the medals columns
title = 'Country Medal Count for Tokyo 2020'
ax = ct.iloc[:, :-1].plot(kind='bar', stacked=True, color=cd, title=title,
                          figsize=(12, 5), rot=0, width=1, ec='k' )

# annotate each container with individual values
for c in ax.containers:
    ax.bar_label(c, label_type='center')
    
# annotate the top containers with the cumulative sum
ax.bar_label(ax.containers[2], padding=3)

# pad the spacing between the number and the edge of the figure
ax.margins(y=0.1)
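
The difference between the 'center' and 'edge' labels can be checked on a tiny made-up table. This is only a sketch: the frame ct_demo and its values are invented, the Agg backend is selected so it runs without a display, and bar_label needs matplotlib 3.4 or later.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no window is needed
import pandas as pd

# Made-up counts; the first column is drawn at the bottom of each stack
ct_demo = pd.DataFrame({'bronze': [1, 2], 'silver': [2, 1], 'gold': [3, 1]},
                       index=['A', 'B'])

ax = ct_demo.plot(kind='bar', stacked=True, rot=0)

# One BarContainer per plotted column, in plotting order
print(len(ax.containers))

# 'center' labels show the height of each individual segment
center = ax.bar_label(ax.containers[0], label_type='center')

# The default 'edge' labels on the last container show the top of the stack,
# which is the cumulative sum for that bar
edge = ax.bar_label(ax.containers[-1], padding=3)

print([t.get_text() for t in center])
print([t.get_text() for t in edge])
```

Using ax.containers[-1] instead of a fixed index keeps the totals on the top segment if the number of plotted columns changes.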

  • Another way to annotate the top with the sum is to use the 'tot' column for custom labels, but as shown, this is not necessary.
labels = ct.tot.tolist()
ax.bar_label(ax.containers[2], labels=labels, padding=3)
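
The same labels parameter can also be used to skip annotations for small values, as mentioned in the notes above. A minimal sketch, assuming a made-up cutoff min_value and made-up bar heights:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

min_value = 5  # hypothetical cutoff: bars below this get no label

fig, ax = plt.subplots()
bars = ax.bar(['A', 'B', 'C'], [12, 3, 7])

# Build one label per bar, using an empty string for bars under the cutoff
labels = [f'{float(v):g}' if v >= min_value else '' for v in bars.datavalues]
texts = ax.bar_label(bars, labels=labels, label_type='center')

print(labels)
```

For the stacked plot, the same list comprehension would go inside the loop over ax.containers, using each container's datavalues.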