How to annotate a stackplot or area plot

I am trying to plot an area chart with these values.

y1=[26.8,24.97,25.69,24.07]
y2=[21.74,19.58,20.7,21.09]
y3=[13.1,12.45,12.75,10.79]
y4=[9.38,8.18,8.79,6.75]
y5=[12.1,10.13,10.76,8.03]
y6=[4.33,3.73,3.78,3.75]

df = pd.DataFrame([y1,y2,y3,y4,y5,y6])

cumsum = df.cumsum()
cumsum

I can draw the area part, but I do not know how to add the specific numbers to the plot.

labels = ["Medical", "Surgical", "Physician Services", "Newborn", "Maternity", "Mental Health"]
x = [1,2,3,4]
years = [2011,2012,2013,2014]

fig, ax = plt.subplots()
plt.title("Overall, inpatient costs have decreased in 2011")
ax.stackplot(x, y1,y2,y3,y4,y5,y6, labels=labels, colors = sns.color_palette("Blues")[::-1])
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.spines['left'].set_visible(False)

plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left")
display()

This is the current output, but it does not match the desired output.

The output should look like this.

You can add the following snippet at the end of your code:

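# each column of df holds the six category values for one x position
# (DataFrame.iteritems was removed in pandas 2.0; items is the replacement)
for i, c in df.items():
    v2 = 0  # running total = top edge of the current band
    for v in c:
        v2 += v
        ax.text(i+1, v2, f'${v:.2f}')  # x = [1,2,3,4], so column i sits at i+1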

Output:

I changed these lines in your code:

fig, ax = plt.subplots(figsize=(10,7))
ax.stackplot(years, y1,y2,y3,y4,y5,y6, labels=labels, colors = sns.color_palette("Blues")[::-1])
plt.legend(bbox_to_anchor=(1.1, 1), loc="upper left")

Then add these lines to get the result you want:

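df2 = df.cumsum()  # cumulative sum down each column = top edge of each band

# DataFrame.iteritems was removed in pandas 2.0; items is the replacement
for id_col, col in df2.items():
    prev_val = 0
    for val in col:
        # val - prev_val recovers the original (non-cumulative) value for the label
        ax.annotate('${}'.format(round(val - prev_val, 2)), xy=(years[id_col], val), weight='bold')
        prev_val = val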

plt.xticks(years)

Output:

Full code:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

y1=[26.8,24.97,25.69,24.07]
y2=[21.74,19.58,20.7,21.09]
y3=[13.1,12.45,12.75,10.79]
y4=[9.38,8.18,8.79,6.75]
y5=[12.1,10.13,10.76,8.03]
y6=[4.33,3.73,3.78,3.75]
labels = ["Medical", "Surgical", "Physician Services", 
          "Newborn", "Maternity", "Mental Health"]
years = [2011,2012,2013,2014]
fig, ax = plt.subplots(figsize=(10,7))
plt.title("Overall, inpatient costs have decreased in 2011", weight='bold')
ax.spines['right'].set_visible(False);ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False);ax.spines['left'].set_visible(False)
ax.stackplot(years, y1,y2,y3,y4,y5,y6, labels=labels, 
             colors = sns.color_palette("Blues")[::-1])

df2 = pd.DataFrame([y1,y2,y3,y4,y5,y6]).cumsum()
# DataFrame.iteritems was removed in pandas 2.0; items is the replacement
for id_col, col in df2.items():
    prev_val = 0
    for val in col:
        # The keyword for the label was `s` in older Matplotlib and is `text` in newer
        # releases; passing the string positionally works with both.
        ax.annotate('${}'.format(round(val - prev_val, 2)), xy=(years[id_col], val), weight='bold')
        prev_val = val

plt.xticks(years)
plt.xlabel('Year')
plt.ylabel('Cost (USD)')
plt.legend(bbox_to_anchor=(1.1, 1), loc="upper left")
plt.show()
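
The placement logic used in the answers above can be checked without drawing anything. The following is only a minimal sketch: `stack_labels` is a hypothetical helper name (it is not part of Matplotlib or pandas), and it assumes the series are listed from the bottom of the stack upward, as in the question.

```python
from itertools import accumulate

# Values from the question, one list per category (bottom of the stack first).
series = [
    [26.8, 24.97, 25.69, 24.07],   # Medical
    [21.74, 19.58, 20.7, 21.09],   # Surgical
    [13.1, 12.45, 12.75, 10.79],   # Physician Services
    [9.38, 8.18, 8.79, 6.75],      # Newborn
    [12.1, 10.13, 10.76, 8.03],    # Maternity
    [4.33, 3.73, 3.78, 3.75],      # Mental Health
]
years = [2011, 2012, 2013, 2014]


def stack_labels(series, xs):
    """Return (x, y_top, label) for every band at every x position.

    Hypothetical helper for illustration only.
    """
    placements = []
    # zip(*series) regroups the data so each tuple holds one year's values.
    for x, column in zip(xs, zip(*series)):
        tops = accumulate(column)  # running total = upper edge of each band
        for value, top in zip(column, tops):
            placements.append((x, top, f"${value:.2f}"))
    return placements


placements = stack_labels(series, years)
```

Each tuple can then be passed to `ax.text(x, y_top, label)` or `ax.annotate(label, xy=(x, y_top))` on the axes that hold the stackplot.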
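  • Since a DataFrame is already available, use pandas.DataFrame.plot with kind='area'.
    • However, the DataFrame needs to be constructed as shown below.
  • This question is very similar to an existing one.
  • To place the annotations correctly, the cumulative sum of the values at each x tick must be used as the y position. Either .annotate or .text can be used for the annotation.
    • ax.annotate(text=f'${a:0.2f}', xy=(x, cs.iloc[i]))
    • ax.text(x=x, y=cs.iloc[i], s=f'${a:0.2f}')
  • Tested in python 3.8.11, pandas 1.3.3, matplotlib 3.4.3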
import pandas as pd
import seaborn as sns  # needed for sns.color_palette in the plot call

# create the DataFrame
values = [y1, y2, y3, y4, y5, y6]
labels = ["Medical", "Surgical", "Physician Services", "Newborn", "Maternity", "Mental Health"]
years = [2011, 2012, 2013, 2014]
data = dict(zip(labels, values))
df = pd.DataFrame(data=data, index=years)

# display(df)
      Medical  Surgical  Physician Services  Newborn  Maternity  Mental Health
2011    26.80     21.74               13.10     9.38      12.10           4.33
2012    24.97     19.58               12.45     8.18      10.13           3.73
2013    25.69     20.70               12.75     8.79      10.76           3.78
2014    24.07     21.09               10.79     6.75       8.03           3.75

# plot
ax = df.plot(kind='area', xticks=df.index, title='Overall, inpatient costs have decreased in 2011',
             color=sns.color_palette("Blues")[::-1], figsize=(10, 6), ylabel='Cost (USD)')
ax.legend(bbox_to_anchor=(1.07, 1.02), loc='upper left')  # move the legend
ax.set_frame_on(False)  # remove all the spines
ax.tick_params(left=False)  # remove the y tick marks
ax.set_yticklabels([])  # remove the y labels
ax.margins(x=0, y=0)  # remove the margin spacing

# annotate
for x, v in df.iterrows():
    cs = v.cumsum()[::-1]  # get the cumulative sum of the row and reverse it to provide the correct y position
    for i, a in enumerate(v[::-1]):  # reverse the row values for the correct annotation
        # use .iloc for positional access; cs is indexed by the column labels
        ax.annotate(text=f'${a:0.2f}', xy=(x, cs.iloc[i]))
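
Anchoring each label at the cumulative sum puts it on the upper edge of its band. If a label centered inside the band is preferred, one option is to subtract half of the band's value from its top. This is a sketch of that variation, not something taken from the answer above; `band_midpoints` is a hypothetical helper name.

```python
from itertools import accumulate


def band_midpoints(column):
    """Return the vertical center of each stacked band at one x position.

    `column` lists the values from the bottom of the stack upward.
    Hypothetical helper for illustration only.
    """
    tops = accumulate(column)  # upper edge of each band
    # The center of a band lies half of its own height below its upper edge.
    return [top - value / 2 for value, top in zip(column, tops)]


# The 2011 values from the question, bottom of the stack first.
column_2011 = [26.8, 21.74, 13.1, 9.38, 12.1, 4.33]
mids = band_midpoints(column_2011)
```

The resulting y values could be used as `ax.annotate(label, xy=(x, mid), va='center')`, where `va='center'` asks Matplotlib to center the text vertically on that point.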

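  • I think a stacked bar plot presents the data more clearly, because the data is discrete, not continuous. The lines in an area plot imply a continuous dataset.
    • See this answer for thorough details about using .bar_label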
ax = df.plot(kind='bar', stacked=True, color=sns.color_palette("Blues")[::-1], rot=0,
             title='Overall, inpatient costs have decreased in 2011', ylabel='Cost (USD)', figsize=(10, 6))
ax.legend(bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)
ax.set_frame_on(False)  # remove all the spines
ax.tick_params(left=False, bottom=False)  # remove the x and y tick marks
ax.set_yticklabels([])  # remove the y labels

for c in ax.containers:
    # customize the label to account for cases when there might not be a bar section
    # labels = [f'${h:0.2f}' if (h := v.get_height()) > 0 else '' for v in c]  # use this line with python >= 3.8
    labels = [f'${v.get_height():0.2f}' if v.get_height() > 0 else '' for v in c]

    # set the bar label
    ax.bar_label(c, labels=labels, label_type='center', fontsize=8)
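
The label-building expression in the loop above can be tried on plain numbers to see what it produces. In this small sketch `segment_labels` is a hypothetical helper name, and the heights are passed in directly instead of being read from bar patches with `get_height()`.

```python
def segment_labels(heights):
    """Format each bar-segment height as a dollar label.

    Segments with no height get an empty string so that no label is
    drawn for them. Hypothetical helper for illustration only.
    """
    return [f"${h:0.2f}" if h > 0 else "" for h in heights]


# A zero-height segment yields an empty label.
example = segment_labels([26.8, 0, 4.33])
```

A list built this way has the same shape as the `labels` argument passed to `ax.bar_label` in the loop above.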