What is going wrong with this stacked bar plot?

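I really don't understand what is going on here... I have looked over my very simple data several times and restarted the kernel (running in Jupyter Notebook), but nothing seems to fix it.

Here is my data frame (sorry, I know the numbers look a bit silly; this is a very sparse dataset covering a long period, and the original data was reindexed over 20 years):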

DATE        NODP            NVP             VP              VDP
03/08/2002  0.083623        0.10400659      0.81235517      1.52458E-05
14/09/2003  0.24669167      0.24806379      0.5052293       1.52458E-05
26/07/2005  0.15553726      0.13324796      0.7111538       0.000060983
20/05/2006  0               0.23            0.315           0.455
05/06/2007  0.21280034      0.29139224      0.49579217      1.52458E-05
21/02/2010  0               0.55502195      0.4449628       1.52458E-05
09/04/2011  0.09531311      0.17514162      0.72954527      0
14/02/2012  0.19213217      0.12866237      0.67920546      0
17/01/2014  0.12438848      0.10297326      0.77263826      0
24/02/2017  0.01541347      0.09897548      0.88561105      0

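Note that every row adds up to 1! I have triple- and quadruple-checked this... XD

I am trying to produce a stacked bar plot of this data with the following code, which has worked perfectly well for everything else I have used it for: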

NODP = df['NODP']
NVP = df['NVP']
VDP = df['VDP']
VP = df['VP']
ind = np.arange(len(df.index))
width = 5.0

p1 = plt.bar(ind, NODP, width, label = 'NODP', bottom=NVP, color= 'grey')
p2 = plt.bar(ind, NVP, width, label = 'NVP', bottom=VDP, color= 'tan')
p3 = plt.bar(ind, VDP, width, label = 'VDP', bottom=VP, color= 'darkorange')
p4 = plt.bar(ind, VP, width, label = 'VP', color= 'darkgreen')
plt.ylabel('Ratio')
plt.xlabel('Year')
plt.title('Ratio change',x=0.06,y=0.8)
plt.xticks(np.arange(min(ind), max(ind)+1, 6.0), labels=xlabels) #the xticks were cumbersome so not included in this example code
plt.legend()

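This gives me the following plot:

Clearly, 1) NODP does not show up at all, and 2) the rest are plotted in the wrong proportions...

I really don't understand what is wrong; this should be simple, right?! Sorry if it is really trivial, it is probably right under my nose. Any ideas are much appreciated!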

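If you want to create a stacked bar plot this way (that is, with plain matplotlib rather than the pandas or seaborn plotting helpers), the bottom of each bar needs to be the sum of all the bars below it. In the question's code each bottom is only the single neighbouring series, so the bars overlap and hide one another instead of stacking.

Here is an example with the given data.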

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

columns = ['DATE', 'NODP', 'NVP', 'VP', 'VDP']
data = [['03/08/2002', 0.083623, 0.10400659, 0.81235517, 1.52458E-05],
        ['14/09/2003', 0.24669167, 0.24806379, 0.5052293, 1.52458E-05],
        ['26/07/2005', 0.15553726, 0.13324796, 0.7111538, 0.000060983],
        ['20/05/2006', 0, 0.23, 0.315, 0.455],
        ['05/06/2007', 0.21280034, 0.29139224, 0.49579217, 1.52458E-05],
        ['21/02/2010', 0, 0.55502195, 0.4449628, 1.52458E-05],
        ['09/04/2011', 0.09531311, 0.17514162, 0.72954527, 0],
        ['14/02/2012', 0.19213217, 0.12866237, 0.67920546, 0],
        ['17/01/2014', 0.12438848, 0.10297326, 0.77263826, 0],
        ['24/02/2017', 0.01541347, 0.09897548, 0.88561105, 0]]
df = pd.DataFrame(data=data, columns=columns)
ind = pd.to_datetime(df.DATE, dayfirst=True)  # the dates are written day/month/year
NODP = df.NODP.to_numpy()
NVP = df.NVP.to_numpy()
VP = df.VP.to_numpy()
VDP = df.VDP.to_numpy()

width = 120  # bar width in days, because the x positions are dates
p1 = plt.bar(ind, NODP, width, label='NODP', bottom=NVP+VDP+VP, color='grey')
p2 = plt.bar(ind, NVP, width, label='NVP', bottom=VDP+VP, color='tan')
p3 = plt.bar(ind, VDP, width, label='VDP', bottom=VP, color='darkorange')
p4 = plt.bar(ind, VP, width, label='VP', color='darkgreen')
plt.ylabel('Ratio')
plt.xlabel('Year')
plt.title('Ratio change')
plt.yticks(np.arange(0, 1.001, 0.1))
plt.legend()
plt.show()
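
The "sum of everything below" rule can also be written as a running total, which avoids spelling out each bottom by hand. The following is a minimal sketch rather than part of the original answer: the helper name stacked_bars is hypothetical, and only three rows of the data are used to keep it short.

```python
import numpy as np
from matplotlib import pyplot as plt


def stacked_bars(ax, x, series, width=0.8, colors=None):
    """Draw a dict of label -> values as stacked bars (hypothetical helper).

    The first entry of `series` ends up at the bottom of the stack.
    Returns the bar containers and the final stack heights.
    """
    bottom = np.zeros(len(x))  # running total of everything drawn so far
    containers = []
    for i, (label, values) in enumerate(series.items()):
        values = np.asarray(values, dtype=float)
        color = None if colors is None else colors[i]
        containers.append(
            ax.bar(x, values, width, bottom=bottom, label=label, color=color)
        )
        bottom = bottom + values  # the next series starts where this one ends
    return containers, bottom


# Rows for 03/08/2002, 14/09/2003 and 20/05/2006, listed bottom to top
series = {
    'VP': [0.81235517, 0.5052293, 0.315],
    'VDP': [1.52458e-05, 1.52458e-05, 0.455],
    'NVP': [0.10400659, 0.24806379, 0.23],
    'NODP': [0.083623, 0.24669167, 0.0],
}
fig, ax = plt.subplots()
containers, totals = stacked_bars(
    ax, np.arange(3), series,
    colors=['darkgreen', 'darkorange', 'tan', 'grey'],
)
ax.legend()
```

Calling plt.show() afterwards displays the figure. The returned totals should all be close to 1 for this data, which is a quick way to confirm that the bars really stack to the full height.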

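Note that in this case the x-axis is measured in days and each bar sits at its date. This helps to show the relative spacing between the dates, if that matters. If it does not, you can use equally spaced x positions and label them from the date column.

To do that with plain matplotlib, the following lines change: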

ind = range(len(df))
width = 0.8
plt.xticks(ind, df.DATE, rotation=20)
plt.tight_layout() # needed to show the full labels of the x-axis

Plotting with the DataFrame

# using your data above
df.DATE = pd.to_datetime(df.DATE, dayfirst=True)  # the dates are written day/month/year
df.set_index('DATE', inplace=True)

ax = df.plot(stacked=True, kind='bar', figsize=(12, 8))
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.)

# sets the tick labels so time isn't included
ax.xaxis.set_major_formatter(plt.FixedFormatter(df.index.to_series().dt.strftime("%Y-%m-%d")))

plt.show()

Adding labels for clarity

  • You can add text annotations to the bars by inserting the following code before plt.show():
# .patches is everything inside of the chart
for rect in ax.patches:
    # Find where everything is located
    height = rect.get_height()
    width = rect.get_width()
    x = rect.get_x()
    y = rect.get_y()

    # For vertical bars, the height is the data value and can be used as the label
    label_text = f'{height:.2f}'  # formatted to two decimal places

    label_x = x + width - 0.125
    label_y = y + height / 2

    # skip the label if the value is effectively 0
    if height > 0.001:
        ax.text(label_x, label_y, label_text, ha='right', va='center', fontsize=8)

plt.show()
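
As an alternative to looping over ax.patches, newer matplotlib versions can place the labels for you. This is a sketch that assumes matplotlib 3.4 or later, where Axes.bar_label and BarContainer.datavalues are available; the two-row DataFrame is just a cut-down copy of the data above.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Two rows of the data above, indexed by date (day/month/year)
df = pd.DataFrame(
    {
        'NODP': [0.083623, 0.0],
        'NVP': [0.10400659, 0.23],
        'VP': [0.81235517, 0.315],
        'VDP': [1.52458e-05, 0.455],
    },
    index=pd.to_datetime(['03/08/2002', '20/05/2006'], dayfirst=True),
)

ax = df.plot(stacked=True, kind='bar')

texts = []
# Each column becomes one BarContainer holding that column's bars
for container in ax.containers:
    # Blank out labels for segments that are effectively 0
    labels = [f'{v:.2f}' if v > 0.001 else '' for v in container.datavalues]
    texts.extend(
        ax.bar_label(container, labels=labels, label_type='center', fontsize=8)
    )
```

Here label_type='center' puts each label in the middle of its segment, so no manual x and y arithmetic is needed. Call plt.show() afterwards to display the figure.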