Matplotlib bar chart: show x-ticks only at non-zero bars

Question

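I have to make a (stacked) bar chart with about 3000 positions on the x-axis. Many of these positions contain no bar, yet they are still labeled on the x-axis, which makes the plot hard to read. Is there a way to show x-ticks only for the existing (stacked) bars? The spacing between the bars, based on the x-tick values, must be preserved. How can I solve this in matplotlib? Is there a more suitable plot type than a stacked bar chart? I am building the plot from a pandas cross-table (pd.crosstab()).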

Link to an image of the plot: https://i.stack.imgur.com/qk99z.png

An example of my data frame (thanks to gepcel):

import pandas as pd
import numpy as np
N = 3200
df = pd.DataFrame(np.random.randint(1, 5, size=(N, 3)))
df.loc[np.random.choice(df.index, size=3190, replace=False), :] = 0
df_select = df[df.sum(axis=1)>0]

Answer 1

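Basically, with no example at hand, you should select the ticks where the total (that is, stacked) value is greater than zero, and then set the xticks and xticklabels manually.

Suppose you have a data frame like the following: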

import pandas as pd
import numpy as np
N = 3200
df = pd.DataFrame(np.random.randint(1, 5, size=(N, 3)))
df.loc[np.random.choice(df.index, size=3190, replace=False), :] = 0

Then the selected data should be:

df_select = df[df.sum(axis=1)>0]

Then you can draw the stacked bar chart, for example:

import matplotlib.pyplot as plt

# width=20 keeps the bars from being too thin to see
plt.bar(df_select.index, df_select[0], width=20, label='0')
plt.bar(df_select.index, df_select[1], width=20, label='1',
        bottom=df_select[0])
plt.bar(df_select.index, df_select[2], width=20, label='2',
        bottom=df_select[0]+df_select[1])
# Show ticks only at the selected positions. It gets a little tricky
# if you want the tick labels to differ from the tick positions,
# and it is still hard to keep the tick labels from overlapping.
plt.xticks(df_select.index)
plt.legend()
plt.show()

The result should look like this: [image]
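The comments in the code above note that tick labels which differ from the tick positions are a little tricky, and that overlapping is hard to avoid. The following is a minimal sketch of one way to handle both, not part of the original answer: it keeps the ticks at the true x positions, assigns separate label strings with set_xticklabels, and rotates them to reduce overlap. The seeded random generator and the "pos " label prefix are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same kind of sparse data as above, seeded so the sketch is reproducible.
rng = np.random.default_rng(0)
N = 3200
df = pd.DataFrame(rng.integers(1, 5, size=(N, 3)))
df.loc[rng.choice(df.index, size=N - 10, replace=False), :] = 0
df_select = df[df.sum(axis=1) > 0]

fig, ax = plt.subplots(figsize=(10, 4))
bottom = np.zeros(len(df_select))
for col in df_select.columns:
    ax.bar(df_select.index, df_select[col], width=20,
           bottom=bottom, label=str(col))
    bottom += df_select[col].to_numpy()  # running total for the next layer

# Ticks stay at the real x positions; the labels are arbitrary strings.
# The "pos " prefix is hypothetical, chosen only for this example.
ax.set_xticks(df_select.index)
ax.set_xticklabels([f"pos {i}" for i in df_select.index],
                   rotation=90, fontsize=8)
ax.legend()
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. Rotation only reduces overlap; if two non-zero positions are very close together, their labels can still collide.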

Update:

It is easy to put text on top of the bars like this:

for n, row in df_select.iterrows():
    plt.text(n, row.sum()+0.2, n, ha='center', rotation=90, va='bottom')

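This computes the position of the top of each bar and places the text there, possibly with a small offset (such as +0.2), using rotation=90 to control the rotation. The complete code is: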

import matplotlib.pyplot as plt

df_select = df[df.sum(axis=1)>0]
plt.bar(df_select.index, df_select[0], width=20, label='0')
plt.bar(df_select.index, df_select[1], width=20, label='1',
        bottom=df_select[0])
plt.bar(df_select.index, df_select[2], width=20, label='2',
        bottom=df_select[0]+df_select[1])

# Here is the part to put text:
for n, row in df_select.iterrows():
    plt.text(n, row.sum()+0.2, n, ha='center', rotation=90, va='bottom')

plt.xticks(df_select.index)
plt.legend()
plt.show()

Result: [image]
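As a possible alternative to computing the text position by hand, newer matplotlib versions provide Axes.bar_label, which attaches labels to the bars of a bar container. The sketch below is not from the original answer and assumes matplotlib 3.4 or later. It labels only the topmost segment of each stack with its x position. The seeded data and the padding and margin values are illustrative choices.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sparse example data, seeded for reproducibility (illustrative only).
rng = np.random.default_rng(1)
N = 3200
df = pd.DataFrame(rng.integers(1, 5, size=(N, 3)))
df.loc[rng.choice(df.index, size=N - 10, replace=False), :] = 0
df_select = df[df.sum(axis=1) > 0]

fig, ax = plt.subplots(figsize=(10, 4))
bottom = np.zeros(len(df_select))
container = None
for col in df_select.columns:
    # After the loop, `container` holds the bars of the last (topmost) layer.
    container = ax.bar(df_select.index, df_select[col], width=20,
                       bottom=bottom, label=str(col))
    bottom += df_select[col].to_numpy()

# Label the top segment of every stack with its x position.
# Requires matplotlib >= 3.4; padding is measured in points.
texts = ax.bar_label(container, labels=[str(i) for i in df_select.index],
                     padding=3, rotation=90)

ax.set_xticks(df_select.index)
ax.margins(y=0.2)  # extra headroom so the rotated labels are not clipped
ax.legend()
```

Compared with plt.text, the offset is given in points rather than in data units, so it does not depend on the scale of the y-axis.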

Answer 2

This is gepcel's answer, adapted to work with data frames that have a varying number of columns:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# in this case I'm creating the dataframe with 3 columns
# but the code is meant to adapt to dataframes with varying column numbers
df = pd.DataFrame(np.random.randint(1, 5, size=(3200, 3)))
df.loc[np.random.choice(df.index, size=3190, replace=False), :] = 0

# keep only the rows whose stacked total is non-zero
df_select = df[df.sum(axis=1) > 0]
fig, ax = plt.subplots()

ax.bar(df_select.index, df_select.iloc[:, 0], label=df_select.columns[0])

# stack each further column on the sum of all columns before it
for i in range(1, df_select.shape[1]):
    bottom = df_select.iloc[:, :i].sum(axis=1)
    ax.bar(df_select.index, df_select.iloc[:, i], bottom=bottom,
           label=df_select.columns[i])

ax.set_xticks(df_select.index)
plt.legend(loc='best', bbox_to_anchor=(1, 0.5))
plt.xticks(rotation=90, fontsize=8)
plt.show()
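The loop above recomputes each bottom by summing all preceding columns. An equivalent approach keeps a running total instead. The following sketch wraps that idea in a hypothetical helper, stacked_bars (the name and signature are assumptions, not from either answer), and runs it on a tiny hand-made frame so the stacking can be checked by eye.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def stacked_bars(ax, frame, width=20):
    """Draw one stacked bar per non-zero row of `frame`; return the rows drawn."""
    selected = frame[frame.sum(axis=1) > 0]
    bottom = np.zeros(len(selected))
    for col in selected.columns:
        heights = selected[col].to_numpy()
        ax.bar(selected.index, heights, width=width,
               bottom=bottom, label=str(col))
        bottom = bottom + heights  # running total becomes the next bottom
    # Ticks only where a bar was actually drawn.
    ax.set_xticks(selected.index)
    ax.tick_params(axis="x", labelrotation=90, labelsize=8)
    return selected


# Tiny hand-made frame: only rows 2 and 7 contain data.
demo = pd.DataFrame(0, index=range(10), columns=["a", "b", "c", "d"])
demo.loc[2] = [1, 2, 3, 4]
demo.loc[7] = [4, 3, 2, 1]

fig, ax = plt.subplots()
drawn = stacked_bars(ax, demo, width=0.8)
ax.legend(loc="center left", bbox_to_anchor=(1, 0.5))
```

Here both stacks reach a height of 10 and the only ticks are at x = 2 and x = 7. For the 3200-position data frame above, a larger width such as 20 keeps the bars visible.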
