Aligning line and bar charts in Python pandas

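I am trying to plot a line chart on top of a bar chart using Python's pandas library. The bar chart is a clustered chart built from a DataFrame, and plotted on its own it looks like this: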

Clustered Bar Chart

The line chart is the mean across the whole dataset:

Line Chart

If I plot everything as line charts, I can successfully draw both sets of data on the same axes object:

All Line Charts

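But when I switch the clustered DataFrame to an actual bar chart and plot it together with the line chart, the line starts at the second index position, so it is offset from the bars.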

Bar and Line With Offset

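Another answer makes an interesting comment about how matplotlib handles the x-axis for bar charts versus line charts, which may be relevant here, but I don't know what to do with that insight.

What is a good way to fix this alignment problem?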

Code to reproduce:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

table_data = {2019: {1: np.nan,
  2: np.nan,
  3: np.nan,
  4: 1.4200000000000002,
  5: 0.8193548387096775,
  6: 1.420689655172414,
  7: 0.4645161290322581,
  8: 0.10322580645161289,
  9: 0.29333333333333333,
  10: 1.7741935483870968,
  11: 0.32,
  12: 6.703225806451614},
 2020: {1: 5.52,
  2: 12.613793103448277,
  3: 0.9428571428571428,
  4: 0.1793103448275862,
  5: 0.39354838709677414,
  6: 1.3866666666666667,
  7: 1.9800000000000002,
  8: 0.6689655172413793,
  9: 0.19333333333333336,
  10: 5.896774193548388,
  11: 0.6896551724137931,
  12: 4.103225806451613},
 2021: {1: 2.7935483870967746,
  2: 5.15,
  3: 9.696774193548388,
  4: 3.74,
  5: 2.8967741935483873,
  6: 0.9103448275862069,
  7: 1.6516129032258065,
  8: 0.3,
  9: 0.38571428571428573,
  10: 5.141935483870968,
  11: 8.58,
  12: 6.052173913043479},
 2022: {1: 2.3923076923076922,
  2: 31.678571428571427,
  3: 8.761290322580646,
  4: np.nan,
  5: np.nan,
  6: np.nan,
  7: np.nan,
  8: np.nan,
  9: np.nan,
  10: np.nan,
  11: np.nan,
  12: np.nan}}

means = {1: 3.6137931034482755,
 2: 16.435294117647057,
 3: 7.132530120481928,
 4: 1.797752808988764,
 5: 1.3698924731182796,
 6: 1.240909090909091,
 7: 1.358695652173913,
 8: 0.3522727272727273,
 9: 0.28863636363636364,
 10: 4.2709677419354835,
 11: 3.2247191011235956,
 12: 5.578823529411765}

df_bars = pd.DataFrame(table_data)
df_means = pd.DataFrame.from_dict(means, orient = 'index', columns = ['Mean'])

# Clustered bar chart by itself
df_bars.plot(kind = 'bar',
           title = 'Average Daily Rainfall by Month',
           ylabel = 'Average Daily Rainfall (mm)',
           figsize = (10, 6)
          )

# Line chart by itself
df_means.plot(
    kind = 'line',
    title = 'Average Daily Rainfall by Month',
    ylabel = 'Average Daily Rainfall (mm)',
    y = 'Mean'
)

# Show all data as line charts. This works OK
ax_avg = df_bars.plot(kind = 'line',
           title = 'Average Daily Rainfall by Month',
           ylabel = 'Average Daily Rainfall (mm)',
           figsize = (10, 6)
          )

df_means.plot(
    ax = ax_avg,
    kind = 'line',
    y = 'Mean'
)
plt.show()

# Show bar data and line chart on the one plot. The line chart is offset!
ax_avg2 = df_bars.plot(kind = 'bar',
           title = 'Average Daily Rainfall by Month',
           ylabel = 'Average Daily Rainfall (mm)',
           figsize = (10, 6)
          )

df_means.plot(
    ax = ax_avg2,
    kind = 'line',
    y = 'Mean'
)
plt.show()

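You can use reset_index to change the index of the line DataFrame back to a zero-based one. pandas draws the bar groups at integer positions 0 to 11 and uses the index values (1 to 12) only as tick labels, whereas the line is drawn at its actual index values, which is why it ends up shifted by one position.

With a zero-based index, the line matches the bar positions, as follows: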

ax_avg2 = df_bars.plot(kind = 'bar',
           title = 'Average Daily Rainfall by Month',
           ylabel = 'Average Daily Rainfall (mm)',
           figsize = (10, 6)
          )

df_means.reset_index().plot(
    ax = ax_avg2,
    kind = 'line',
    y = 'Mean'
)
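
To see where the offset comes from, it can help to inspect the tick positions of the bar axes directly. The following is only a minimal sketch with small made-up data (the column names `a` and `b` and the values are hypothetical, not the rainfall data above). It also shows an alternative to reset_index: plotting the line with matplotlib's own `ax.plot` against the positions 0 to n-1.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data with a 1-based index, like the month index in the question.
df_bars = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [2.0, 1.0, 4.0]}, index=[1, 2, 3])
df_means = pd.DataFrame({"Mean": [1.5, 1.5, 3.5]}, index=[1, 2, 3])

ax = df_bars.plot(kind="bar")

# The bar groups sit at positions 0..n-1; the index values are used only as tick labels.
tick_positions = list(ax.get_xticks())
print(tick_positions)

# Draw the line against those same positions rather than the index values 1..n.
(mean_line,) = ax.plot(
    range(len(df_means)),
    df_means["Mean"].to_numpy(),
    color="black",
    marker="o",
    label="Mean",
)
ax.legend()
# plt.show()  # uncomment to display the figure
```

Both approaches do the same thing: they put the line's x-values on the bars' zero-based positions, while the tick labels still show the original index.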

Output: