How to create a yearly bar plot grouped by months

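I'm having trouble creating a bar plot from a DataFrame grouped by year and month. With the code below, I'm trying to draw the data on the figure I created, but a second figure is returned instead. I'd also like to move the legend to the right and change its entries to the corresponding month names.

I'm starting to get a feel for the DataFrames produced by groupby, but I'm not getting the result I expected, so I'm asking here.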

import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns

df = pd.read_csv('fcc-forum-pageviews.csv', index_col='date')
line_plot = df.value[(df.value > df.value.quantile(0.025)) & (df.value < df.value.quantile(0.975))]
fig, ax = plt.subplots(figsize=(10,10))
bar_plot = line_plot.groupby([line_plot.index.year, line_plot.index.month]).mean().unstack()
bar_plot.plot(kind='bar')
ax.set_xlabel('Years')
ax.set_ylabel('Average Page Views')
plt.show()

This is the format of the data I'm analyzing.

date,value
2016-05-09,1201
2016-05-10,2329
2016-05-11,1716
2016-05-12,10539
2016-05-13,6933

Just pass the ax you defined to pandas:

bar_plot.plot(ax = ax, kind='bar')
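A minimal sketch of why passing ax matters, assuming a tiny made-up frame (demo and its values are invented for illustration, not taken from the question's data). When DataFrame.plot is given an existing Axes it draws on it and returns that same object, so no second figure should appear:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
from matplotlib import pyplot as plt
import pandas as pd

# Made-up data shaped like the unstacked result: years as rows, month numbers as columns
demo = pd.DataFrame({5: [100.0, 150.0], 6: [120.0, 130.0]}, index=[2016, 2017])

fig, ax = plt.subplots(figsize=(6, 4))
figures_before = len(plt.get_fignums())

# Draw on the Axes created above instead of letting pandas open a new figure
returned_ax = demo.plot(ax=ax, kind='bar')

figures_after = len(plt.get_fignums())
same_axes = returned_ax is ax
```

Comparing figures_before with figures_after is just a quick way to check that the plot call did not open another figure.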

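If you also want to replace the month numbers with names, get the legend labels, replace the numbers with names, and redefine the legend by passing it the new labels: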

handles, labels = ax.get_legend_handles_labels()
new_labels = [datetime.date(1900, int(monthinteger), 1).strftime('%B') for monthinteger in labels]
ax.legend(handles = handles, labels = new_labels, loc = 'upper left', bbox_to_anchor = (1.02, 1))
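Note that the snippet above needs import datetime. As a possible alternative, the same number-to-name mapping can be sketched with calendar.month_name from the standard library. The labels list here is a hypothetical stand-in for what ax.get_legend_handles_labels() returns (the labels arrive as strings), and the names depend on the active locale:

```python
from calendar import month_name

# Hypothetical legend labels; matplotlib hands them back as strings
labels = ['1', '5', '12']

# month_name[0] is an empty string, so month numbers index the names directly
new_labels = [month_name[int(label)] for label in labels]
```

The resulting list can be passed to ax.legend(handles=handles, labels=new_labels, ...) exactly as before.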

Full code

import pandas as pd
from matplotlib import pyplot as plt
import datetime

df = pd.read_csv('fcc-forum-pageviews.csv')
df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date')
line_plot = df.value[(df.value > df.value.quantile(0.025)) & (df.value < df.value.quantile(0.975))]

fig, ax = plt.subplots(figsize=(10,10))
bar_plot = line_plot.groupby([line_plot.index.year, line_plot.index.month]).mean().unstack()
bar_plot.plot(ax = ax, kind='bar')
ax.set_xlabel('Years')
ax.set_ylabel('Average Page Views')

handles, labels = ax.get_legend_handles_labels()
new_labels = [datetime.date(1900, int(monthinteger), 1).strftime('%B') for monthinteger in labels]
ax.legend(handles = handles, labels = new_labels, loc = 'upper left', bbox_to_anchor = (1.02, 1))

plt.show()

(plot generated with fake data)
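The full code converts 'date' with pd.to_datetime because .index.year and .index.month only exist on a DatetimeIndex. A sketch of doing the same conversion at read time with the parse_dates argument of read_csv, using the five sample rows from the question in place of the real file (the names csv_text, years, months and monthly_mean are made up for this example):

```python
import io
import pandas as pd

# The five sample rows from the question, standing in for fcc-forum-pageviews.csv
csv_text = """date,value
2016-05-09,1201
2016-05-10,2329
2016-05-11,1716
2016-05-12,10539
2016-05-13,6933
"""

# parse_dates converts the column while reading, so the index is a DatetimeIndex
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'], index_col='date')

# Give the two grouping keys distinct names so the row and column labels are unambiguous
years = df.index.year.rename('year')
months = df.index.month.rename('month')

# Mean per (year, month), with months moved to columns
monthly_mean = df.value.groupby([years, months]).mean().unstack()
```

With only these five rows, monthly_mean has a single cell for May 2016.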

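  1. Add a sorted categorical 'months' column with pd.Categorical.
  2. Transform the DataFrame to a wide format with pd.pivot_table, where aggfunc='mean' is the default.
    • Wide format is typically best for plotting grouped bars.
  3. pandas.DataFrame.plot returns a matplotlib.axes.Axes, so there is no need for fig, ax = plt.subplots(figsize=(10,10)).
  4. The pandas .dt accessor is used to extract components of 'date', which must be a datetime dtype.
    • If 'date' is not a datetime dtype, convert it with df.date = pd.to_datetime(df.date).
  5. Tested with python 3.8.11, pandas 1.3.1, and matplotlib 3.4.2.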
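The first two points can be sketched on a tiny made-up frame (long_df and its four rows are invented for illustration). The ordered categorical keeps the columns in calendar order rather than alphabetical order, and pivot_table averages the values in each cell by default. Passing observed=True is an addition of this sketch, intended to keep only the months that actually occur:

```python
import pandas as pd
from calendar import month_name

# Made-up long-format data: two January 2021 rows, one February 2021, one January 2022
long_df = pd.DataFrame({
    'date': pd.to_datetime(['2021-01-05', '2021-01-20', '2021-02-10', '2022-01-15']),
    'value': [100, 300, 50, 80],
})

# Ordered categorical so months sort January..December, not alphabetically
months = month_name[1:]
long_df['months'] = pd.Categorical(long_df.date.dt.strftime('%B'), categories=months, ordered=True)

# Long to wide: one row per year, one column per month, mean of 'value' in each cell
wide = pd.pivot_table(long_df, index=long_df.date.dt.year, columns='months',
                      values='value', observed=True)
```

Here the two January 2021 rows (100 and 300) collapse into a single cell holding their mean, and February 2022 is NaN because no row exists for it.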

Imports and test data

import pandas as pd
from calendar import month_name  # conveniently supplies a list of sorted month names or you can type them out manually
import numpy as np  # for test data

# test data and dataframe
np.random.seed(365)
rows = 365 * 3
data = {'date': pd.bdate_range('2021-01-01', freq='D', periods=rows), 'value': np.random.randint(100, 1001, size=(rows))}
df = pd.DataFrame(data)

# select data within specified quantiles
df = df[df.value.gt(df.value.quantile(0.025)) & df.value.lt(df.value.quantile(0.975))]

# display(df.head())
        date  value
0 2021-01-01    694
1 2021-01-02    792
2 2021-01-03    901
3 2021-01-04    959
4 2021-01-05    528

Transform and plot

  • If 'date' has been set as the index, as mentioned in the comments, use the following instead:
    • df['months'] = pd.Categorical(df.index.strftime('%B'), categories=months, ordered=True)
# create the month column
months = month_name[1:]
df['months'] = pd.Categorical(df.date.dt.strftime('%B'), categories=months, ordered=True)

# pivot the dataframe into the correct shape
dfp = pd.pivot_table(data=df, index=df.date.dt.year, columns='months', values='value')

# display(dfp.head())
months  January  February  March  April    May   June   July  August  September  October  November  December
date                                                                                                        
2021      637.9     595.7  569.8  508.3  589.4  557.7  508.2   545.7      560.3    526.2     577.1     546.8
2022      567.9     521.5  625.5  469.8  582.6  627.3  630.4   474.0      544.1    609.6     526.6     572.1
2023      521.1     548.5  484.0  528.2  473.3  547.7  525.3   522.4      424.7    561.3     513.9     602.3

# plot
ax = dfp.plot(kind='bar', figsize=(12, 4), ylabel='Mean Page Views', xlabel='Year', rot=0)
_ = ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')
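For the case where 'date' is the index, a sketch of the same pipeline end to end. The test data is regenerated here with a different random generator and a two-year range, so the numbers will not match the table above; observed=True is again an addition of this sketch:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
from calendar import month_name

# Made-up test data with 'date' as the index: 730 days covering 2021 and 2022
rng = np.random.default_rng(0)
idx = pd.date_range('2021-01-01', periods=730, freq='D', name='date')
df = pd.DataFrame({'value': rng.integers(100, 1001, size=len(idx))}, index=idx)

# Month names come from the index itself, so no .dt accessor is needed
months = month_name[1:]
df['months'] = pd.Categorical(df.index.strftime('%B'), categories=months, ordered=True)

# Years also come from the index
dfp = pd.pivot_table(data=df, index=df.index.year, columns='months',
                     values='value', observed=True)

ax = dfp.plot(kind='bar', figsize=(12, 4), ylabel='Mean Page Views', xlabel='Year', rot=0)
_ = ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')
```

The only differences from the column-based version are df.index.strftime in place of df.date.dt.strftime and df.index.year in place of df.date.dt.year.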