如何汇总数据并绘制所有组

How to aggregate data and plot all groups

我有以下数据集:

df=pd.read_csv('https://raw.githubusercontent.com/michalis0/DataMining_and_MachineLearning/master/data/sales.csv')

我想可视化按客户类型(Segment)的平均销售额。

我用这个计算了按细分市场的平均销售额:

average_sales = df.groupby(['Segment','OrderYear'],as_index=True)['Sales'].agg({"mean"})
print(average_sales)

我得到以下输出:

                           mean
Segment     OrderYear            
Consumer    2015       251.633302
            2016       238.200804
            2017       223.269145
            2018       200.469005
Corporate   2015       212.641424
            2016       189.902305
            2017       263.348456
            2018       243.634951
Home Office 2015       290.234240
            2016       222.101830
            2017       226.382196
            2018       242.532951

现在我想在折线图中绘制它,年份在 x_axis 上,平均销售额在 y_axis 上,但每次我尝试我只得到一条线 mean 而我想要每个 Segment 一行。 一行代表 'Consumer',一行代表 'Corporate',一行代表 'Home Office'。 我认为这可能是因为 Segment 是一个索引而不是一列,但我仍然无法按段绘制。

  • 最好用 .pivot_table 来塑造日期,这会导致数据框处于正确的形状以进行绘制。
    • 使用已有的groupby作图,需要拆栈:
      • df.groupby(['Segment','OrderYear'], as_index=True)['Sales'].agg({"mean"}).unstack(level=0).plot()
  • 直接用pandas.DataFrame.plot
  • 绘制数据帧pt
  • 此类聚合数据应以分组条形图的形式呈现,但也提供了线图。
    • 线条图越多越难以阅读
    • 斜率暗示的信息可能不完全正确
  • 如果您有兴趣添加注释,请参阅 or
  • 测试于 python 3.8.12pandas 1.3.4matplotlib 3.4.3
import pandas as pd

df=pd.read_csv('https://raw.githubusercontent.com/michalis0/DataMining_and_MachineLearning/master/data/sales.csv')

# convert to datetime
df['Order Date'] = pd.to_datetime(df['Order Date'])

# extract year
df['OrderYear'] = df['Order Date'].dt.year

# pivot table
pt = df.pivot_table(index='OrderYear', columns='Segment', values='Sales', aggfunc='mean')

# display(pt)
Segment      Consumer   Corporate  Home Office
OrderYear                                     
2015       251.633302  212.641424   290.234240
2016       238.629760  196.834867   222.101830
2017       223.269145  264.486862   228.730257
2018       200.368580  243.595111   242.532951

# plot bar
ax1 = pt.plot(kind='bar', rot=0)

# or plot line
ax2 = pt.plot(xticks=pt.index)

示例数据

  • 如果 github 存储库不再可用
Order Date,Segment,Sales
08/04/2018,Home Office,195.76
23/05/2015,Consumer,17.96
01/12/2018,Consumer,406.368
26/03/2016,Consumer,40.032
04/11/2015,Corporate,8.376
14/11/2017,Consumer,48.576
08/05/2017,Consumer,10.368
25/11/2015,Consumer,539.92
17/06/2017,Consumer,319.41
14/01/2015,Corporate,61.96
01/12/2018,Consumer,105.584
24/11/2016,Consumer,91.392
12/07/2015,Consumer,123.136
26/01/2018,Consumer,11.84
01/03/2015,Consumer,137.352
06/12/2016,Consumer,19.92
07/04/2017,Consumer,3.64
17/09/2016,Consumer,4228.704
29/09/2018,Home Office,7.968
17/08/2018,Corporate,2518.29
04/11/2015,Home Office,275.94
20/06/2017,Consumer,17.712
09/09/2018,Consumer,6.56
03/03/2017,Corporate,563.43
11/10/2017,Consumer,27.72
08/12/2018,Consumer,19.44
01/06/2015,Home Office,47.88
28/10/2017,Corporate,756.8
31/07/2016,Consumer,2309.65
08/11/2016,Corporate,4.712
20/10/2016,Corporate,16.02
23/12/2018,Corporate,367.96
15/02/2016,Corporate,134.97
29/12/2015,Consumer,23.976
05/10/2018,Home Office,39.92
25/06/2016,Home Office,31.104
28/10/2017,Consumer,47.952
25/09/2015,Home Office,3.264
18/12/2016,Corporate,18.432
07/09/2018,Consumer,25.16
26/06/2017,Home Office,8.02
16/06/2018,Consumer,18.54
06/12/2016,Consumer,198.272
04/05/2018,Corporate,9.396
23/10/2018,Consumer,10.272
21/02/2017,Corporate,39.98
22/07/2015,Home Office,19.68
29/09/2018,Home Office,27.968
03/08/2015,Consumer,218.75
07/10/2018,Home Office,18.936
18/04/2016,Consumer,115.44
04/04/2016,Consumer,644.076
03/07/2018,Home Office,24.56
10/11/2016,Consumer,577.584
12/05/2018,Consumer,87.4
21/02/2017,Home Office,3.762
18/08/2018,Consumer,21.38
13/07/2016,Consumer,11.808
17/12/2018,Consumer,66.284
02/12/2015,Corporate,58.36
01/12/2015,Consumer,45.84
23/05/2016,Home Office,850.5
14/10/2015,Corporate,22.92
23/10/2018,Corporate,11.56
20/07/2015,Corporate,41.94
16/06/2016,Consumer,133.98
02/09/2015,Consumer,21.24
11/11/2017,Corporate,95.968
03/10/2015,Home Office,6.192
19/11/2018,Consumer,25.06
25/08/2015,Consumer,40.096
29/12/2018,Consumer,34.58
05/12/2018,Consumer,11.07
23/07/2017,Consumer,4.448
05/03/2016,Consumer,11.212
09/06/2015,Consumer,911.424
21/11/2016,Consumer,10.92
13/02/2018,Consumer,10.71
27/04/2016,Consumer,1379.92
30/10/2018,Home Office,33.94
08/08/2016,Consumer,447.86
07/12/2016,Consumer,79.92
21/08/2018,Corporate,33.18
26/01/2015,Home Office,19.44
09/06/2015,Consumer,1706.184
26/09/2016,Consumer,79.056
05/04/2016,Home Office,547.136
27/10/2018,Corporate,5.607
03/07/2016,Consumer,294.93
16/11/2015,Home Office,169.45
08/12/2015,Corporate,60.416
23/11/2016,Consumer,16.56
05/10/2018,Home Office,75.792
19/03/2016,Consumer,17.568
21/08/2017,Corporate,2887.056
25/04/2016,Corporate,21.34
14/05/2017,Corporate,4.768
03/11/2016,Home Office,42.6
21/10/2017,Consumer,22.92
10/07/2018,Corporate,41.91
16/11/2018,Consumer,811.28
17/09/2018,Corporate,10.776
01/12/2018,Home Office,62.958
07/12/2018,Consumer,374.376
19/11/2018,Consumer,821.88
16/06/2018,Consumer,23.92
19/05/2017,Consumer,242.9
06/06/2017,Corporate,105.52
05/12/2015,Corporate,29.94
12/08/2018,Consumer,299.99
08/04/2018,Home Office,41.95
04/10/2015,Consumer,95.648
25/11/2017,Consumer,194.352
18/09/2016,Corporate,11.68
20/12/2016,Home Office,11.696
24/04/2017,Consumer,3.984
14/05/2015,Corporate,310.88
22/09/2015,Consumer,579.528
02/05/2015,Consumer,26.136
19/08/2015,Corporate,69.456
08/07/2018,Corporate,28.91
26/11/2015,Corporate,7.312
24/06/2018,Consumer,21.744
12/11/2018,Consumer,221.024
27/08/2016,Consumer,3.08
18/11/2018,Consumer,127.386
21/11/2016,Corporate,246.1328
12/05/2017,Consumer,120.0
30/12/2017,Home Office,481.32
20/07/2016,Consumer,913.43
23/11/2018,Corporate,10.688
23/04/2015,Home Office,22.336
17/09/2016,Consumer,3.264
20/10/2016,Consumer,24.56
04/06/2017,Consumer,14.94
19/11/2016,Consumer,5.984
30/07/2016,Consumer,209.93
17/09/2016,Consumer,110.96
12/10/2016,Consumer,263.96
02/09/2017,Consumer,65.94
12/10/2016,Consumer,81.96
14/05/2016,Home Office,198.272
09/12/2018,Corporate,37.208
23/05/2017,Consumer,122.382
23/09/2018,Consumer,199.95
28/12/2015,Corporate,704.25
19/01/2018,Consumer,6.0
12/10/2016,Home Office,19.9
14/11/2016,Corporate,37.0
03/10/2018,Home Office,6.63
20/07/2015,Consumer,104.85
10/09/2015,Consumer,1487.04
12/10/2018,Corporate,39.984
23/12/2015,Corporate,56.52
17/11/2016,Consumer,106.32
18/03/2015,Home Office,1856.19
01/09/2016,Home Office,1088.76
05/07/2016,Home Office,19.0
03/11/2015,Home Office,6.72
28/05/2017,Consumer,22.72
13/06/2018,Home Office,164.736
26/09/2016,Consumer,239.8
12/10/2018,Consumer,17.9
02/10/2018,Corporate,21.984
12/11/2018,Home Office,23.12
21/01/2018,Home Office,242.94
09/08/2015,Consumer,2060.744
25/04/2016,Consumer,128.058
04/03/2018,Corporate,15.25
04/08/2017,Home Office,35.06
18/12/2016,Corporate,55.936
19/12/2016,Consumer,675.96
12/07/2016,Consumer,659.168
06/04/2015,Corporate,70.95
19/05/2018,Home Office,681.408
09/07/2016,Consumer,153.36
21/08/2016,Home Office,4.28
22/05/2018,Consumer,22.344
26/08/2015,Consumer,17.34
19/09/2016,Corporate,66.36
06/11/2018,Home Office,449.568
21/11/2017,Consumer,21.568
24/12/2017,Home Office,27.882
09/07/2015,Home Office,23.92
05/08/2016,Corporate,33.488
20/11/2017,Consumer,2.628
07/03/2015,Corporate,481.568
25/11/2017,Consumer,59.98
14/07/2018,Consumer,276.69
03/10/2015,Consumer,14.48
28/07/2017,Home Office,302.72
05/09/2017,Corporate,43.6
16/03/2016,Home Office,17.52
02/09/2017,Home Office,84.272
22/06/2015,Consumer,170.058
08/07/2018,Home Office,86.376
01/11/2016,Home Office,3.168
04/11/2017,Consumer,11.376
18/12/2018,Consumer,46.672
05/12/2017,Consumer,465.18