How can I convert one column of a pandas dataframe into the column headers, and melt the rest into long format?

I have three similarly formatted dataframes. Each one is the result of a pandas groupby on a different source dataset.

import pandas as pd

df_17 = pd.DataFrame(
     [['Students',550, 75, 325, 100, 2017], ['Staff',10, 3, 7, 6, 2017], ['Teachers',21, 8, 16, 13, 2017]], 
     columns = ['Category', 'Main', 'Pre-K', 'North', 'Downtown', 'Year']).set_index('Category')

df_18 = pd.DataFrame(
     [['Students',565, 70, 321, 2018], ['Staff',11, 3, 6, 2018], ['Teachers',22, 8, 17, 2018]], 
     columns = ['Category', 'Main', 'Pre-K', 'North', 'Year']).set_index('Category')

df_19 = pd.DataFrame(
     [['Students',610, 75, 12, 110, 2019], ['Staff',10, 4, 0, 6, 2019], ['Teachers',24, 9, 1, 16, 2019]], 
     columns = ['Category', 'Main', 'Pre-K', 'Park', 'Downtown', 'Year']).set_index('Category')

df_17
          Main  Pre-K  North  Downtown  Year
Category                                    
Students   550     75    325       100  2017
Staff       10      3      7         6  2017
Teachers    21      8     16        13  2017

df_18 
          Main  Pre-K  North  Year
Category                          
Students   565     70    321  2018
Staff       11      3      6  2018
Teachers    22      8     17  2018

df_19
          Main  Pre-K  Park  Downtown  Year
Category                                   
Students   610     75    12       110  2019
Staff       10      4     0         6  2019
Teachers    24      9     1        16  2019

I want to combine them into a single long-format dataframe with a separate column for each year, like this:

    Category    Campus 2017 2018 2019
0   Students      Main  550  565  610
1   Students     Pre-K   75   70   75
2   Students     North  325  321  NaN
3   Students  Downtown  100  NaN  110
4   Students      Park  NaN  NaN   12
5      Staff      Main   10   11   10
6      Staff     Pre-K    3    3    4
7      Staff     North    7    6  NaN
8      Staff  Downtown    6  NaN    6
9      Staff      Park  NaN  NaN    0
10  Teachers      Main   21   22   24
11  Teachers     Pre-K    8    8    9
12  Teachers     North   16   17  NaN
13  Teachers  Downtown   13  NaN   16
14  Teachers      Park  NaN  NaN    1

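I have tried various combinations of merge, melt, stack, unstack, pivot, and so on, but I have not been able to find the right one.

The closest I have gotten so far: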

df = pd.merge(df_17, df_18, on = ['Category', 'Main', 'Pre-K', 'North', 'Year'], how = 'outer')
df = pd.merge(df, df_19, on = ['Category', 'Main', 'Pre-K', 'Downtown'], how = 'outer')
df = df.stack()

Category          
Students  Main         550.0
          Pre-K         75.0
          North        325.0
          Downtown     100.0
          Year_x      2017.0
Staff     Main          10.0
          Pre-K          3.0
          North          7.0
          Downtown       6.0
          Year_x      2017.0
Teachers  Main          21.0
          Pre-K          8.0
          North         16.0
          Downtown      13.0
          Year_x      2017.0
Students  Main         565.0
          Pre-K         70.0
          North        321.0
          Year_x      2018.0
Staff     Main          11.0
          Pre-K          3.0
          North          6.0
          Year_x      2018.0
Teachers  Main          22.0
          Pre-K          8.0
          North         17.0
          Year_x      2018.0
Students  Main         610.0
          Pre-K         75.0
          Downtown     110.0
          Park          12.0
          Year_y      2019.0
Staff     Main          10.0
          Pre-K          4.0
          Downtown       6.0
          Park           0.0
          Year_y      2019.0
Teachers  Main          24.0
          Pre-K          9.0
          Downtown      16.0
          Park           1.0
          Year_y      2019.0
dtype: float64

What am I missing?

You can pd.concat the dataframes, .melt them, and then use .pivot_table to convert the result to the desired format:
df = pd.concat([df_17,df_18,df_19]).reset_index()
df = pd.melt(df, id_vars=['Category', 'Year'], var_name = 'Campus') \
       .pivot_table(index=['Category', 'Campus'], columns='Year', values='value') \
       .reset_index()
df.columns.name = None  # Clear the leftover 'Year' label on the columns axis
df

Output:

     Category   Campus      2017    2018    2019
0    Staff      Downtown    6.0     NaN     6.0
1    Staff      Main        10.0    11.0    10.0
2    Staff      North       7.0     6.0     NaN
3    Staff      Park        NaN     NaN     0.0
4    Staff      Pre-K       3.0     3.0     4.0
5    Students   Downtown    100.0   NaN     110.0
6    Students   Main        550.0   565.0   610.0
7    Students   North       325.0   321.0   NaN
8    Students   Park        NaN     NaN     12.0
9    Students   Pre-K       75.0    70.0    75.0
10   Teachers   Downtown    13.0    NaN     16.0
11   Teachers   Main        21.0    22.0    24.0
12   Teachers   North       16.0    17.0    NaN
13   Teachers   Park        NaN     NaN     1.0
14   Teachers   Pre-K       8.0     8.0     9.0
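
Note that the result above is sorted alphabetically and holds floats, while the desired table in the question keeps the original row order (Students, Staff, Teachers; Main, Pre-K, North, Downtown, Park) and shows whole numbers. If that matters, one possible follow-up is sketched below. It is only an illustration: the names `wide`, `cat_order`, `campus_order`, and `year_cols` are made up for this example, the two order lists are typed in by hand from the question, and the nullable `Int64` dtype is an optional choice that assumes a reasonably recent pandas.

```python
import pandas as pd

# Same sample data as in the question.
df_17 = pd.DataFrame(
    [['Students', 550, 75, 325, 100, 2017], ['Staff', 10, 3, 7, 6, 2017], ['Teachers', 21, 8, 16, 13, 2017]],
    columns=['Category', 'Main', 'Pre-K', 'North', 'Downtown', 'Year']).set_index('Category')
df_18 = pd.DataFrame(
    [['Students', 565, 70, 321, 2018], ['Staff', 11, 3, 6, 2018], ['Teachers', 22, 8, 17, 2018]],
    columns=['Category', 'Main', 'Pre-K', 'North', 'Year']).set_index('Category')
df_19 = pd.DataFrame(
    [['Students', 610, 75, 12, 110, 2019], ['Staff', 10, 4, 0, 6, 2019], ['Teachers', 24, 9, 1, 16, 2019]],
    columns=['Category', 'Main', 'Pre-K', 'Park', 'Downtown', 'Year']).set_index('Category')

# concat -> melt -> pivot_table, as in the answer above.
wide = (pd.concat([df_17, df_18, df_19])
          .reset_index()
          .melt(id_vars=['Category', 'Year'], var_name='Campus')
          .pivot_table(index=['Category', 'Campus'], columns='Year', values='value')
          .reset_index())
wide.columns.name = None

# Hand-written row orders taken from the desired output in the question (assumption).
cat_order = ['Students', 'Staff', 'Teachers']
campus_order = ['Main', 'Pre-K', 'North', 'Downtown', 'Park']

# Ordered categoricals make sort_values follow the custom order instead of the alphabet.
wide['Category'] = pd.Categorical(wide['Category'], categories=cat_order, ordered=True)
wide['Campus'] = pd.Categorical(wide['Campus'], categories=campus_order, ordered=True)
wide = wide.sort_values(['Category', 'Campus']).reset_index(drop=True)

# Optional: nullable integers keep the missing cells while showing whole numbers.
year_cols = [2017, 2018, 2019]
wide = wide.astype({year: 'Int64' for year in year_cols})

print(wide)
```

The categorical columns can be converted back to plain strings with `.astype(str)` once the sort is done, if later code expects object columns.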