Pivot_table 列的多索引

Pivot_table MultiIndex to columns

我有以下 table :

In [303]: table.head()
Out[303]: 
            people  weekday  weekofyear
2012-01-01     119        6          52
2012-01-02      76        0           1
2012-01-03      95        1           1
2012-01-04     102        2           1
2012-01-05      87        3           1

我想创建一个简单的 pd.DataFrame 其中:

我尝试使用 pd.pivot_table 得到了预期的结果:

In [308]: p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values=["people"], aggfunc=[np.sum])
     ...: p
     ...: 
Out[308]: 
              sum                                             ...             \
           people                                             ...              
weekofyear     1    2    3    4    5    6    7    8   9    10 ...    43   44   
weekday                                                       ...              
0             162   86   84   95   92   98  108  102  97   87 ...   108   86   
1              95  113   88   78  108  112   98  104  87  105 ...    85   82   
2             102   70   93   82  103   80  103   85  82   96 ...    87  105   
3              87   91  101   83   91  100  100   80  89   86 ...    87   91   
4             111   91  110  103   93  116  110   99  78   77 ...    83  102   
5             117  107   99   88   97   90  100   91  97   88 ...   103  110   
6              92   95   90   86   91  103   98  100  89   96 ...    94  101   



weekofyear   45   46   47   48   49   50   51   52  
weekday                                             
0            99   92   99   83  107  106   93  107  
1           105   83  101   93  102   89  113   84  
2            96   84  110   83  104   84   84  116  
3            87   96   87   88   88   83  113   93  
4            93   81  104  108   72  101  109   97  
5            81  107   97   89   86  108  113  101  
6            93   92   93   91   89   96   93  226  

[7 rows x 52 columns]

but 而不是我的 weekofyears 列,我被无法摆脱的 MultiIndex 困住了。如下图:

In [309]: p.columns
Out[309]: 
MultiIndex(levels=[['sum'], ['people'], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]],
           names=[None, None, 'weekofyear']

虽然索引看起来不错:

In [311]: p.index
Out[311]: Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64', name='weekday'  

我尝试使用 unstack()reset_index() 函数,但没有成功。

我是不是漏掉了什么?

与其为 valuesaggfunc 提供列表,不如尝试为它们提供单个值。示例 -

p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values="people", aggfunc=np.sum)

演示 -

In [3]: table
Out[3]:
            people  weekday  weekofyear
2012-01-01     119        6          52
2012-01-02      76        0           1
2012-01-03      95        1           1
2012-01-04     102        2           1
2012-01-05      87        3           1

In [12]: p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values="people", aggfunc=np.sum)

In [13]: p
Out[13]:
weekofyear   1    52
weekday
0            76  NaN
1            95  NaN
2           102  NaN
3            87  NaN
6           NaN  119

In [14]: p.columns
Out[14]: Int64Index([1, 52], dtype='int64', name='weekofyear')

来自documentation -

aggfunc : function, default numpy.mean, or list of functions
If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)

values 的情况类似,但文档中未具体提及