Pandas Pivot Table with multilevel index

I have a df containing articles and their yearly sales. I want to turn it into a pivot table, but with two levels of index.

My df:

date    brand_id    brand_name  art_id  art_name    count_art
2015    1           cat         10      A           120
2016    1           cat         10      A           100
2017    1           cat         12      B           80
2015    2           dog         20      C           100
2016    2           dog         25      D           110
2015    3           bird        30      E           50
2017    3           bird        31      F           90
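To make the question reproducible, the table above can be loaded roughly as follows. This is only a sketch: parsing the whitespace-separated text with read_csv is an assumed way of building the frame, and the variable name raw is made up for illustration.

```python
import io

import pandas as pd

# The sample data from the question, as whitespace-separated text.
raw = """
date    brand_id    brand_name  art_id  art_name    count_art
2015    1           cat         10      A           120
2016    1           cat         10      A           100
2017    1           cat         12      B           80
2015    2           dog         20      C           100
2016    2           dog         25      D           110
2015    3           bird        30      E           50
2017    3           bird        31      F           90
"""

# sep=r"\s+" splits on any run of whitespace, so the column alignment does not matter.
df = pd.read_csv(io.StringIO(raw.strip()), sep=r"\s+")
print(df)
```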

The result I want looks like this:

                                2015                            2016                            2017            
brand_id    brand_name  art_id  art_name    count_art   art_id  art_name    count_art   art_id  art_name    count_art
1           cat         10      A           120         10      A           100         12      B           80      
2           dog         20      C           100         25      D           110         null    null        null    
3           bird        30      E           50          null    null        null        31      F           90  

So far I have tried the following command:

transformed_data = df.pivot_table(values=['art_id', 'art_name', 'count_art'], index=['brand_id', 'brand_name'], columns='date', aggfunc='first')

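But it does not work as expected. I know how to turn rows into yearly columns, but I don't know how to turn multiple columns spread over multiple rows into a single row with more columns.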

IIUC, use pivot_table and select the value columns in the desired order. Then use swaplevel to put the date level on top, and sort_index with sort_remaining=False so that only the dates are sorted and the order of the value columns within each date is kept:

new_cols = ['art_id', 'art_name', 'count_art']
transformed_data = (
 df.pivot_table(values=new_cols,
               index=['brand_id', 'brand_name'],
               columns=['date'], aggfunc='first')
   [new_cols]
   .swaplevel(axis=1)
   .sort_index(level=0, axis=1, sort_remaining=False)
)

Output:

date                  2015                      2016                      2017                   
                    art_id art_name count_art art_id art_name count_art art_id art_name count_art
brand_id brand_name                                                                              
1        cat          10.0        A     120.0   10.0        A     100.0   12.0        B      80.0
2        dog          20.0        C     100.0   25.0        D     110.0    NaN      NaN       NaN
3        bird         30.0        E      50.0    NaN      NaN       NaN   31.0        F      90.0
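To see what each step of the chain contributes, it can help to split it up and look at the column MultiIndex after every step. The following is a sketch under the assumption that the frame is built from the sample data in the question; the names step1, step2 and step3 are only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "date": [2015, 2016, 2017, 2015, 2016, 2015, 2017],
    "brand_id": [1, 1, 1, 2, 2, 3, 3],
    "brand_name": ["cat", "cat", "cat", "dog", "dog", "bird", "bird"],
    "art_id": [10, 10, 12, 20, 25, 30, 31],
    "art_name": ["A", "A", "B", "C", "D", "E", "F"],
    "count_art": [120, 100, 80, 100, 110, 50, 90],
})
new_cols = ["art_id", "art_name", "count_art"]

# Step 1: pivot. The value names form the outer column level, the dates the inner one.
step1 = df.pivot_table(values=new_cols,
                       index=["brand_id", "brand_name"],
                       columns="date", aggfunc="first")
print(step1.columns.names)

# Step 2: fix the order of the value columns, then swap the two column levels
# so that the dates become the outer level.
step2 = step1[new_cols].swaplevel(axis=1)
print(list(step2.columns[:3]))

# Step 3: sort on the date level only; sort_remaining=False leaves the
# order of the value columns within each date untouched.
step3 = step2.sort_index(level=0, axis=1, sort_remaining=False)
print(list(step3.columns[:3]))
```

After step 2 the dates are on top but still interleaved (all art_id columns first); step 3 is what groups the three value columns under each year.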

Add DataFrame.swaplevel with DataFrame.sort_index:

df = (df.pivot_table(values=['art_id', 'art_name', 'count_art'], 
                    index=['brand_id', 'brand_name'], 
                    columns='date', 
                    aggfunc='first')
        .swaplevel(1, 0, axis=1)
        .sort_index(level=0, axis=1, sort_remaining=False))
print(df)
date                  2015                      2016                     \
                    art_id art_name count_art art_id art_name count_art   
brand_id brand_name                                                       
1        cat          10.0        A     120.0   10.0        A     100.0   
2        dog          20.0        C     100.0   25.0        D     110.0   
3        bird         30.0        E      50.0    NaN      NaN       NaN   

date                  2017                     
                    art_id art_name count_art  
brand_id brand_name                            
1        cat          12.0        B      80.0  
2        dog           NaN      NaN       NaN  
3        bird         31.0        F      90.0
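The integer columns come back as floats (10.0 rather than 10) because the missing combinations are filled with NaN. If that matters, one possible follow-up, which is not part of the answers above, is to cast those columns to pandas' nullable Int64 dtype after pivoting. A minimal sketch, assuming the sample data from the question; the name wide is made up for illustration:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "date": [2015, 2016, 2017, 2015, 2016, 2015, 2017],
    "brand_id": [1, 1, 1, 2, 2, 3, 3],
    "brand_name": ["cat", "cat", "cat", "dog", "dog", "bird", "bird"],
    "art_id": [10, 10, 12, 20, 25, 30, 31],
    "art_name": ["A", "A", "B", "C", "D", "E", "F"],
    "count_art": [120, 100, 80, 100, 110, 50, 90],
})

wide = (df.pivot_table(values=["art_id", "art_name", "count_art"],
                       index=["brand_id", "brand_name"],
                       columns="date",
                       aggfunc="first")
          .swaplevel(1, 0, axis=1)
          .sort_index(level=0, axis=1, sort_remaining=False))

# Cast the numeric value columns to the nullable integer dtype,
# so that missing entries become <NA> instead of forcing floats.
for col in list(wide.columns):
    if col[1] in ("art_id", "count_art"):
        wide[col] = wide[col].astype("Int64")

print(wide)
```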