Pivoting pandas dataframe to return two values per column

I have a df like this:

product_id      category    cost_centre     total_orders    price           created_at          total_sales
868             Phone       Google          2               41              2021-04-12          82
86              Phone       Facebook        2               30              2021-04-12          60
86              Phone       Google          1               30              2021-04-11          30
861             PC          Facebook        1               42              2021-04-10          42
862             Tablet      Apple           2               4               2021-04-15          8            

I pivot it like this:

df1 = a.pivot_table(index='cost_centre', columns='category', values='total_sales', aggfunc=sum,
                    fill_value=0,).add_prefix('total_sales_').rename_axis(columns=None)

which returns

                    total_sales_Phone        total_sales_PC       total_sales_Tablet
cost_centre                             
Google              2,948.04                 23,041.53            30,973.28
Facebook            3,005.81                 11,078.10            3,429.00
Apple               3,873.45                 31,725.11            89,072.78  

I am trying to make it look like this (the data is not accurate):

                    2021-04-12                      2021-04-13          2021-04-14       ...

cost_centre         Sales             Orders        Sales   Orders      Sales   Orders   ...                 
Google              2*41 + 1*30       3             # the first line would be the total of a cost_centre
     Phone          2*41 + 1*30       3
     PC             0                 0
     Tablet         0                 0
Facebook            2*30  + 1*42      3
     Phone          2*30              2
     PC             1*30              1
     Tablet         0                 0
Apple               2*4               2 
     Phone          0                 0
     PC             0                 0
     Tablet         2*4               2

I tried:

df1 = a.pivot_table(index=['cost_centre','category'], columns='created_at', values='total_sales', 
                    aggfunc=sum, fill_value=0,).add_prefix('total_sales_').rename_axis(columns=None)

This returns total_sales, but when I add values = ['total_sales','total_orders'] it breaks and returns

TypeError: Must pass list-like as names.

Your operation is fairly complex. You need to create two pivot tables and concatenate them:

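import numpy as np
import pandas as pd

# Per-category rows: aggregate both value columns, then move created_at into the columns
df_pivot1 = pd.pivot_table(df, index=['created_at', 'cost_centre', 'category'],
                           values=['total_sales', 'total_orders'],
                           aggfunc=[np.sum]).unstack(level=0)
# Per-cost_centre totals across all categories
df_pivot2 = df_pivot1.groupby('cost_centre').sum()

df2 = (pd.concat([df_pivot1,
                  # label the totals with category='total' so both indexes line up
                  pd.concat({'total':  df_pivot2},
                            names=['category']).swaplevel()
                  ])
         .fillna(0)
         .astype(int)
         .reorder_levels([2,1,0], axis=1)   # columns become (created_at, value, aggfunc)
         .sort_index(axis=1)
         .sort_index(axis=0)
         .droplevel(2, axis=1)              # drop the aggfunc level ('sum')
      )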

Output:

created_at             2021-04-10               2021-04-11               2021-04-12               2021-04-15            
                     total_orders total_sales total_orders total_sales total_orders total_sales total_orders total_sales
cost_centre category                                                                                                    
Apple       Tablet              0           0            0           0            0           0            2           8
            total               0           0            0           0            0           0            2           8
Facebook    PC                  1          42            0           0            0           0            0           0
            Phone               0           0            0           0            2          60            0           0
            total               1          42            0           0            2          60            0           0
Google      Phone               0           0            1          30            2          82            0           0
            total               0           0            1          30            2          82            0           0
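
As for the TypeError in the question: it most likely comes from rename_axis(columns=None) rather than from pivot_table itself. Once values is a list, the columns become a two-level MultiIndex, and a single scalar name can no longer be assigned to it. The sketch below reproduces this on a small frame built from the sample rows in the question; the names `wide` and `renamed` are only for illustration.

```python
import pandas as pd

# Small frame using the sample rows from the question
a = pd.DataFrame({
    "category": ["Phone", "Phone", "Phone", "PC", "Tablet"],
    "cost_centre": ["Google", "Facebook", "Google", "Facebook", "Apple"],
    "total_orders": [2, 2, 1, 1, 2],
    "created_at": ["2021-04-12", "2021-04-12", "2021-04-11",
                   "2021-04-10", "2021-04-15"],
    "total_sales": [82, 60, 30, 42, 8],
})

wide = a.pivot_table(index=["cost_centre", "category"], columns="created_at",
                     values=["total_sales", "total_orders"],
                     aggfunc="sum", fill_value=0)

# With a list of values the columns have two levels: (value name, created_at)
print(wide.columns.nlevels)

try:
    wide.rename_axis(columns=None)   # a scalar name for a two-level column index
    raised = False
except TypeError as exc:
    raised = True
    print(exc)

# Passing one name per level (or simply dropping rename_axis) avoids the error
renamed = wide.rename_axis(columns=[None, None])
```
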

Note: the order of cost_centre/categories/etc. is not exactly the same as yours, but that sorting is straightforward, so I left it out for clarity.
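
For completeness, here is one way the omitted ordering could be done, as a minimal sketch rather than the only option. It rebuilds the table on the sample rows from the question and then uses reindex with an explicit row order. The lists `centre_order` and `category_order` are assumptions taken from the layout shown in the question, and the pivot is written slightly more compactly than above (string aggfunc, created_at passed as columns).

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["Phone", "Phone", "Phone", "PC", "Tablet"],
    "cost_centre": ["Google", "Facebook", "Google", "Facebook", "Apple"],
    "total_orders": [2, 2, 1, 1, 2],
    "created_at": ["2021-04-12", "2021-04-12", "2021-04-11",
                   "2021-04-10", "2021-04-15"],
    "total_sales": [82, 60, 30, 42, 8],
})

# Per-category rows, with (value name, created_at) columns
detail = df.pivot_table(index=["cost_centre", "category"], columns="created_at",
                        values=["total_sales", "total_orders"],
                        aggfunc="sum", fill_value=0)

# Per-cost_centre totals, labelled with category='total'
totals = detail.groupby(level="cost_centre").sum()
totals = pd.concat({"total": totals}, names=["category"]).swaplevel()

# Stack both pieces and put the date on the top column level
df2 = pd.concat([detail, totals]).swaplevel(axis=1).sort_index(axis=1)

# Assumed display order, copied from the desired layout in the question
centre_order = ["Google", "Facebook", "Apple"]
category_order = ["total", "Phone", "PC", "Tablet"]
full_index = pd.MultiIndex.from_product(
    [centre_order, category_order], names=["cost_centre", "category"])

# reindex fixes the row order and adds all-zero rows for missing categories
ordered = df2.reindex(full_index, fill_value=0)
print(ordered)
```

Because reindex is given the full product of both lists, categories that never occur for a cost_centre (for example PC under Google) appear as rows of zeros, which matches the layout sketched in the question.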