Pivoting pandas dataframe to return two values per column
I have a df like this:
product_id category cost_centre total_orders price created_at total_sales
868 Phone Google 2 41 2021-04-12 82
86 Phone Facebook 2 30 2021-04-12 60
86 Phone Google 1 30 2021-04-11 30
861 PC Facebook 1 42 2021-04-10 42
862 Tablet Apple 2 4 2021-04-15 8
I pivoted it like this:
df1 = a.pivot_table(index='cost_centre', columns='category', values='total_sales', aggfunc=sum,
fill_value=0,).add_prefix('total_sales_').rename_axis(columns=None)
which returns:
total_sales_Phone total_sales_PC total_sales_Tablet
cost_centre
Google 2,948.04 23,041.53 30,973.28
Facebook 3,005.81 11,078.10 3,429.00
Apple 3,873.45 31,725.11 89,072.78
I am trying to get it to look like this (the data is not correct):
              2021-04-12            2021-04-13       2021-04-14       ...
cost_centre   Sales        Orders   Sales   Orders   Sales   Orders   ...
Google        2*41 + 1*30  3        # the first line would be the total of a cost_centre
  Phone       2*41 + 1*30  3
  PC          0            0
  Tablet      0            0
Facebook      2*30 + 1*42  3
  Phone       2*30         2
  PC          1*30         1
  Tablet      0            0
Apple         2*4          2
  Phone       0            0
  PC          0            0
  Tablet      2*4          2
I tried:
df1 = a.pivot_table(index=['cost_centre','category'], columns='created_at', values='total_sales',
aggfunc=sum, fill_value=0,).add_prefix('total_sales_').rename_axis(columns=None)
This returns total_sales as expected, but when I add values = ['total_sales','total_orders'] it breaks and returns TypeError: Must pass list-like as names.
Your operation is fairly complex. You need to create two pivot tables and concatenate them:
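import numpy as np
import pandas as pd

# Sum sales and orders per (date, cost centre, category), then move the dates into the columns.
df_pivot1 = pd.pivot_table(df, index=['created_at', 'cost_centre', 'category'],
                           values=['total_sales', 'total_orders'],
                           aggfunc=[np.sum]).unstack(level=0)
# Per-cost-centre totals across all categories.
df_pivot2 = df_pivot1.groupby('cost_centre').sum()
# Append the totals as an extra 'total' category, then tidy the column levels.
df2 = (pd.concat([df_pivot1,
                  pd.concat({'total': df_pivot2},
                            names=['category']).swaplevel()
                 ])
         .fillna(0)
         .astype(int)
         .reorder_levels([2,1,0], axis=1)  # (date, metric, aggfunc)
         .sort_index(axis=1)
         .sort_index(axis=0)
         .droplevel(2, axis=1)             # drop the 'sum' level
      )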
Output:
created_at              2021-04-10                  2021-04-11                  2021-04-12                  2021-04-15
                      total_orders   total_sales  total_orders   total_sales  total_orders   total_sales  total_orders   total_sales
cost_centre category
Apple       Tablet               0             0             0             0             0             0             2             8
            total                0             0             0             0             0             0             2             8
Facebook    PC                   1            42             0             0             0             0             0             0
            Phone                0             0             0             0             2            60             0             0
            total                1            42             0             0             2            60             0             0
Google      Phone                0             0             1            30             2            82             0             0
            total                0             0             1            30             2            82             0             0
Note: the order of the cost centres, categories, etc. is not exactly the same as yours, but that sorting is simple, so I left it out for clarity.
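For completeness, here is a minimal sketch of one way the ordering step could be done. It is not part of the answer above. It rebuilds the sample rows from the question, pivots both value columns in a single pivot_table call, and then reindexes so that every cost centre shows a total row first, followed by all three categories. The names per_category, totals, full_index and result, and the display order in centres and categories, are assumptions chosen for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "product_id": [868, 86, 86, 861, 862],
    "category": ["Phone", "Phone", "Phone", "PC", "Tablet"],
    "cost_centre": ["Google", "Facebook", "Google", "Facebook", "Apple"],
    "total_orders": [2, 2, 1, 1, 2],
    "price": [41, 30, 30, 42, 4],
    "created_at": ["2021-04-12", "2021-04-12", "2021-04-11",
                   "2021-04-10", "2021-04-15"],
    "total_sales": [82, 60, 30, 42, 8],
})

# Rows: (cost_centre, category). Columns: (created_at, metric).
per_category = (
    df.pivot_table(index=["cost_centre", "category"], columns="created_at",
                   values=["total_sales", "total_orders"],
                   aggfunc="sum", fill_value=0)
      .swaplevel(axis=1)    # put the date above the metric name
      .sort_index(axis=1)
)

# One total row per cost centre, labelled with the category 'total'.
totals = per_category.groupby(level="cost_centre").sum()
totals.index = pd.MultiIndex.from_product(
    [totals.index, ["total"]], names=["cost_centre", "category"])

# Assumed display order: total first, then every category, even when empty.
centres = ["Google", "Facebook", "Apple"]
categories = ["total", "Phone", "PC", "Tablet"]
full_index = pd.MultiIndex.from_product(
    [centres, categories], names=["cost_centre", "category"])

result = (pd.concat([per_category, totals])
            .reindex(full_index, fill_value=0)  # adds the missing all-zero rows
            .astype(int))
print(result)
```

Reindexing against the full product of cost centres and categories both fixes the row order and adds the all-zero rows (for example Google / PC) that the desired layout in the question shows.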