Create columns for the sum of total revenue as well as monthly revenue, using groupby
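I am trying to create a column containing the monthly total as well as a column containing the total for each VIN. Please help me get the output DataFrame.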
Final_Data = pd.DataFrame(Final_Data.groupby(by=['VIN', 'Month'],
    as_index=False)['Dealers_Revenue'].sum())
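Current DataFrame Final_Data:
VIN | Revenue | Category | Month |
---|---|---|---|
v1 | 30 | MKL | 64 |
v1 | 50 | GKL | 64 |
v1 | 40 | GKL | 64 |
v1 | 30 | UKL | 63 |
v1 | 40 | MKL | 63 |
v2 | 30 | MKL | 63 |
v2 | 50 | GKL | 63 |
v2 | 40 | GKL | 62 |
v2 | 30 | UKL | 62 |
v2 | 40 | MKL | 61 |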
Desired output in the DataFrame Final_Data:
import pandas as pd

# create DataFrame
df = pd.DataFrame({
'VIN':['v1', 'v1', 'v1', 'v1', 'v1', 'v2', 'v2', 'v2', 'v2', 'v2'],
'Revenue':[30, 50, 40, 30, 40, 30, 50, 40, 30, 50],
'Category':['MKL', 'GKL','GKL', 'UKL', 'MKL', 'MKL', 'GKL', 'GKL', 'UKL', 'MKL'],
'Month':[64, 64, 64, 63, 63, 63, 63, 62, 62, 61]
})
print(df)
VIN Revenue Category Month
0 v1 30 MKL 64
1 v1 50 GKL 64
2 v1 40 GKL 64
3 v1 30 UKL 63
4 v1 40 MKL 63
5 v2 30 MKL 63
6 v2 50 GKL 63
7 v2 40 GKL 62
8 v2 30 UKL 62
9 v2 50 MKL 61
# collect Revenue and Category values into lists for each VIN and Month
df_group = df.groupby(['VIN','Month']).agg(list).reset_index()
print(df_group)
VIN Month Revenue Category
0 v1 63 [30, 40] [UKL, MKL]
1 v1 64 [30, 50, 40] [MKL, GKL, GKL]
2 v2 61 [50] [MKL]
3 v2 62 [40, 30] [GKL, UKL]
4 v2 63 [30, 50] [MKL, GKL]
# concatenate the monthly Revenue lists for each VIN (they are summed to a total further down)
df_tot = df_group.groupby(['VIN'])['Revenue'].sum().reset_index()
print(df_tot)
VIN Revenue
0 v1 [30, 40, 30, 50, 40]
1 v2 [50, 40, 30, 30, 50]
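The lists in df_tot come out joined end to end rather than added element by element, which is consistent with how Python's `+` behaves on lists. A minimal plain-Python sketch of that behaviour, using the v1 values printed above (the variable names here are illustrative, not part of the answer):

```python
# Monthly Revenue lists for v1, as shown in df_group above
monthly_lists = [[30, 40], [30, 50, 40]]

# Adding lists with + concatenates them
combined = []
for revenues in monthly_lists:
    combined = combined + revenues

# Summing the concatenated list gives the per-VIN total
total = sum(combined)
print(combined, total)
```

This is why the later `.apply(sum)` step is needed to turn each list into a single number.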
# merge df_group with df_tot and rename columns
df_merge = pd.merge(df_group, df_tot, on='VIN').rename(columns={'Revenue_x': 'Total Revenue by Month', 'Revenue_y': 'Total Revenue by VIN'})
# sum lists
df_merge['Total Revenue by Month'] = df_merge['Total Revenue by Month'].apply(sum)
df_merge['Total Revenue by VIN'] = df_merge['Total Revenue by VIN'].apply(sum)
print(df_merge)
VIN Month Total Revenue by Month Category Total Revenue by VIN
0 v1 63 70 [UKL, MKL] 190
1 v1 64 120 [MKL, GKL, GKL] 190
2 v2 61 50 [MKL] 200
3 v2 62 70 [GKL, UKL] 200
4 v2 63 80 [MKL, GKL] 200
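If the Category lists are not needed, the same two totals can probably be obtained more directly by summing Revenue per VIN and Month and then broadcasting the per-VIN total with `transform`. The following is a sketch under that assumption, reusing the data from the `df` above; the names `monthly` and `rows` are introduced here for illustration only.

```python
import pandas as pd

# Same data as the df built above
df = pd.DataFrame({
    'VIN': ['v1', 'v1', 'v1', 'v1', 'v1', 'v2', 'v2', 'v2', 'v2', 'v2'],
    'Revenue': [30, 50, 40, 30, 40, 30, 50, 40, 30, 50],
    'Category': ['MKL', 'GKL', 'GKL', 'UKL', 'MKL', 'MKL', 'GKL', 'GKL', 'UKL', 'MKL'],
    'Month': [64, 64, 64, 63, 63, 63, 63, 62, 62, 61],
})

# One row per (VIN, Month) with the summed Revenue
monthly = (
    df.groupby(['VIN', 'Month'], as_index=False)['Revenue']
      .sum()
      .rename(columns={'Revenue': 'Total Revenue by Month'})
)

# Broadcast each VIN's overall total onto all of its rows
monthly['Total Revenue by VIN'] = (
    monthly.groupby('VIN')['Total Revenue by Month'].transform('sum')
)
print(monthly)

# Variant that keeps every original row and adds both totals as columns
rows = df.copy()
rows['Total Revenue by Month'] = rows.groupby(['VIN', 'Month'])['Revenue'].transform('sum')
rows['Total Revenue by VIN'] = rows.groupby('VIN')['Revenue'].transform('sum')
print(rows)
```

The first form returns one row per VIN and Month, like df_merge without the Category column; the second keeps Category on every original row, which may be closer to what the question asks for.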