为总收入和月收入的总和创建列。使用分组依据

Create column for sum of total revenue as well monthly revenue. using group by

我正在尝试创建包含每月总计的列以及每个 VIN 的总计列。请帮助获取输出数据框..

Final_Data = pd.DataFrame(Final_Data.groupby(by='VIN'
                        as_index=False, ,'Month')['Dealers_Revenue'].sum())

当前数据帧 Final_Data:

VIN Revenue Category Month
v1 30 MKL 64
v1 50 GKL 64
v1 40 GKL 64
v1 30 UKL 63
v1 40 MKL 63
v2 30 MKL 63
v2 50 GKL 63
v2 40 GKL 62
v2 30 UKL 62
v2 40 MKL 61

数据框中我想要的输出 Final_Data:

# create DataFrame
df = pd.DataFrame({
    'VIN':['v1', 'v1', 'v1', 'v1', 'v1', 'v2', 'v2', 'v2', 'v2', 'v2'],
    'Revenue':[30, 50, 40, 30, 40, 30, 50, 40, 30, 50], 
    'Category':['MKL', 'GKL','GKL', 'UKL', 'MKL', 'MKL', 'GKL', 'GKL', 'UKL', 'MKL'],
    'Month':[64, 64, 64, 63, 63, 63, 63, 62, 62, 61]
})

print(df)

  VIN  Revenue Category  Month
0  v1       30      MKL     64
1  v1       50      GKL     64
2  v1       40      GKL     64
3  v1       30      UKL     63
4  v1       40      MKL     63
5  v2       30      MKL     63
6  v2       50      GKL     63
7  v2       40      GKL     62
8  v2       30      UKL     62
9  v2       50      MKL     61
# revenue sum by VIN and Month 
df_group = df.groupby(['VIN','Month']).agg(list).reset_index()

print(df_group)

  VIN  Month       Revenue         Category
0  v1     63      [30, 40]       [UKL, MKL]
1  v1     64  [30, 50, 40]  [MKL, GKL, GKL]
2  v2     61          [50]            [MKL]
3  v2     62      [40, 30]       [GKL, UKL]
4  v2     63      [30, 50]       [MKL, GKL]

# calculate total revenue by VIN
df_tot = df_group.groupby(['VIN'])['Revenue'].sum().reset_index()

print(df_tot)

VIN               Revenue
0  v1  [30, 40, 30, 50, 40]
1  v2  [50, 40, 30, 30, 50]
# merge df_group with df_tot and rename columns
df_merge = pd.merge(df_group, df_tot, on='VIN').rename(columns={'Revenue_x': 'Total Revenue by Month', 'Revenue_y': 'Total Revenue by VIN'})

# sum lists
df_merge['Total Revenue by Month'] = df_merge['Total Revenue by Month'].apply(sum)
df_merge['Total Revenue by VIN'] = df_merge['Total Revenue by VIN'].apply(sum)

print(df_merge)

  VIN  Month  Total Revenue by Month         Category  Total Revenue by VIN
0  v1     63                      70       [UKL, MKL]                   190
1  v1     64                     120  [MKL, GKL, GKL]                   190
2  v2     61                      50            [MKL]                   200
3  v2     62                      70       [GKL, UKL]                   200
4  v2     63                      80       [MKL, GKL]                   200