Add total rows from groupby in Pandas
Currently my csv looks like this:
title | field1 | field2 | field3 |
---|---|---|---|
A | A1 | A11 | 553 |
A | A1 | A12 | 94 |
A | A1 | A13 | 30 |
A | A2 | A21 | 200 |
A | A3 | A31 | 35 |
But I would like it to look like this:
title | field1 | field2 | field3 |
---|---|---|---|
A | A1 | A11 | 553 |
A | A1 | A12 | 94 |
A | A1 | A13 | 30 |
A | A1 | total | 677 |
A | A2 | A21 | 200 |
A | A2 | total | 200 |
A | A3 | A31 | 35 |
A | A3 | total | 35 |
This is my code:
def fun(df, cols_to_aggregate, cols_order):
    df = df.groupby(['field1', 'field2'], as_index=False)\
           .agg(cols_to_aggregate)
    df['title'] = 'A'
    df = df[cols_order]
    return df

def create_csv(df, month_date):
    cols_to_aggregate = {'field3': 'sum'}
    cols_order = ['title', 'field1', 'field2', 'field3']
    funCSV = fun(df, cols_to_aggregate, cols_order)
    return funCSV
Any help would be appreciated, since I don't know how to add the new rows to the table. I tried this:
total = df.groupby('field2')['field3'].sum()
But it only adds the numbers to the end of the table, rather than integrating them into the table alongside the other related fields.
Use concat, then sort the aggregated DataFrame by both columns:
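import pandas as pd

def fun(df, cols_to_aggregate, cols_order):
    df = df.groupby(['field1', 'field2'], as_index=False)\
           .agg(cols_to_aggregate)
    # One subtotal row per field1 group, labelled 'total' in field2
    total = df.groupby('field1', as_index=False)['field3'].sum().assign(field2='total')
    # Sorting puts each 'total' row after its group, because lowercase
    # 'total' sorts after the uppercase field2 labels
    df = pd.concat([df, total]).sort_values(['field1', 'field2'], ignore_index=True)
    df['title'] = 'A'
    print(df)
    df = df[cols_order]
    return df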
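The sort above works because the string 'total' happens to sort after labels such as 'A11'. As a minimal sketch of the same concat idea that does not rely on that, the version below marks total rows with a temporary flag column and sorts on it. The function name `add_group_totals`, its parameters, and the `_is_total` column are hypothetical names chosen for illustration, and the sample DataFrame simply mirrors the table in the question.

```python
import pandas as pd

# Sample data mirroring the question's table (one row per field1/field2 pair).
df = pd.DataFrame({
    "field1": ["A1", "A1", "A1", "A2", "A3"],
    "field2": ["A11", "A12", "A13", "A21", "A31"],
    "field3": [553, 94, 30, 200, 35],
})

def add_group_totals(df, group_col="field1", label_col="field2", value_col="field3"):
    """Append one 'total' row per group, placed after that group's rows."""
    # One subtotal row per group, labelled 'total' in the label column.
    total = df.groupby(group_col, as_index=False)[value_col].sum()
    total[label_col] = "total"
    # Hypothetical helper column: 0 for detail rows, 1 for total rows,
    # so the ordering does not depend on how 'total' compares to other labels.
    total["_is_total"] = 1
    out = pd.concat([df.assign(_is_total=0), total], ignore_index=True)
    out = out.sort_values([group_col, "_is_total", label_col]).reset_index(drop=True)
    return out.drop(columns="_is_total")

result = add_group_totals(df)
result.insert(0, "title", "A")  # constant title column, as in the question
print(result)
```

With this sample data the A1 subtotal is 553 + 94 + 30 = 677. If the field2 labels could be lowercase or sort after 't', the flag column keeps each total at the end of its group, which the plain two-column sort would not guarantee.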