无法将小计插入 pandas 数据框
Cannot insert subtotals into pandas dataframe
我对 Python 和 Pandas 比较陌生。在 Google 和 Whosebug 的帮助下,我已经能够获得我所追求的大部分内容。然而,这个让我难住了。我有一个看起来像这样的数据框:
SalesPerson 1 SalesPerson 2 SalesPerson 3
Revenue Number of Orders Revenue Number of Orders Revenue Number of Orders
In Process Stage 1 8347 8 9941 5 5105 7
In Process Stage 2 3879 2 3712 3 1350 10
In Process Stage 3 7885 4 6513 8 2218 2
Won Not Invoiced 4369 1 1736 5 4950 9
Won Invoiced 7169 5 5308 3 9832 2
Lost to Competitor 8780 1 3836 7 2851 3
Lost to No Action 2835 5 4653 1 1270 2
我想为 In Process、Won 和 Lost 添加小计行,以便我的数据如下所示:
SalesPerson 1 SalesPerson 2 SalesPerson 3
Revenue Number of Orders Revenue Number of Orders Revenue Number of Orders
In Process Stage 1 8347 8 9941 5 5105 7
In Process Stage 2 3879 2 3712 3 1350 10
In Process Stage 3 7885 4 6513 8 2218 2
In Process Subtotal 20111 14 20166 16 8673 19
Won Not Invoiced 4369 1 1736 5 4950 9
Won Invoiced 7169 5 5308 3 9832 2
Won Subtotal 11538 6 7044 8 14782 11
Won Percent 27% 23% 20% 25% 54% 31%
Lost to Competitor 8780 1 3836 7 2851 3
Lost to No Action 2835 5 4653 1 1270 2
Lost Subtotal 11615 6 8489 8 4121 5
Lost Percent 27% 23% 24% 25% 15% 14%
Total 43264 26 35699 32 27576 35
到目前为止,我的代码如下所示:
def create_win_lose_table(dataframe):
in_process_stagename_list = {'In Process Stage 1', 'In Process Stage 2', 'In Process Stage 3'}
won_stagename_list = {'Won Invoiced', 'Won Not Invoiced'}
lost_stagename_list = {'Lost to Competitor', 'Lost to No Action'}
temp_Pipeline_df = dataframe.copy()
for index, row in temp_Pipeline_df.iterrows():
if index not in in_process_stagename_list:
temp_Pipeline_df.drop([index], inplace = True)
Pipeline_sum = temp_Pipeline_df.sum()
#at the end I was going to concat the sum to the original dataframe, but that's where I'm stuck
我才开始处理进程中的数据框。我的想法是,一旦我弄明白了,我就可以为“赢”和“输”类别复制该过程。欢迎任何想法或方法。
谢谢!
乔恩
为您提供的简单示例。
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(5, 5))
df_total = pd.DataFrame(np.sum(df.iloc[:, :].values, axis=0)).transpose()
df_with_totals = df.append(df_total)
df_with_totals
0 1 2 3 4
0 0.743746 0.668769 0.894739 0.947641 0.753029
1 0.236587 0.862352 0.329624 0.637625 0.288876
2 0.817637 0.250593 0.363517 0.572789 0.785234
3 0.140941 0.221746 0.673470 0.792831 0.170667
4 0.965435 0.836577 0.790037 0.996000 0.229857
0 2.904346 2.840037 3.051388 3.946885 2.227662
您可以使用 Pandas 中的重命名参数来随意调用摘要行。
我对 Python 和 Pandas 比较陌生。在 Google 和 Whosebug 的帮助下,我已经能够获得我所追求的大部分内容。然而,这个让我难住了。我有一个看起来像这样的数据框:
SalesPerson 1 SalesPerson 2 SalesPerson 3
Revenue Number of Orders Revenue Number of Orders Revenue Number of Orders
In Process Stage 1 8347 8 9941 5 5105 7
In Process Stage 2 3879 2 3712 3 1350 10
In Process Stage 3 7885 4 6513 8 2218 2
Won Not Invoiced 4369 1 1736 5 4950 9
Won Invoiced 7169 5 5308 3 9832 2
Lost to Competitor 8780 1 3836 7 2851 3
Lost to No Action 2835 5 4653 1 1270 2
我想为 In Process、Won 和 Lost 添加小计行,以便我的数据如下所示:
SalesPerson 1 SalesPerson 2 SalesPerson 3
Revenue Number of Orders Revenue Number of Orders Revenue Number of Orders
In Process Stage 1 8347 8 9941 5 5105 7
In Process Stage 2 3879 2 3712 3 1350 10
In Process Stage 3 7885 4 6513 8 2218 2
In Process Subtotal 20111 14 20166 16 8673 19
Won Not Invoiced 4369 1 1736 5 4950 9
Won Invoiced 7169 5 5308 3 9832 2
Won Subtotal 11538 6 7044 8 14782 11
Won Percent 27% 23% 20% 25% 54% 31%
Lost to Competitor 8780 1 3836 7 2851 3
Lost to No Action 2835 5 4653 1 1270 2
Lost Subtotal 11615 6 8489 8 4121 5
Lost Percent 27% 23% 24% 25% 15% 14%
Total 43264 26 35699 32 27576 35
到目前为止,我的代码如下所示:
def create_win_lose_table(dataframe):
in_process_stagename_list = {'In Process Stage 1', 'In Process Stage 2', 'In Process Stage 3'}
won_stagename_list = {'Won Invoiced', 'Won Not Invoiced'}
lost_stagename_list = {'Lost to Competitor', 'Lost to No Action'}
temp_Pipeline_df = dataframe.copy()
for index, row in temp_Pipeline_df.iterrows():
if index not in in_process_stagename_list:
temp_Pipeline_df.drop([index], inplace = True)
Pipeline_sum = temp_Pipeline_df.sum()
#at the end I was going to concat the sum to the original dataframe, but that's where I'm stuck
我才开始处理进程中的数据框。我的想法是,一旦我弄明白了,我就可以为“赢”和“输”类别复制该过程。欢迎任何想法或方法。
谢谢! 乔恩
为您提供的简单示例。
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(5, 5))
df_total = pd.DataFrame(np.sum(df.iloc[:, :].values, axis=0)).transpose()
df_with_totals = df.append(df_total)
df_with_totals
0 1 2 3 4
0 0.743746 0.668769 0.894739 0.947641 0.753029
1 0.236587 0.862352 0.329624 0.637625 0.288876
2 0.817637 0.250593 0.363517 0.572789 0.785234
3 0.140941 0.221746 0.673470 0.792831 0.170667
4 0.965435 0.836577 0.790037 0.996000 0.229857
0 2.904346 2.840037 3.051388 3.946885 2.227662
您可以使用 Pandas 中的重命名参数来随意调用摘要行。