Adding subtotal / grand total rows to a pandas MultiIndex produces tuples as the index
I have a df:
import pandas as pd

df = pd.DataFrame.from_dict({'category': {1050: 'Dining',
992: 'Dining',
1054: 'Kitchen',
1052: 'Kitchen',
993: 'Living room',
980: 'Living room',
996: 'Dining',
1017: 'Dining',
1020: 'Bath',
1001: 'Bath'},
'subcategory': {1050: 'Chairs',
992: 'Chairs',
1054: 'Stool',
1052: 'Mirror',
993: 'mirror',
980: 'chair',
996: 'Chairs',
1017: 'Chairs',
1020: 'Table',
1001: 'Table'},
'discount': {1050: '30-40',
992: '30-40',
1054: '30-40',
1052: '30-40',
993: '30-40',
980: '30-40',
996: '30-40',
1017: '30-40',
1020: '30-40',
1001: '30-40'},
'sales_1': {1050: 9539.86,
992: 12971.86,
1054: 6736.53,
1052: 7163.16,
993: 8601.16,
980: 8047.16,
996: 16322.0,
1017: 14424.32,
1020: 6319.58,
1001: 4551.42},
'sales_2': {1050: 3226.0,
992: 11117.0,
1054: 1613.0,
1052: 2166.0,
993: 11117.0,
980: 3442.0,
996: 19365.0,
1017: 3323.0,
1020: 1411.0,
1001: 572.0}})
I am trying to add subtotals to a MultiIndex.
With 2 grouping keys I can do it like this:
dd = df.groupby(['category', 'subcategory'])[['sales_1', 'sales_2']].sum()
s = dd.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total']])
dd = dd.append(s).sort_index()
dd.loc['Grand Total', :] = dd.sum().values / 2
dd
But when I add a third key, discount, to the groupby:
dd = df.groupby(['category', 'subcategory', 'discount'])[['sales_1', 'sales_2']].sum()
s = dd.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total']])
dd = dd.append(s).sort_index()
dd.loc['Grand Total', :] = dd.sum().values / 2
dd
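Suddenly I get tuples instead of a normal MultiIndex: a single index level whose values are tuples, instead of 3 index levels.
I want the same structure as in the first (two-key) result, but with one more index level. I tried using level=1 in the groupby, but the result always ends up as tuples in a single index, and I am not sure where my mistake is.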
The problem is that s has a 2-level MultiIndex while dd has a 3-level one, so appending s to dd produces an index of tuples.
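The level mismatch can be reproduced on the indexes alone. The snippet below is a minimal sketch, not part of the original answer: the names three, two, padded, mixed and matched are made up for illustration, and the fallback to a flat index reflects the behaviour described in the question rather than a documented guarantee.

```python
import pandas as pd

# A 3-level index like dd's, and a 2-level one like the subtotal frame s
three = pd.MultiIndex.from_tuples([("Bath", "Table", "30-40")])
two = pd.MultiIndex.from_product([["Bath"], ["Total"]])
# The same subtotal key padded with an empty string to 3 levels
padded = pd.MultiIndex.from_product([["Bath"], ["Total"], [""]])

mixed = three.append(two)        # level counts differ
matched = three.append(padded)   # level counts agree

print(type(mixed).__name__, mixed.nlevels)
print(type(matched).__name__, matched.nlevels)
```

Only the padded variant keeps all three levels, which is why the subtotal index has to be given the same number of levels as dd.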
The solution is to pass 3 levels to MultiIndex.from_product, so that s has the same number of levels as dd; after that your approach works fine.
To avoid sorting all the other levels in DataFrame.sort_index, add sort_remaining=False:
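dd = df.groupby(['category', 'subcategory', 'discount'])[['sales_1', 'sales_2']].sum()
s = dd.groupby(level=0).sum()
# Pad the subtotal index to 3 levels (and reuse dd's level names) so it matches dd
s.index = pd.MultiIndex.from_product([s.index, ['Total'], ['']], names=dd.index.names)
print(s)
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
dd = pd.concat([dd, s]).sort_index(level=0, sort_remaining=False)
# The subtotal rows double every value, so halve the column sums for the grand total
dd.loc['Grand Total', :] = dd.sum().values / 2
print(dd)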
                                  sales_1  sales_2
category    subcategory discount
Bath        Table       30-40    10871.00   1983.0
            Total                10871.00   1983.0
Dining      Chairs      30-40    53258.04  37031.0
            Total                53258.04  37031.0
Kitchen     Mirror      30-40     7163.16   2166.0
            Stool       30-40     6736.53   1613.0
            Total                13899.69   3779.0
Living room chair       30-40     8047.16   3442.0
            mirror      30-40     8601.16  11117.0
            Total                16648.32  14559.0
Grand Total                      94677.05  57352.0
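If the number of grouping keys varies, the padding can be derived from the key list instead of being hard-coded. The following is only a sketch under a few assumptions: add_totals, label and grand_label are hypothetical names, at least two keys are expected, and the grand total is computed from the detail rows directly rather than by halving the sum as above. The sample frame is a four-row subset of the data in the question.

```python
import pandas as pd

def add_totals(df, keys, values, label="Total", grand_label="Grand Total"):
    """Hypothetical helper: sums by `keys`, then adds one subtotal row per
    first-level key and a final grand total row. Expects len(keys) >= 2."""
    detail = df.groupby(keys)[values].sum()
    names = detail.index.names
    pad = [""] * (len(keys) - 2)  # empty strings for the levels below the label

    # Subtotals per first-level key, re-indexed to the same depth as `detail`
    sub = detail.groupby(level=0).sum()
    sub.index = pd.MultiIndex.from_tuples(
        [(first, label, *pad) for first in sub.index], names=names
    )
    out = pd.concat([detail, sub]).sort_index(level=0, sort_remaining=False)

    # Grand total from the detail rows only, so nothing is counted twice
    grand = detail.sum().to_frame().T
    grand.index = pd.MultiIndex.from_tuples([(grand_label, "", *pad)], names=names)
    return pd.concat([out, grand])

sample = pd.DataFrame({
    "category": ["Kitchen", "Kitchen", "Bath", "Bath"],
    "subcategory": ["Stool", "Mirror", "Table", "Table"],
    "discount": ["30-40"] * 4,
    "sales_1": [6736.53, 7163.16, 6319.58, 4551.42],
    "sales_2": [1613.0, 2166.0, 1411.0, 572.0],
})
result = add_totals(sample, ["category", "subcategory", "discount"],
                    ["sales_1", "sales_2"])
print(result)
```

Using pd.concat for the grand total row as well avoids relying on .loc to enlarge a MultiIndex, at the cost of one more intermediate frame.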