Adding subtotal / grand total row in pandas MultiIndex produces tuple as an index

I have a df:

import pandas as pd

df_from_dict = pd.DataFrame.from_dict({'category': {1050: 'Dining',
  992: 'Dining',
  1054: 'Kitchen',
  1052: 'Kitchen',
  993: 'Living room',
  980: 'Living room',
  996: 'Dining',
  1017: 'Dining',
  1020: 'Bath',
  1001: 'Bath'},
 'subcategory': {1050: 'Chairs',
  992: 'Chairs',
  1054: 'Stool',
  1052: 'Mirror',
  993: 'mirror',
  980: 'chair',
  996: 'Chairs',
  1017: 'Chairs',
  1020: 'Table',
  1001: 'Table'},
 'discount': {1050: '30-40',
  992: '30-40',
  1054: '30-40',
  1052: '30-40',
  993: '30-40',
  980: '30-40',
  996: '30-40',
  1017: '30-40',
  1020: '30-40',
  1001: '30-40'},
 'sales_1': {1050: 9539.86,
  992: 12971.86,
  1054: 6736.53,
  1052: 7163.16,
  993: 8601.16,
  980: 8047.16,
  996: 16322.0,
  1017: 14424.32,
  1020: 6319.58,
  1001: 4551.42},
 'sales_2': {1050: 3226.0,
  992: 11117.0,
  1054: 1613.0,
  1052: 2166.0,
  993: 11117.0,
  980: 3442.0,
  996: 19365.0,
  1017: 3323.0,
  1020: 1411.0,
  1001: 572.0}})

I am trying to add subtotals to a MultiIndex. With 2 grouping keys I can do it like this:

dd =  df_from_dict.groupby(['category', 'subcategory'])[['sales_1', 'sales_2']].sum()

s = dd.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total']])
dd = dd.append(s).sort_index()
dd.loc['Grand Total', :] = dd.sum().values / 2

dd

But when I add a 3rd key, discount, to the group:

dd =  df_from_dict.groupby(['category', 'subcategory','discount'])[['sales_1', 'sales_2']].sum()

s = dd.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total']])
dd = dd.append(s).sort_index()
dd.loc['Grand Total', :] = dd.sum().values / 2

dd

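Suddenly I get tuples instead of a normal MultiIndex. I get 1 index holding tuples:

instead of 3 index levels.

I would like the same structure as in the first picture, but with one more index level. I tried using the level=1 parameter in the groupby, but the result always ends up as tuples in a single index, and I am not sure where my mistake is.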

The problem is that s has a 2-level MultiIndex while dd has 3 levels, so append creates tuples.
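The mismatch can be checked by comparing the number of levels on both indexes. This is a minimal sketch that uses small hand-built indexes instead of the full data; the names dd_index and s_index are illustrative and do not appear in the question.

```python
import pandas as pd

# Index shaped like dd after grouping by three keys.
dd_index = pd.MultiIndex.from_tuples(
    [('Bath', 'Table', '30-40'), ('Kitchen', 'Stool', '30-40')],
    names=['category', 'subcategory', 'discount'],
)

# Index shaped like s when from_product receives only two iterables.
s_index = pd.MultiIndex.from_product([['Bath', 'Kitchen'], ['Total']])

print(dd_index.nlevels)  # 3
print(s_index.nlevels)   # 2
```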

The solution is to pass 3 levels to MultiIndex.from_product, so that s has the same number of levels as dd.

To keep DataFrame.sort_index from sorting all the other levels, also add sort_remaining=False:

dd =  df_from_dict.groupby(['category', 'subcategory','discount'])[['sales_1', 'sales_2']].sum()

s = dd.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total'], ['']])
print(s)

dd = dd.append(s).sort_index(level=0, sort_remaining=False)
dd.loc['Grand Total', :] = dd.sum().values / 2
print(dd)
                                   sales_1  sales_2
category    subcategory discount                   
Bath        Table       30-40     10871.00   1983.0
            Total                 10871.00   1983.0
Dining      Chairs      30-40     53258.04  37031.0
            Total                 53258.04  37031.0
Kitchen     Mirror      30-40      7163.16   2166.0
            Stool       30-40      6736.53   1613.0
            Total                 13899.69   3779.0
Living room chair       30-40      8047.16   3442.0
            mirror      30-40      8601.16  11117.0
            Total                 16648.32  14559.0
Grand Total                       94677.05  57352.0
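
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the code above needs an older pandas. Below is a rough sketch of the same idea with pd.concat. It is a variation and not the code from the answer: it uses only a few rows from the question, and it builds the grand total as its own one-row frame (the names grand and out are illustrative), which avoids dividing by 2.

```python
import pandas as pd

# A small subset of the rows from the question, enough to show the idea.
df = pd.DataFrame({
    'category': ['Kitchen', 'Kitchen', 'Bath', 'Bath'],
    'subcategory': ['Stool', 'Mirror', 'Table', 'Table'],
    'discount': ['30-40', '30-40', '30-40', '30-40'],
    'sales_1': [6736.53, 7163.16, 6319.58, 4551.42],
    'sales_2': [1613.0, 2166.0, 1411.0, 572.0],
})

dd = df.groupby(['category', 'subcategory', 'discount'])[['sales_1', 'sales_2']].sum()

# Subtotals per category, padded to the same 3 levels as dd.
s = dd.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product(
    [s.index, ['Total'], ['']], names=dd.index.names
)

# Grand total computed from dd itself (before subtotals are added).
grand = dd.sum().to_frame().T
grand.index = pd.MultiIndex.from_tuples(
    [('Grand Total', '', '')], names=dd.index.names
)

# pd.concat stands in for the removed DataFrame.append.
out = pd.concat([dd, s]).sort_index(level=0, sort_remaining=False)
out = pd.concat([out, grand])
print(out)
```

Because every piece has 3 index levels before concatenation, the result keeps a regular 3-level MultiIndex.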