How to concatenate a pandas multi-index column DataFrame with a single-index DataFrame

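I have the following code, in which I am trying to perform a groupby aggregation on a pivot table and join the resulting aggregate back onto the pivot table DataFrame. However, I am having trouble joining tables whose columns have a different number of levels.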

import pandas as pd

data = [
        ["alice", "school 1", "math", 95],
        ["alice", "school 1", "science", 87],
        ["charlie", "school 1", "math", 72],
        ["charlie", "school 1", "science", 63],
        ["bob", "school 2", "math", 92],
        ["bob", "school 2", "science", 68],
        ["dale", "school 2", "math", 56],
        ["dale", "school 2", "science", 78],
]

df = pd.DataFrame(data, columns=["student_name", "school", "class", "class score"])

pvt = pd.pivot_table(df, index=["class"], columns=["school", "student_name"])
print(pvt)
print()

aggregate_sum = pvt.groupby(level=1, axis=1).sum()
print(aggregate_sum)

Pivot table output:

             class score
school          school 1         school 2
student_name       alice charlie      bob dale
class
math                  95      72       92   56
science               87      63       68   78

Aggregate output:

school   school 1  school 2
class
math          167       148
science       150       146

How can I join the aggregate output into the pivot table at the same level as the student names?

Expected output:

             class score
school          school 1                   school 2
student_name       alice charlie  sum      bob dale  sum
class
math                  95      72  167      92   56   148
science               87      63  150      68   78   146

Merge the two frames and rename the newly added columns to three-level tuples, then use pd.MultiIndex.from_tuples() to rebuild the multi-level columns of the merged frame.

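# Join the per-school sums onto the pivot table by the shared 'class' index.
# Note: the two frames have a different number of column levels (3 vs. 1).
final = pvt.merge(aggregate_sum, on='class', how='inner')
# Rename the single-level sum columns to three-level tuples ending in 'sum'.
final = final.rename(columns={'school 1':('class score','school 1','sum'), 'school 2':('class score','school 2','sum')})
# Rebuild the columns as a proper MultiIndex from the tuples.
cols = final.columns
index = pd.MultiIndex.from_tuples(cols)
final.columns = index
# Reorder so each school's 'sum' column follows that school's students.
final = (final[[('class score','school 1','alice'),('class score', 'school 1', 'charlie'),
                ('class score','school 1','sum'),('class score', 'school 2','bob'),
                ('class score', 'school 2','dale'),('class score', 'school 2','sum')]])
final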
        class score
           school 1              school 2
              alice charlie  sum      bob dale  sum
class
math             95      72  167       92   56  148
science          87      63  150       68   78  146
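
As an alternative, here is a minimal sketch that avoids merging frames with different numbers of column levels, which recent pandas versions may warn about or reject. It gives the aggregate the same three column levels as the pivot table and then uses pd.concat. The names sums and pieces are illustrative and not part of the original code. The transpose-based groupby stands in for groupby(..., axis=1), which I believe is deprecated in newer pandas releases.

```python
import pandas as pd

data = [
    ["alice", "school 1", "math", 95],
    ["alice", "school 1", "science", 87],
    ["charlie", "school 1", "math", 72],
    ["charlie", "school 1", "science", 63],
    ["bob", "school 2", "math", 92],
    ["bob", "school 2", "science", 68],
    ["dale", "school 2", "math", 56],
    ["dale", "school 2", "science", 78],
]
df = pd.DataFrame(data, columns=["student_name", "school", "class", "class score"])
pvt = pd.pivot_table(df, index=["class"], columns=["school", "student_name"])

# Sum over students within each (value, school) pair.
# Grouping on the transposed frame avoids groupby(axis=1).
sums = pvt.T.groupby(level=[0, 1]).sum().T

# Append a third level, 'sum', so the aggregate has the same
# column depth and level names as the pivot table.
sums.columns = pd.MultiIndex.from_tuples(
    [(*col, "sum") for col in sums.columns],
    names=pvt.columns.names,
)

# Interleave: for each school, that school's student columns
# followed by its 'sum' column.
pieces = []
for school in pvt.columns.get_level_values("school").unique():
    pieces.append(pvt.loc[:, pvt.columns.get_level_values("school") == school])
    pieces.append(sums.loc[:, sums.columns.get_level_values("school") == school])

final = pd.concat(pieces, axis=1)
print(final)
```

Because every piece already carries three column levels, no renaming or manual column list is needed afterwards, and the approach does not depend on knowing the student names in advance.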