How to pivot a pandas DataFrame (with two header rows)

I have a pandas DataFrame that looks like this:

Age         USA   USA     USA    UK    UK      UK
Age         male  female  total  male  female  total
2-year-old  2     3       5      1     1       2
3-year-old  8     8       16     7     9       16

So I effectively have two header rows (USA + male; USA + female; ...).

CSV-File (test.csv):

;USA;USA;USA;UK;UK;UK
Age;male;female;total;male;female;total
2-year-old;2;3;5;1;1;2
3-year-old;8;8;16;7;9;16

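My Python code:

import pandas as pd

# Read both header rows as a two-level column MultiIndex
df = pd.read_csv('test.csv',
                 delimiter=";",
                 header=[0,1])

# The empty first header cell is read as 'Unnamed: 0_level_0'; rename it to 'Age'
df = df.rename(columns={'Unnamed: 0_level_0': 'Age'})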

How can I pivot the DataFrame to get the following result?

Age         Country  Gender  frequency
2-year-old  USA      male    2
2-year-old  USA      female  3
2-year-old  UK       male    1
2-year-old  UK       female  1
3-year-old  USA      male    8
3-year-old  USA      female  8
3-year-old  UK       male    7
3-year-old  UK       female  9

Edit:

Starting table:

Code  Country  Procedure  male          male            female        female
Code  Country  Procedure  two-year-old  three-year-old  two-year-old  three-year-old
1a    US       proc_1     4             6               3             6
1a    UK       proc_1     2             3               5             1
1b    US       proc_2     15            3               5             2
1b    UK       proc_2     8             4               7             3

CSV:

Code;Country;Procedure;male;male;female;female
Code;Country;Procedure;two-year-old;three-year-old;two-year-old;three-year-old
1a;US;proc_1;4;6;3;6
1a;UK;proc_1;2;3;5;1
1b;US;proc_2;15;3;5;2
1b;UK;proc_2;8;4;7;3

Resulting table:

Code  Country  Procedure  Gender  Age             Frequency
1a    US       proc_1     male    two-year-old    4
1a    US       proc_1     male    three-year-old  6
1a    US       proc_1     female  two-year-old    3
1a    US       proc_1     female  three-year-old  6
1a    UK       proc_1     male    two-year-old    2
1a    UK       proc_1     male    three-year-old  3
1a    UK       proc_1     female  two-year-old    5
1a    UK       proc_1     female  three-year-old  1
1b ...

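Use DataFrame.set_index with DataFrame.stack. If you need to remove the total columns, add DataFrame.drop first:

df = (df.drop('total', axis=1, level=1)           # drop 'total' from the second header level
        .set_index(df.columns[0])                 # move the Age column into the index
        .stack([0,1])                             # stack both header levels into rows
        .rename_axis(['Age','Country','Gender'])  # name the three index levels
        .reset_index(name='frequency'))           # back to columns; values become 'frequency'
print(df)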
          Age Country  Gender  frequency
0  2-year-old      UK  female          1
1  2-year-old      UK    male          1
2  2-year-old     USA  female          3
3  2-year-old     USA    male          2
4  3-year-old      UK  female          9
5  3-year-old      UK    male          7
6  3-year-old     USA  female          8
7  3-year-old     USA    male          8
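
For anyone who wants to run this without creating test.csv, here is a minimal self-contained sketch of the same chain. The CSV text is copied from the question; the names CSV_TEXT, wide and long are my own and purely illustrative. Row order after stack may differ between pandas versions (older versions sort the stacked labels, and recent 2.x versions may emit a FutureWarning about the stack implementation), so sort explicitly if the order matters.

```python
import io

import pandas as pd

# The question's test.csv, inlined so the example runs without a file.
CSV_TEXT = """;USA;USA;USA;UK;UK;UK
Age;male;female;total;male;female;total
2-year-old;2;3;5;1;1;2
3-year-old;8;8;16;7;9;16
"""

# Two header rows become a two-level column MultiIndex.
wide = pd.read_csv(io.StringIO(CSV_TEXT), delimiter=";", header=[0, 1])
# The empty top-left header cell is read as 'Unnamed: 0_level_0'.
wide = wide.rename(columns={"Unnamed: 0_level_0": "Age"})

long = (
    wide.drop("total", axis=1, level=1)         # remove the total columns
        .set_index(wide.columns[0])             # ('Age', 'Age') becomes the index
        .stack([0, 1])                          # country and gender move into rows
        .rename_axis(["Age", "Country", "Gender"])
        .reset_index(name="frequency")
)
print(long)
```

The result has one row per Age/Country/Gender combination, so 2 ages x 2 countries x 2 genders gives 8 rows.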

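Or, to keep the totals, omit the drop:

df = (df.set_index(df.columns[0])                 # move the Age column into the index
        .stack([0,1])                             # stack both header levels into rows
        .rename_axis(['Age','Country','Gender'])  # name the three index levels
        .reset_index(name='frequency'))           # back to columns; values become 'frequency'
print(df)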
           Age Country  Gender  frequency
0   2-year-old      UK  female          1
1   2-year-old      UK    male          1
2   2-year-old      UK   total          2
3   2-year-old     USA  female          3
4   2-year-old     USA    male          2
5   2-year-old     USA   total          5
6   3-year-old      UK  female          9
7   3-year-old      UK    male          7
8   3-year-old      UK   total         16
9   3-year-old     USA  female          8
10  3-year-old     USA    male          8
11  3-year-old     USA   total         16
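
The same idea should carry over to the table from the question's edit, where three identifier columns (Code, Country, Procedure) precede the two-level headers. The following is a sketch under that assumption, not a tested production recipe: the CSV text is copied from the question, the names CSV_TEXT, wide and long are illustrative, and the only change is that the first three columns are passed to set_index as a list instead of a single column.

```python
import io

import pandas as pd

# CSV from the question's edit, inlined so the example runs without a file.
CSV_TEXT = """Code;Country;Procedure;male;male;female;female
Code;Country;Procedure;two-year-old;three-year-old;two-year-old;three-year-old
1a;US;proc_1;4;6;3;6
1a;UK;proc_1;2;3;5;1
1b;US;proc_2;15;3;5;2
1b;UK;proc_2;8;4;7;3
"""

wide = pd.read_csv(io.StringIO(CSV_TEXT), delimiter=";", header=[0, 1])

# The first three columns are identifiers; their labels are tuples such as
# ('Code', 'Code'), so pass them to set_index as a list of column labels.
id_cols = list(wide.columns[:3])

long = (
    wide.set_index(id_cols)
        .stack([0, 1])                          # gender and age move into rows
        .rename_axis(["Code", "Country", "Procedure", "Gender", "Age"])
        .reset_index(name="Frequency")
)
print(long)
```

With 4 input rows and 4 value columns this yields 16 rows. As before, the row order depends on the pandas version, so add sort_values on the key columns if you need a specific order.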