Python Pandas pivot with values equal to simple function of specific column

import pandas as pd
olympics = pd.read_csv('olympics.csv')

    Edition  NOC   Medal
0      1896  AUT  Silver
1      1896  FRA    Gold
2      1896  GER    Gold
3      1900  HUN  Bronze
4      1900  GBR    Gold
5      1900  DEN  Bronze
6      1900  USA    Gold
7      1900  FRA  Bronze
8      1900  FRA  Silver
9      1900  USA    Gold
10     1900  FRA  Silver
11     1900  GBR    Gold
12     1900  SUI  Silver
13     1900  ZZX    Gold
14     1904  HUN    Gold
15     1904  USA  Bronze
16     1904  USA    Gold
17     1904  USA  Silver
18     1904  CAN    Gold
19     1904  USA  Silver

I can pivot the data frame to get some aggregated values:

pivot = olympics.pivot_table(index='Edition', columns='NOC', values='Medal', aggfunc='count')

NOC      AUT  CAN  DEN  FRA  GBR  GER  HUN  SUI  USA  ZZX
Edition                                                  
1896     1.0  NaN  NaN  1.0  NaN  1.0  NaN  NaN  NaN  NaN
1900     NaN  NaN  1.0  3.0  2.0  NaN  1.0  1.0  2.0  1.0
1904     NaN  1.0  NaN  NaN  NaN  NaN  1.0  NaN  4.0  NaN

Instead of the total medal count in values=, I would like each cell to hold a tuple (a triple) (#Gold, #Silver, #Bronze), with (0, 0, 0) in place of NaN.

How can I do this concisely and elegantly?

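There is no need to use pivot_table, because a plain pivot works perfectly well with tuples as values. The steps are:
  • use value_counts to count the medals of each type
  • create a MultiIndex of all combinations of edition, country, and medal
  • reindex with fill_value=0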

# Count the medals of each type per (Edition, NOC) pair
counts = olympics.groupby(['Edition', 'NOC']).Medal.value_counts()

mux = pd.MultiIndex.from_product(
    [c.values for c in counts.index.levels], names=counts.index.names)
counts = counts.reindex(mux, fill_value=0).unstack('Medal')
# Order the columns as (Gold, Silver, Bronze), matching the tuple the question asks for
counts = counts[['Gold', 'Silver', 'Bronze']]

# Collapse each row into a tuple, then move NOC (the last index level) into the columns
pd.Series([tuple(l) for l in counts.values.tolist()], counts.index).unstack()
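
For anyone who wants to try this without olympics.csv, here is a minimal self-contained sketch of the same steps. It assumes the 20 sample rows shown in the question, typed in by hand. The function name medal_tuples and its order parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

# The 20 sample rows from the question, entered inline instead of reading olympics.csv.
rows = [
    (1896, "AUT", "Silver"), (1896, "FRA", "Gold"), (1896, "GER", "Gold"),
    (1900, "HUN", "Bronze"), (1900, "GBR", "Gold"), (1900, "DEN", "Bronze"),
    (1900, "USA", "Gold"), (1900, "FRA", "Bronze"), (1900, "FRA", "Silver"),
    (1900, "USA", "Gold"), (1900, "FRA", "Silver"), (1900, "GBR", "Gold"),
    (1900, "SUI", "Silver"), (1900, "ZZX", "Gold"),
    (1904, "HUN", "Gold"), (1904, "USA", "Bronze"), (1904, "USA", "Gold"),
    (1904, "USA", "Silver"), (1904, "CAN", "Gold"), (1904, "USA", "Silver"),
]
olympics = pd.DataFrame(rows, columns=["Edition", "NOC", "Medal"])


def medal_tuples(df, order=("Gold", "Silver", "Bronze")):
    """Hypothetical helper: Edition x NOC table of medal-count tuples."""
    # Count medals of each type per (Edition, NOC) pair.
    counts = df.groupby(["Edition", "NOC"])["Medal"].value_counts()

    # Every (Edition, NOC, Medal) combination, so absent ones can be filled with 0.
    full = pd.MultiIndex.from_product(
        [df["Edition"].unique(), df["NOC"].unique(), list(order)],
        names=["Edition", "NOC", "Medal"],
    )
    wide = counts.reindex(full, fill_value=0).unstack("Medal")[list(order)]

    # One tuple per (Edition, NOC) row, then move NOC into the columns.
    tuples = pd.Series(
        [tuple(row) for row in wide.to_numpy().tolist()], index=wide.index
    )
    return tuples.unstack("NOC")


result = medal_tuples(olympics)
print(result)
```

The reindex comes before the tuples are built because filling missing cells afterwards is awkward: fillna does not accept a tuple as the fill value, so it is easier to make sure every combination already has a count of 0.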