Pandas Group By Multiple Columns and Calculate Standard Deviation
I have a pandas DataFrame containing statistics for NBA basketball players across multiple seasons and teams. It looks like this:
Year  Team     Player            PTS/G
2018  Lakers   Lebron James      27.6
2018  Lakers   Kyle Kuzma        10.3
2019  Rockets  James Harden      25.5
2019  Rockets  Russel Westbrook  23.2
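I want to create a new column named 'PTS Dev' that holds the standard deviation of PTS/G for each team and year. I then plan to analyze where each player sits relative to that deviation. Here is my attempt at computing the column: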
final_data['PTS Dev'] = final_data.groupby('Team', 'Year')['PTS/G'].std()
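Pass the grouping columns as a single list, and use transform('std') instead of std() so the result has one value per original row and can be assigned as a column:
final_data['PTS Dev'] = final_data.groupby(['Team', 'Year'])['PTS/G'].transform('std')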
final_data
Out[9]:
   Year     Team            Player  PTS/G    PTS Dev
0  2018   Lakers      Lebron James   27.6  12.232947
1  2018   Lakers        Kyle Kuzma   10.3  12.232947
2  2019  Rockets      James Harden   25.5   1.626346
3  2019  Rockets  Russel Westbrook   23.2   1.626346
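For anyone who wants to reproduce this, here is a minimal, self-contained sketch built from the four sample rows above. It contrasts the aggregated result (one value per group, indexed by Team and Year) with the transform result (one value per original row). The 'PTS Z' column is a hypothetical follow-up that is not part of the original question. It is one possible way to express each player's position relative to the group deviation, and it assumes a z-score is what is wanted.

```python
import pandas as pd

# Sample rows copied from the question.
final_data = pd.DataFrame({
    "Year": [2018, 2018, 2019, 2019],
    "Team": ["Lakers", "Lakers", "Rockets", "Rockets"],
    "Player": ["Lebron James", "Kyle Kuzma", "James Harden", "Russel Westbrook"],
    "PTS/G": [27.6, 10.3, 25.5, 23.2],
})

# Group on both columns by passing them as one list.
grouped = final_data.groupby(["Team", "Year"])["PTS/G"]

# Aggregation: one value per (Team, Year) group, indexed by the group keys,
# so it does not line up with the row index of final_data.
per_group = grouped.std()

# transform: the group statistic is broadcast back onto every original row,
# so the result shares final_data's index and can be assigned directly.
final_data["PTS Dev"] = grouped.transform("std")

# Hypothetical follow-up (an assumption, not from the question):
# how many group standard deviations each player is from the group mean.
final_data["PTS Z"] = (
    final_data["PTS/G"] - grouped.transform("mean")
) / final_data["PTS Dev"]

print(per_group)
print(final_data)
```

Note that pandas computes the sample standard deviation (ddof=1) by default, so a team-season with only one player gets NaN in 'PTS Dev'.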