Applying function to Pandas dataframe with groupby ('Too many indexers' error)
I am trying to compute the mean and var along axis=1, using only the first k columns (selected with .iloc[:,:-5]). Naively, I would run it as:
df.groupby('id').agg([lambda x: x.iloc[:,:-5].mean(axis=1), lambda x: x.iloc[:,:-5].var(axis=1)])
but it throws a 'too many indexers' error.
Edit
Some data:
0 1 2 3 4 5 6 7 8 9 Q1 Q2 Q3 Q4 id
0 3.0 3.0 4.0 4.0 3.0 3.0 3.0 3.0 3.0 3.0 12.0 0.83 80.0 1.000 11.0
1 3.0 3.0 4.0 4.0 4.0 3.0 3.0 3.0 3.0 3.0 14.0 1.60 80.0 1.000 11.0
2 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 5.0 13.0 1.40 75.0 1.000 11.0
3 3.0 3.0 4.0 4.0 4.0 3.0 3.0 3.0 3.0 3.0 12.0 0.50 80.0 0.848 11.0
4 3.0 4.0 4.0 4.0 7.0 7.0 5.0 4.0 4.0 2.0 13.0 1.74 70.0 0.883 11.0
13 3.0 3.0 2.0 2.0 2.0 2.0 3.0 2.0 3.0 3.0 12.0 3.67 45.0 1.000 14.0
14 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 13.0 3.67 48.0 0.848 14.0
15 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 12.0 1.67 70.0 0.848 14.0
16 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 NaN 2.0 12.0 3.33 60.0 0.848 14.0
17 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 12.0 3.33 60.0 0.848 14.0
25 4.0 4.0 6.0 5.0 NaN 6.0 4.0 3.0 NaN 4.0 11.0 3.36 85.0 0.796 17.0
26 4.0 5.0 4.0 7.0 6.0 5.0 4.0 6.0 7.0 5.0 8.0 4.76 50.0 0.725 17.0
27 4.0 4.0 3.0 4.0 5.0 4.0 5.0 3.0 3.0 5.0 9.0 3.33 50.0 0.725 17.0
28 3.0 4.0 4.0 3.0 4.0 4.0 NaN 3.0 NaN 3.0 10.0 3.12 75.0 0.725 17.0
29 3.0 3.0 2.0 NaN 2.0 1.0 NaN NaN 1.0 2.0 15.0 3.05 79.0 0.725 17.0
39 3.0 3.0 5.0 4.0 4.0 4.0 4.0 4.0 NaN 5.0 12.0 1.19 80.0 0.883 18.0
40 5.0 4.0 5.0 5.0 5.0 5.0 4.0 5.0 7.0 4.0 9.0 1.83 75.0 0.883 18.0
41 5.0 6.0 4.0 4.0 4.0 4.0 4.0 4.0 7.0 7.0 12.0 1.71 35.0 1.000 18.0
42 5.0 5.0 5.0 5.0 4.0 NaN 4.0 4.0 3.0 2.0 13.0 0.86 85.0 1.000 18.0
43 3.0 3.0 3.0 3.0 3.0 3.0 3.0 5.0 3.0 3.0 11.0 1.36 75.0 1.000 18.0
48 1
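It seems you first need to compute the row-wise statistics (taking the slice once, so that adding m does not shift which columns .iloc[:,:-5] selects for v):
vals = df.iloc[:,:-5]
df['m'] = vals.mean(axis=1)
df['v'] = vals.var(axis=1)
and then aggregate as needed.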
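To make this concrete, here is a minimal sketch on a small made-up frame (three value columns instead of ten; the numbers are illustrative, not the full data from the question). As far as I can tell, the error comes from agg with a list of functions calling each lambda on one column of each group at a time, so x is a Series and a two-axis .iloc[:, :-5] has one indexer too many. The final per-id mean is only an assumption about what "aggregate as needed" might look like.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame:
# value columns first, then Q1..Q4 and id as the last five columns.
df = pd.DataFrame({
    "0": [3.0, 3.0, 2.0, 2.0],
    "1": [3.0, 4.0, 2.0, np.nan],
    "2": [4.0, 5.0, 2.0, 2.0],
    "Q1": [12.0, 14.0, 12.0, 13.0],
    "Q2": [0.83, 1.60, 3.67, 3.67],
    "Q3": [80.0, 80.0, 45.0, 48.0],
    "Q4": [1.0, 1.0, 1.0, 0.848],
    "id": [11.0, 11.0, 14.0, 14.0],
})

# A single column is a Series (one axis), so indexing it with two
# indexers fails the same way the lambdas inside agg do.
error_message = None
try:
    df["0"].iloc[:, :-5]
except Exception as exc:  # pandas raises its IndexingError here
    error_message = str(exc)

# Take the slice once, before adding columns, so both statistics
# use exactly the same value columns.
vals = df.iloc[:, :-5]
df["m"] = vals.mean(axis=1)  # row-wise mean, NaN skipped by default
df["v"] = vals.var(axis=1)   # row-wise sample variance (ddof=1)

# One possible aggregation (an assumption): average the row
# statistics within each id.
per_id = df.groupby("id")[["m", "v"]].mean()
print(per_id)
```

If you want one number per id computed over all values of the group rather than an average of row statistics, a different aggregation would be needed; the sketch only shows the row-wise-then-groupby order.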