pandas permute values within range to create bins

Question

I have a dataframe that looks like this:

stu_id,Mat_score,sci_score,Eng_score
1,1,1,1
2,1,5,1
3,5,5,5
4,1,2,5
5,4,5,5
6,3,3,3
7,1,1,3
8,3,3,1
9,1,1,5
10,3,4,3

df1 = pd.read_clipboard(sep=',')

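I would like to create a new column called study_group based on the following conditions:

If a student's scores are 5,5,5 or 4,4,4 or 4,5,5 or 4,5,4 or 5,4,4, etc., assign them to the study_group Champion.

If a student's scores are 1,1,1 or 1,2,1 or 2,2,1 or 1,1,2 or 2,2,2 or 2,1,1, etc., assign them to Lost.

If a student's scores are 3,3,3 or 3,2,3 or 3,4,3 or 3,5,3, etc., assign them to moderate to good.

If a student's scores are 3,1,1 or 1,3,3 or 3,1,3 or 1,1,3, etc., assign them to Poor to moderate.

Any scores that do not fall into one of the ranges above should be assigned to Other.

So I am trying the approach below: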

study_group = []
for _, rec in df1.iterrows():
    m = rec['Mat_score']
    s = rec['sci_score']
    e = rec['Eng_score']
    # Branches are checked top to bottom; the first match wins.
    if m in (4, 5) and s in (4, 5) and e in (4, 5):
        study_group.append({rec['stu_id']: 'Champion'})
    elif m in (1, 2) and s in (1, 2) and e in (1, 2):
        study_group.append({rec['stu_id']: 'Lost'})
    # (3,) is a one-element tuple; a bare (3) is the int 3 and `in` would raise TypeError
    elif m in (3,) and s in (2, 3, 4, 5) and e in (3,):
        study_group.append({rec['stu_id']: 'moderate to good'})
    elif m in (1, 3) and s in (1, 3) and e in (1, 3):
        study_group.append({rec['stu_id']: 'Poor to moderate'})
    else:
        study_group.append({rec['stu_id']: 'Other'})

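But I am not sure whether the code above is elegant and efficient. I have to write an if-else branch for many different combinations.

Is there a more efficient and elegant way to do this?

I would like my output to look like the following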

Answer 1

You can try np.select:

import numpy as np

# df is the dataframe from the question (named df1 there)
score = df.filter(like='score')

df['study_group'] = np.select(
    [score.isin([4,5]).all(axis=1),
     score.isin([1,2]).all(axis=1),
     (score[['Mat_score', 'Eng_score']].eq(3).all(axis=1) & score['sci_score'].ne(1)),
     score.isin([1,3]).all(axis=1)],
    ['Champion',
     'Lost',
     'moderate to good',
     'Poor to moderate'],
    default='Other'
)

print(df)

   stu_id  Mat_score  sci_score  Eng_score       study_group
0       1          1          1          1              Lost
1       2          1          5          1             Other
2       3          5          5          5          Champion
3       4          1          2          5             Other
4       5          4          5          5          Champion
5       6          3          3          3  moderate to good
6       7          1          1          3  Poor to moderate
7       8          3          3          1  Poor to moderate
8       9          1          1          5             Other
9      10          3          4          3  moderate to good
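
For readers who want to run this without the clipboard, here is a minimal self-contained sketch of the same np.select approach. The sample data is retyped from the question as a DataFrame literal (an assumption standing in for pd.read_clipboard), and the names conditions, labels and both are only illustrative. It also shows why the order of the conditions matters: np.select takes the first condition that is True for each row.

```python
import numpy as np
import pandas as pd

# Sample data retyped from the question (stands in for pd.read_clipboard).
df = pd.DataFrame({
    "stu_id": range(1, 11),
    "Mat_score": [1, 1, 5, 1, 4, 3, 1, 3, 1, 3],
    "sci_score": [1, 5, 5, 2, 5, 3, 1, 3, 1, 4],
    "Eng_score": [1, 1, 5, 5, 5, 3, 3, 1, 5, 3],
})

# Keep only the three score columns.
score = df.filter(like="score")

# One boolean Series per group, listed in priority order.
conditions = [
    score.isin([4, 5]).all(axis=1),
    score.isin([1, 2]).all(axis=1),
    score[["Mat_score", "Eng_score"]].eq(3).all(axis=1) & score["sci_score"].ne(1),
    score.isin([1, 3]).all(axis=1),
]
labels = ["Champion", "Lost", "moderate to good", "Poor to moderate"]

# For each row, np.select picks the label of the first True condition.
df["study_group"] = np.select(conditions, labels, default="Other")

# Rows that satisfy both the third and fourth conditions (e.g. 3,3,3)
# get 'moderate to good' only because that condition is listed first.
both = conditions[2] & conditions[3]
print(df.loc[both, "stu_id"].tolist())  # → [6]
print(df)
```

If the third and fourth conditions were swapped, student 6 (scores 3,3,3) would be labelled Poor to moderate instead, so the list order plays the same role as the if/elif order in the question.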
