Group data by ranges in pandas

I have a df that looks like this:

Value
1
2
3
4
5
4
5
5
6
6
7
7
8
8
9
9

Now I want to split this df into 5 categories by score range:

0-2: Very Low
2-4: Low
4-6: Medium
6-8: High
8-10: Very High

So the resulting df should be:

Value   Band
1       Very Low
2       Low
3       Low
4       Medium
5       Medium
4       Medium
5       Medium
5       Medium
6       High
6       High
7       High
7       High
8       Very High
8       Very High
9       Very High
9       Very High

I know I can use groupby in pandas to group the values in a column, but how do I group them and assign them to the 5 categories shown above?

You can use pd.cut, for example:

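import pandas as pd

labels = ["Very Low", "Low", "Medium", "High", "Very High"]

# df is the DataFrame from the question.
# An integer `bins` splits the observed range of df["Value"] into that many equal-width bins.
df["Band"] = pd.cut(df["Value"], bins=len(labels), labels=labels)
print(df)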

Prints:

    Value       Band
0       1   Very Low
1       2   Very Low
2       3        Low
3       4        Low
4       5     Medium
5       4        Low
6       5     Medium
7       5     Medium
8       6       High
9       6       High
10      7       High
11      7       High
12      8  Very High
13      8  Very High
14      9  Very High
15      9  Very High
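
To see why 2 and 4 land one band lower here than in the question's expected output, it may help to look at the edges pd.cut chose. The following is a minimal sketch, assuming the Value column from the question; retbins=True asks pd.cut to return the computed edges alongside the result.

```python
import pandas as pd

# Value column copied from the question
df = pd.DataFrame({"Value": [1, 2, 3, 4, 5, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]})
labels = ["Very Low", "Low", "Medium", "High", "Very High"]

# retbins=True also returns the bin edges that pd.cut derived from the data
bands, edges = pd.cut(df["Value"], bins=len(labels), labels=labels, retbins=True)

# The edges are equally spaced between the data's min (1) and max (9),
# with the lowest edge nudged slightly below the minimum so that 1 is included.
print(edges)
```

The edges come from the data rather than from the 0-2, 2-4, ... ranges in the question, so a different set of values would produce different bands.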

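Note: with an integer number of bins, pd.cut derives equal-width bins from the minimum and maximum of the data, so the edges may not match the ranges you want. If the labels come out wrong, you can define your own bin edges (for example, as a list):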

import pandas as pd

df = pd.DataFrame(list(range(10)), columns=['value'])
# Explicit edges; by default each bin is right-closed: (-1, 2], (2, 4], (4, 6], (6, 8], (8, 10]
df['Band'] = pd.cut(df['value'], bins=[-1, 2, 4, 6, 8, 10], labels=['Very Low', 'Low', 'Medium', 'High', 'Very High'])
print(df)

Result:

   value       Band
0      0   Very Low
1      1   Very Low
2      2   Very Low
3      3        Low
4      4        Low
5      5     Medium
6      6     Medium
7      7       High
8      8       High
9      9  Very High
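
The expected output in the question puts the boundary values 2, 4, 6 and 8 in the higher band, which suggests left-closed intervals. The following sketch assumes that reading: it uses the edges 0, 2, 4, 6, 8, 10 with right=False, then groups by the new column, since the question mentions groupby. The per-band count is only an illustration of what can be done once the Band column exists.

```python
import pandas as pd

# Value column copied from the question
df = pd.DataFrame({"Value": [1, 2, 3, 4, 5, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]})
labels = ["Very Low", "Low", "Medium", "High", "Very High"]

# right=False makes each bin include its left edge and exclude its right edge:
# [0, 2), [2, 4), [4, 6), [6, 8), [8, 10)
df["Band"] = pd.cut(df["Value"], bins=[0, 2, 4, 6, 8, 10], labels=labels, right=False)

# Band is a categorical column, so it can be used directly as a groupby key
counts = df.groupby("Band", observed=False)["Value"].count()

print(df)
print(counts)
```

With right=False the last bin excludes its upper edge, so a value of exactly 10 would fall outside every bin and come back as NaN; widen the last edge if 10 can occur in your data.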