Group data by ranges in pandas
I have a df like this:
Value
1
2
3
4
5
4
5
5
6
6
7
7
8
8
9
9
Now I want to split this df into 5 categories, i.e. by score range:
0-2: Very Low
2-4: Low
4-6: Medium
6-8: High
8-10: Very High
So the resulting df should be:
Value Band
1 Very Low
2 Low
3 Low
4 Medium
5 Medium
4 Medium
5 Medium
5 Medium
6 High
6 High
7 High
7 High
8 Very High
8 Very High
9 Very High
9 Very High
I know I can use groupby in pandas to group the values in a column, but how do I group them into the 5 categories shown above?
You can use pd.cut, for example:
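import pandas as pd
df = pd.DataFrame({"Value": [1, 2, 3, 4, 5, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]})
labels = ["Very Low", "Low", "Medium", "High", "Very High"]
# Passing an integer asks pd.cut for that many equal-width bins over the data's range
df["Band"] = pd.cut(df["Value"], len(labels), labels=labels)
print(df)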
Prints:
Value Band
0 1 Very Low
1 2 Very Low
2 3 Low
3 4 Low
4 5 Medium
5 4 Low
6 5 Medium
7 5 Medium
8 6 High
9 6 High
10 7 High
11 7 High
12 8 Very High
13 8 Very High
14 9 Very High
15 9 Very High
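The output above puts 2 in "Very Low" and 4 in "Low", which differs from the ranges in the question. That is because an integer `bins` argument makes pd.cut split the observed range (1 to 9 here) into equal-width intervals rather than using 0, 2, 4, 6, 8, 10. A minimal sketch to inspect the edges it chose, assuming the same data as in the question and using the `retbins=True` option:

```python
import pandas as pd

df = pd.DataFrame({"Value": [1, 2, 3, 4, 5, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]})
labels = ["Very Low", "Low", "Medium", "High", "Very High"]

# retbins=True returns the computed bin edges alongside the categories
bands, edges = pd.cut(df["Value"], bins=len(labels), labels=labels, retbins=True)

# Six edges for five bins; the interior edges are about 2.6, 4.2, 5.8 and 7.4
print(edges)
```

The lowest edge is nudged slightly below the minimum so that the value 1 is included in the first bin.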
Note: if the labels come out wrong, you can define your own bins (for example, as a list):
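import pandas as pd
df = pd.DataFrame(list(range(10)), columns=['value'])
# Explicit edges; each bin is closed on the right by default: (-1, 2], (2, 4], ..., (8, 10]
df['Band'] = pd.cut(df['value'], bins=[-1, 2, 4, 6, 8, 10], labels=['Very Low', 'low', 'Medium', 'High', 'Very High'])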
Result:
value Band
0 0 Very Low
1 1 Very Low
2 2 Very Low
3 3 low
4 4 low
5 5 Medium
6 6 Medium
7 7 High
8 8 High
9 9 Very High
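The bins above are closed on the right, so 2 lands in "Very Low" and 4 in "low". The expected output in the question instead puts 2 in "Low" and 4 in "Medium", which suggests left-closed ranges. A sketch under that assumption, using the edges 0, 2, 4, 6, 8, 10 from the question with `right=False`, followed by a groupby count per band:

```python
import pandas as pd

df = pd.DataFrame({"Value": [1, 2, 3, 4, 5, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]})
labels = ["Very Low", "Low", "Medium", "High", "Very High"]

# right=False makes each bin closed on the left: [0, 2), [2, 4), ..., [8, 10)
df["Band"] = pd.cut(df["Value"], bins=[0, 2, 4, 6, 8, 10], labels=labels, right=False)

# Count rows per band; observed=False keeps bands that have no rows
counts = df.groupby("Band", observed=False)["Value"].count()

print(df)
print(counts)
```

With `right=False` a value of exactly 10 would fall outside the last bin and get NaN, so widen the top edge if 10 can occur in the data.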