How to use loc from pandas?
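I have this code to replace Age, which is numeric data, with categorical data. I am trying to do it this way, but it does not work. Can someone help me?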
for df in treino_teste:
    df.loc[df['Age'] <= 13, 'Age'] = 0,
    df.loc[(df['Age'] > 13) & (df['Age'] <= 18), 'Age'] = 1,
    df.loc[(df['Age'] > 18) & (df['Age'] <= 25), 'Age'] = 2,
    df.loc[(df['Age'] > 25) & (df['Age'] <= 35), 'Age'] = 3,
    df.loc[(df['Age'] > 35) & (df['Age'] <= 60), 'Age'] = 4,
    df.loc[df['Age'] > 60, 'Age'] = 5
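Error:
- Continuous data can be cut into bins (categories) with pd.cut.
- For this example I assigned the bins to a new column, AgeB. I could have assigned them back to Age instead.
- I sorted the frame only to make the result easier to read; the sort is not required.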
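import numpy as np
import pandas as pd
df = pd.DataFrame({"Age": np.random.randint(1, 65, 10)}).sort_values(["Age"])
bins = [0, 13, 18, 25, 35, 60, 100]
# one integer label per interval: (0, 13] -> 0, (13, 18] -> 1, ..., (60, 100] -> 5
df.assign(AgeB=pd.cut(df.Age, bins=bins, labels=list(range(len(bins) - 1))))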
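| | Age | AgeB |
|---|---|---|
| 5 | 12 | 0 |
| 3 | 13 | 0 |
| 8 | 18 | 1 |
| 7 | 25 | 2 |
| 9 | 25 | 2 |
| 1 | 27 | 3 |
| 2 | 30 | 3 |
| 4 | 57 | 4 |
| 0 | 59 | 4 |
| 6 | 64 | 5 |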
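The answer notes that the bins could have been assigned back to Age. A minimal sketch of that for the question's loop, assuming `treino_teste` is a list of DataFrames that each have a numeric `Age` column (the two small frames below are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-ins for the question's train/test frames.
treino = pd.DataFrame({"Age": [5, 13, 14, 18, 30, 61]})
teste = pd.DataFrame({"Age": [25, 26, 35, 36, 60, 99]})
treino_teste = [treino, teste]

bins = [0, 13, 18, 25, 35, 60, 100]
labels = list(range(len(bins) - 1))  # 0..5, one label per interval

for df in treino_teste:
    # pd.cut uses right-closed intervals by default: (0, 13], (13, 18], ...
    # Assigning to df["Age"] changes the frame itself, so the result
    # is still visible through treino and teste after the loop.
    df["Age"] = pd.cut(df["Age"], bins=bins, labels=labels).astype(int)

print(treino["Age"].tolist())
print(teste["Age"].tolist())
```

The `.astype(int)` step assumes every age falls inside (0, 100]. Values outside the bins come back as NaN from `pd.cut`, and the integer cast would then fail, so widen the outer edges if the data needs it.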
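You can use numpy.digitize()
import numpy as np
bins = [0, 13, 18, 25, 35, 60, 100]
# default right=False: code i means bins[i-1] <= Age < bins[i], so codes start at 1
df['AgeC'] = np.digitize(df['Age'], bins)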
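With its default arguments, `numpy.digitize` numbers the bins from 1 and closes each interval on the left, so an age of exactly 13 lands in a different bin than under the question's `<= 13` rule. A sketch that lines the codes up with the `pd.cut` labels, assuming right-closed intervals and 0-based codes are what is wanted (the ages are taken from the table in the other answer):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Age": [12, 13, 18, 25, 27, 30, 57, 59, 64]})
bins = [0, 13, 18, 25, 35, 60, 100]

# right=True makes code i cover bins[i-1] < Age <= bins[i];
# subtracting 1 shifts the codes from 1..6 down to 0..5.
df["AgeC"] = np.digitize(df["Age"], bins, right=True) - 1

# Cross-check against pd.cut with the same edges.
age_b = pd.cut(df["Age"], bins=bins, labels=list(range(len(bins) - 1))).astype(int)
print(df["AgeC"].tolist())
print((df["AgeC"] == age_b).all())
```

Unlike `pd.cut`, `np.digitize` does not return NaN for out-of-range values; with this adjustment an age of 0 or below gets code -1 and an age above 100 gets code 6.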