How to use loc from pandas?

I have this code to convert Age from numeric data to categorical data. I am trying to do it like this, but it doesn't work. Can someone help me?

for df in treino_teste:
    df.loc[df['Age'] <= 13, 'Age'] = 0,
    df.loc[(df['Age'] > 13) & (df['Age'] <= 18), 'Age'] = 1,
    df.loc[(df['Age'] > 18) & (df['Age'] <= 25), 'Age'] = 2,
    df.loc[(df['Age'] > 25) & (df['Age'] <= 35), 'Age'] = 3,
    df.loc[(df['Age'] > 35) & (df['Age'] <= 60), 'Age'] = 4,
    df.loc[df['Age'] > 60, 'Age'] = 5

Error:

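  • You can bin continuous data into categories with pd.cut
  • In this example I assigned the bins to a new column; I could just as well have assigned them back to Age
  • I sorted the values only to make the result easier to read; sorting is not required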
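import numpy as np
import pandas as pd

df = pd.DataFrame({"Age": np.random.randint(1, 65, 10)}).sort_values(["Age"])

bins = [0, 13, 18, 25, 35, 60, 100]
# pd.cut uses right-closed intervals by default: (0, 13], (13, 18], ...
df.assign(AgeB=pd.cut(df.Age, bins=bins, labels=list(range(len(bins) - 1))))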

   Age AgeB
5   12    0
3   13    0
8   18    1
7   25    2
9   25    2
1   27    3
2   30    3
4   57    4
0   59    4
6   64    5
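
To connect this back to the question's loop, here is a minimal sketch that writes the bin codes back into Age for every frame in a list. The two small frames are made-up stand-ins for the contents of `treino_teste`, and the sketch assumes every age is present and falls within (0, 100]; a missing or out-of-range age would produce NaN and make the integer conversion fail.

```python
import pandas as pd

# Hypothetical stand-ins for the frames held in the question's `treino_teste` list.
treino = pd.DataFrame({"Age": [5, 13, 14, 18, 30, 61]})
teste = pd.DataFrame({"Age": [25, 26, 35, 60, 99]})
treino_teste = [treino, teste]

bins = [0, 13, 18, 25, 35, 60, 100]
labels = list(range(len(bins) - 1))  # one integer label per bin: 0..5

for df in treino_teste:
    # pd.cut returns a Categorical; astype(int) converts it to plain integer codes.
    # Column assignment mutates each frame in place, so the list entries are updated.
    df["Age"] = pd.cut(df["Age"], bins=bins, labels=labels).astype(int)

print(treino)
print(teste)
```

Because the intervals are right-closed, an age of exactly 13 lands in category 0 and an age of exactly 18 in category 1, which matches the `<=` comparisons in the question.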

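You can also use numpy.digitize():

import numpy as np

bins = [0, 13, 18, 25, 35, 60, 100]
# right=True makes the bins right-closed like the question's <= tests; digitize counts from 1, so subtract 1.
df['AgeC'] = np.digitize(df['Age'], bins, right=True) - 1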
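
As for the loc loop in the question itself, one likely culprit is the trailing comma at the end of each assignment: in Python, `x = 0,` makes the right-hand side the one-element tuple `(0,)` instead of the integer 0, and pandas then tries to align that tuple with the selected rows. The sketch below is the same loop with the commas removed; the sample frame is a made-up stand-in for `treino_teste`, and whether this was the only problem in the original data is an assumption.

```python
import pandas as pd

# Hypothetical sample frame standing in for the question's `treino_teste` list.
treino_teste = [pd.DataFrame({"Age": [5, 13, 14, 18, 30, 61]})]

for df in treino_teste:
    # No trailing commas: each right-hand side is a plain integer, not a tuple.
    df.loc[df["Age"] <= 13, "Age"] = 0
    df.loc[(df["Age"] > 13) & (df["Age"] <= 18), "Age"] = 1
    df.loc[(df["Age"] > 18) & (df["Age"] <= 25), "Age"] = 2
    df.loc[(df["Age"] > 25) & (df["Age"] <= 35), "Age"] = 3
    df.loc[(df["Age"] > 35) & (df["Age"] <= 60), "Age"] = 4
    df.loc[df["Age"] > 60, "Age"] = 5

print(treino_teste[0])
```

Overwriting Age step by step is safe here only because every code written (0 to 5) is below 13, so a row that has already been recoded cannot match any of the later conditions.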