Mapping binned data in Pandas

df12.head()
            COMPONENT_ID   PSRO       binned   PSRO_SPEED
4080  11S02CY383YH1934794910  7.470  (7.4, 7.65]  (7.4, 7.65]
4722  11S02CY388YH1934786330  7.491  (7.4, 7.65]  (7.4, 7.65]
4708  11S02CY388YH1934782718  7.497  (7.4, 7.65]  (7.4, 7.65]
4726  11S02CY388YH1934786336  7.564  (7.4, 7.65]  (7.4, 7.65]
4707  11S02CY388YH1934782709  7.581  (7.4, 7.65]  (7.4, 7.65]

I want to map the binned data to different values. I tried:

df12['PSRO_SPEED']=df12['PSRO_SPEED'].map({'(7.4,7.65]': 'high_speed'})

But this does not work. It changes every value in df12['PSRO_SPEED'] to NaN.

I think the values are Interval objects, not strings, so a possible solution is:

i = pd.Interval(7.4,7.65, closed='right')
df12['PSRO_SPEED']=df12['PSRO_SPEED'].map({i: 'high_speed'})
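To see why the string key fails while the Interval key matches, here is a minimal sketch. The Series below is a hypothetical stand-in for the PSRO column, rebuilt from the values printed in the question, and it assumes the column was produced by pd.cut with the same bin edges used with cut in this thread.

```python
import pandas as pd

# Hypothetical sample reproducing the PSRO values shown in the question.
psro = pd.Series([7.470, 7.491, 7.497, 7.564, 7.581])
bins = [7.4, 7.65, 7.9, 8.15, 8.4, 8.65]
speed = pd.cut(psro, bins=bins)

# Each element is a pd.Interval, so a plain string key cannot match it.
first = speed.iloc[0]
is_interval = isinstance(first, pd.Interval)

# String key: nothing matches, so every row becomes NaN.
by_string = speed.map({'(7.4,7.65]': 'high_speed'})

# Interval key: matches the bin that these values fall into.
key = pd.Interval(7.4, 7.65, closed='right')
by_interval = speed.map({key: 'high_speed'})
```

Bins that have no entry in the dict are mapped to NaN, so in practice the dict should probably contain one Interval key per bin.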

Or, to keep your approach, convert the column to strings first:

df12['PSRO_SPEED']=df12['PSRO_SPEED'].astype(str).map({'(7.4, 7.65]': 'high_speed'})
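With the string approach, the dict key has to match the interval's printed form character for character, including the space after the comma. A small sketch of that, using hypothetical sample values and the bin edges used with cut in this thread:

```python
import pandas as pd

# Hypothetical sample values taken from the output shown in the question.
psro = pd.Series([7.470, 7.491, 7.497])
speed = pd.cut(psro, bins=[7.4, 7.65, 7.9, 8.15, 8.4, 8.65])

# Converting to str yields the same text that head() prints for each bin.
as_text = speed.astype(str)
labels_seen = as_text.unique().tolist()

# The key with a space after the comma matches the converted text.
mapped = as_text.map({'(7.4, 7.65]': 'high_speed'})

# The key without the space matches nothing, so every row becomes NaN.
unmatched = as_text.map({'(7.4,7.65]': 'high_speed'})
```

Checking labels_seen before writing the dict is one way to avoid guessing at the exact string form.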

But the best option is to pass the labels parameter to cut:

bins = [7.4,7.65,7.9,8.15,8.4,8.65] 
labels = ['lowest','low','medium','great','greatest']
df12['binned'] = pd.cut(df12['PSRO'], bins=bins, labels=labels)
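As a self-contained sketch of the labels approach: the frame below is hypothetical, using three PSRO values from the question plus two invented ones (7.70 and 8.50) so that more than one bin is populated.

```python
import pandas as pd

# Hypothetical frame: three PSRO values from the question,
# plus two invented values (7.70 and 8.50) for illustration.
df12 = pd.DataFrame({'PSRO': [7.470, 7.491, 7.564, 7.70, 8.50]})

bins = [7.4, 7.65, 7.9, 8.15, 8.4, 8.65]                   # 6 edges give 5 bins
labels = ['lowest', 'low', 'medium', 'great', 'greatest']  # one label per bin

# cut assigns the label of the bin each value falls into.
df12['binned'] = pd.cut(df12['PSRO'], bins=bins, labels=labels)
result = df12['binned'].astype(str).tolist()
```

The number of labels must be one less than the number of bin edges. Values outside the outer edges come back as NaN, and because the bins are closed on the right by default, a value of exactly 7.4 is not included in the first bin.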