How to annotate certain data points on a Python scatterplot based on column value

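I'm almost done with my first real Python data science project. However, there is one last thing I can't seem to figure out. I have the following code, which creates a plot for my PCA and K-means clustering algorithm: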

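import matplotlib.pyplot as plt
import seaborn as sns
from adjustText import adjust_text

y_axis = passers_pca_kmeans['Component 1']
x_axis = passers_pca_kmeans['Component 2']

plt.figure(figsize=(10,8))
# Pass x and y as keywords; recent seaborn versions no longer accept them positionally
sns.scatterplot(x=x_axis, y=y_axis, hue=passers_pca_kmeans['Segment'], palette=['g','r','c','m'])
plt.title('Clusters by PCA Components')
plt.grid(zorder=0,alpha=.4)

# Label every point with its name
texts = [plt.text(x0,y0,name,ha='right',va='bottom') for x0,y0,name in zip(
    passers_pca_kmeans['Component 2'], passers_pca_kmeans['Component 1'], passers_pca_kmeans.name)]

# Nudge the labels so they do not overlap
adjust_text(texts)

plt.show()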

You can apply a Boolean mask to the inputs of the text call, for example:

# Boolean mask selecting only the rows to annotate
mask = (passers_pca_kmeans["Subject"] == "first")
x = passers_pca_kmeans["Component 2"][mask]
y = passers_pca_kmeans["Component 1"][mask]
names = passers_pca_kmeans.name[mask]

texts = [plt.text(x0,y0,name,ha='right',va='bottom') for x0,y0,name in zip(x,y,names)]

You could also do it with an unwieldy list comprehension by adding an if condition:


x = passers_pca_kmeans["Component 2"]
y = passers_pca_kmeans["Component 1"]
names = passers_pca_kmeans.name
subjects = passers_pca_kmeans["Subject"]

# Only create a label when the row's subject matches
texts = [plt.text(x0,y0,name,ha='right',va='bottom') for x0,y0,name,subject in zip(x,y,names,subjects) if subject == "first"]

I bet there is an answer using np.where, too.
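
For what it's worth, here is a rough sketch of how the np.where route might look. The small DataFrame `df` is a made-up stand-in for the real data (its values and the "Subject" labels are assumptions for illustration only), and the idea is simply to get the positions of the matching rows and annotate those:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the real DataFrame; all values are made up.
df = pd.DataFrame({
    "Component 1": [0.1, 0.5, -0.3, 0.8],
    "Component 2": [1.0, -0.2, 0.4, 0.0],
    "Subject": ["first", "second", "first", "third"],
    "name": ["A", "B", "C", "D"],
})

# Called with only a condition, np.where returns a tuple of index arrays;
# element [0] holds the row positions where the condition is True.
idx = np.where(df["Subject"].to_numpy() == "first")[0]

fig, ax = plt.subplots(figsize=(10, 8))
ax.scatter(df["Component 2"], df["Component 1"])

# Annotate only the selected rows, looked up by position with .iloc.
texts = [
    ax.text(df["Component 2"].iloc[i], df["Component 1"].iloc[i],
            df["name"].iloc[i], ha="right", va="bottom")
    for i in idx
]
```

Call plt.show() afterwards to display the figure. The resulting `texts` list can be passed to adjust_text in the same way as in the question. In practice the Boolean mask is probably the simpler option; np.where mainly helps if you need the integer positions for something else.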