Cannot plot K-Means clusters for one-dimensional data

I am trying to apply the K-Means algorithm to my binary classification task, but I cannot produce a scatter plot of the two resulting clusters.

My dataset simply has the following form:

# size, class
  312,  1
  319,  1
  227,  0

Minimal example:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.cluster         import KMeans

X = {'size': [312,319,227,301,273,311,277,291,303,381], 'class': [1,1,0,1,0,1,0,0,1,1]}
X = pd.DataFrame(data=X)
X_train, X_test, y_train, y_test = train_test_split(X['size'], X['class'], test_size=0.4)
X_train = X_train.values.reshape(-1,1)
X_test  = X_test.values.reshape(-1,1)

kmeans = KMeans(init="k-means++", n_clusters=2, n_init=10, max_iter=300, random_state=42)

kmeans.fit(X_train)
preds = kmeans.predict(X_test)

How can I draw a scatter plot that shows the two clusters, with the samples in "X_test" colored (0 or 1) according to the predictions "preds"?

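Because you have only one feature, all of your data lies on a single line. You can place every point at y = 0 and color it by its predicted cluster, like this: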

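import matplotlib.pyplot as plt

# One color per cluster index (0 -> blue, 1 -> red)
color = ["blue", "red"]
# Only one feature exists, so every point is drawn at y = 0
plt.scatter(X_test.flatten(), [0]*len(X_test), c=[color[p] for p in preds])
plt.show()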
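
For reference, the one-dimensional case can be sketched end to end as below. This is only an illustration under a few assumptions: the train/test split is fixed by hand (instead of using train_test_split) so that the result is reproducible, and the centroid markers and axis styling are additions that are not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Same sizes as in the question; split by hand (an assumption) for reproducibility.
X_train = np.array([312, 319, 227, 301, 273, 311]).reshape(-1, 1)
X_test = np.array([277, 291, 303, 381]).reshape(-1, 1)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_train)
preds = kmeans.predict(X_test)

color = ["blue", "red"]
fig, ax = plt.subplots(figsize=(6, 1.5))

# Only one feature exists, so every test sample is drawn at y = 0.
ax.scatter(X_test.ravel(), np.zeros(len(X_test)), c=[color[p] for p in preds])

# Mark the fitted centroids with an "x", colored by their own cluster index.
centers = kmeans.cluster_centers_.ravel()
ax.scatter(centers, np.zeros(len(centers)), c=color, marker="x", s=100)

ax.set_yticks([])      # the y axis carries no information here
ax.set_xlabel("size")
```

Call plt.show() (or fig.savefig(...)) afterwards to display the figure. Note that the cluster indices 0 and 1 are arbitrary and need not match the "class" labels.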

If you want two features, you can modify your data:

X = {
    'size_1': [312,319,227,301,273,311,277,291,303,381],
    'size_2': [152,165,301,145,310,145,315,156,160,165],
    'class': [1,1,0,1,0,1,0,0,1,1],
}
X = pd.DataFrame(data=X)
X_train, X_test, y_train, y_test = train_test_split(X[['size_1', 'size_2']], X['class'], test_size=0.4)

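Then refit KMeans on the new X_train, recompute preds on the new X_test, and adjust the scatter plot so that each feature gets its own axis: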

plt.scatter(X_test.iloc[:, 0], X_test.iloc[:, 1], c=[color[p] for p in preds])
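
Putting the two-feature variant together, a minimal sketch could look like the following. It assumes the model is refit on the two-column training data before predicting; the random_state passed to train_test_split, the centroid markers, and the axis labels are illustrative additions that do not appear in the original answer.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "size_1": [312, 319, 227, 301, 273, 311, 277, 291, 303, 381],
    "size_2": [152, 165, 301, 145, 310, 145, 315, 156, 160, 165],
    "class": [1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
})

# random_state=0 is an assumption, added only to make the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    df[["size_1", "size_2"]], df["class"], test_size=0.4, random_state=0
)

# Refit on the two-feature training data, then predict the test samples.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_train)
preds = kmeans.predict(X_test)

color = ["blue", "red"]
fig, ax = plt.subplots()

# One axis per feature; each point is colored by its predicted cluster.
ax.scatter(X_test["size_1"], X_test["size_2"], c=[color[p] for p in preds])

# Mark the fitted centroids with an "x", colored by their own cluster index.
centers = kmeans.cluster_centers_
ax.scatter(centers[:, 0], centers[:, 1], c=color, marker="x", s=100)

ax.set_xlabel("size_1")
ax.set_ylabel("size_2")
```

As before, call plt.show() to display the figure.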