K-means plot with the original data in R

Question

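I am currently exploring the kmeans function. I have a simple text file (test.txt) with the following entries. The data can be divided into 2 clusters.

How can I plot the result of the kmeans function (using the plot function) together with the original data? I would also like to see how the clusters and their centroids are distributed.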

Answer 1

Here is an example taken from example(kmeans):

# This is just to generate example data
test <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(test) <- c("V1", "V2")

# Store the kmeans result in a variable called cl (the outer parentheses also print it)
(cl <- kmeans(test, 2))

# Plot the data coloured by cluster, then add the centroids as stars
plot(test, col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)
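
To see why cl$cluster works as a colour vector and cl$centers can be passed straight to points, it can help to inspect the fitted object. The following is only a minimal sketch: the call to set.seed and the seed value are assumptions added for reproducibility and are not part of example(kmeans).

```r
# Assumption: a fixed seed, only so that the sketch is reproducible
set.seed(42)

# Same kind of example data as above: two groups of 50 points in 2 dimensions
test <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(test) <- c("V1", "V2")

cl <- kmeans(test, 2)

length(cl$cluster)  # one cluster label per row of test
dim(cl$centers)     # one row per cluster, one column per variable
table(cl$cluster)   # how many points ended up in each cluster
```

Because cl$centers has the same columns as the data (V1, V2), its rows can be drawn in the same coordinate system as the points.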

Edit

The OP had some follow-up questions:

(cl <- kmeans(test, 2))
plot(test, col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)

The code above (case 1) results in:

(cl <- kmeans(test[,1], 2))
plot(test[,1], col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)

The code above (case 2) results in:

(cl <- kmeans(test[,1], 2))
plot(cbind(0,test[,1]),  col = cl$cluster)
points(cbind(0,cl$centers), col = 1:2, pch = 8, cex = 2)

The code above (case 3) results in:

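Explanation

In case 1, the data has two dimensions (V1, V2), so each centroid has two coordinates, just like the points in the plot. In case 2, the data is one-dimensional (V1), like your data. R assigns each point an index and uses that index as the x value. The centroids also have only one coordinate each, so they are plotted at index positions 1 and 2, which is why you keep seeing them at the left edge of the plot. Case 3 shows what one-dimensional data actually looks like when it is plotted along a single dimension.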
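
The index behaviour in case 2 can be checked without drawing anything, because plot and points both obtain their coordinates through xy.coords. This is a small sketch under assumptions: the vector x and the seed are made up for illustration and stand in for the one-dimensional column test[,1].

```r
# Assumption: simulated one-dimensional data standing in for test[,1]
set.seed(1)
x <- c(rnorm(50, sd = 0.3), rnorm(50, mean = 1, sd = 0.3))

cl <- kmeans(x, 2)

# Coordinates that plot(x) would use: the values go to y, the index goes to x
pts <- xy.coords(x)

# Coordinates that points(cl$centers) would use: a one-column matrix
# is treated the same way, so the two centroids get x = 1 and x = 2
ctr <- xy.coords(cl$centers)

range(pts$x)  # 1 and 100: the data spans the whole index range
ctr$x         # 1 and 2: the centroids sit at the far left
ctr$y         # the actual centroid values, on the same axis as the data
```

The centroid values themselves are correct. Only their horizontal position is an artefact of the index.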

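Conclusion

Your data is one-dimensional. If you plot it in two dimensions, you get something like case 2, where the x values are supplied by R and are nothing more than index values. Plotting it that way does not make much sense.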
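
If you still prefer the index plot, one possible workaround (not part of the original answer, so treat it as a sketch) is to draw each centroid as a horizontal line instead of a point, since only its y value carries information. The vector x and the seed below are assumptions used for illustration.

```r
# Assumption: simulated one-dimensional data standing in for your column V1
set.seed(1)
x <- c(rnorm(50, sd = 0.3), rnorm(50, mean = 1, sd = 0.3))

cl <- kmeans(x, 2)

# Index on the x axis, data values on the y axis, coloured by cluster
plot(x, col = cl$cluster, xlab = "Index", ylab = "V1")

# One dashed horizontal line per centroid, in the colour of its cluster
abline(h = cl$centers[, 1], col = 1:2, lty = 2)
```

Here the horizontal position of a point still has no meaning, but the lines make it clear which level each cluster is centred on.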
