Coloring clusters

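I am using the following code to run the SOM (self-organizing map, also known as a Kohonen network) machine learning algorithm to visualize some data. I then apply a clustering algorithm to the resulting map (I chose 8 clusters):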

#load library
library(tidyverse)
library(kohonen)
library(GGally)
library(purrr)
library(tidyr)
library(dplyr)
library(mlr)

#load data
data(flea)
fleaTib <- as_tibble(flea)

#define SOM grid
somGrid <- somgrid(xdim = 5, ydim = 5, topo = "hexagonal",
                   neighbourhood.fct = "bubble", toroidal = FALSE)

#format data
fleaScaled <- fleaTib %>%
  select(-species) %>%
  scale()

#perform som
fleaSom <- som(fleaScaled, grid = somGrid, rlen = 5000,
               alpha = c(0.05, 0.01))

par(mfrow = c(2, 3))
plotTypes <- c("codes", "changes", "counts", "quality",
               "dist.neighbours", "mapping")
walk(plotTypes, ~plot(fleaSom, type = ., shape = "straight"))

getCodes(fleaSom) %>%
  as_tibble() %>%
  iwalk(~plot(fleaSom, type = "property", property = .,
              main = .y, shape = "straight"))

# listing flea species on SOM

par(mfrow = c(1, 2))
nodeCols <- c("cyan3", "yellow", "purple", "red", "blue", "green", "white", "pink")
plot(fleaSom, type = "mapping", pch = 21,
     bg = nodeCols[as.numeric(fleaTib$species)],
     shape = "straight", bgcol = "lightgrey")

# CLUSTER AND ADD TO SOM MAP ---- (8 clusters)
clusters <- cutree(hclust(dist(fleaSom$codes[[1]], 
                               method = "manhattan")), 8)

somClusters <- map_dbl(clusters, ~{
    if(. == 1) 3
    else if(. == 2) 2
    else 1
}
)


plot(fleaSom, type = "mapping", pch = 21, 
     bg = nodeCols[as.numeric(fleaTib$species)],
     shape = "straight",
     bgcol = nodeCols[as.integer(somClusters)])

add.cluster.boundaries(fleaSom, somClusters)

However, the plot above shows only 3 colors instead of 8.

Can someone tell me what I am doing wrong?

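In the background color definition (bgcol) of the last plot, replace somClusters with clusters. The main problem is that you defined somClusters to take only three distinct values (1, 2 and 3) rather than 8: every cluster label other than 1 or 2 is recoded to 1. If you use it to index the color vector, it can only ever pick three colors.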

plot(fleaSom, type = "mapping", pch = 21, 
     bg = nodeCols[as.numeric(fleaTib$species)],
     shape = "straight",
     bgcol = nodeCols[as.integer(clusters)])

# draw the boundaries for the same 8 clusters that are used for the colors
add.cluster.boundaries(fleaSom, clusters)
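
To see why the recoding collapses the colors, here is a minimal sketch in base R that does not need the SOM at all. The `codes` matrix is hypothetical random data standing in for the 25 codebook vectors, and `ifelse()` is used as an equivalent of the `map_dbl()` recoding from the question; only the counting at the end matters.

```r
# Hypothetical stand-in for the 25 SOM codebook vectors (not the flea data)
set.seed(1)
codes <- matrix(rnorm(25 * 4), nrow = 25)

nodeCols <- c("cyan3", "yellow", "purple", "red", "blue", "green", "white", "pink")

# Eight cluster labels (1..8), one per node
clusters <- cutree(hclust(dist(codes, method = "manhattan")), 8)

# Same recoding as in the question: 1 -> 3, 2 -> 2, everything else -> 1
somClusters <- ifelse(clusters == 1, 3, ifelse(clusters == 2, 2, 1))

# Count how many distinct colors each index vector selects
length(unique(nodeCols[clusters]))     # 8
length(unique(nodeCols[somClusters]))  # 3
```

Indexing `nodeCols` with the raw labels from `cutree()` reaches all eight entries, while the recoded vector can only reach the first three.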