geom_tile based on results of clustering

Question

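I am making a heatmap with geom_tile(). I would like to order the y-axis (sp) according to a clustering of the sp values (the real data has about 200 sp records).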

 library(ggplot2)
 library(dplyr)  # provides the %>% pipe

 sp <- c("sp1","sp1","sp1","sp2","sp2","sp2","sp3","sp3","sp3","sp4","sp4","sp4","sp5","sp5","sp5")
 category <- c("a","b","c","a","b","c","a","b","c","a","b","c","a","b","c")
 count <- c(1,2,1,1,4,2,3,1,3,1,4,5,2,5,1)
 # data.frame() keeps count numeric; cbind() would coerce every column to character
 d <- data.frame(sp, category, count)

 # Named p rather than t so that base::t() is not masked
 p <- d %>%
   ggplot(aes(category, sp)) +
   geom_tile(aes(fill = count)) +
   scale_fill_gradient(low = "white", high = "red")

 print(p)

Answer 1

Here is an example using the classic hclust approach:

library(ggplot2)

sp <- c("sp1","sp1","sp1","sp2","sp2","sp2","sp3","sp3","sp3","sp4","sp4","sp4","sp5","sp5","sp5")
category <- c("a","b","c","a","b","c","a","b","c","a","b","c","a","b","c")
count <- c(1,2,1,1,4,2,3,1,3,1,4,5,2,5,1)
d <- data.frame(sp, category, count)

# Reshape data as matrix
m <- tidyr::pivot_wider(d, names_from = "sp", values_from = "count")
m <- as.matrix(m[, -1]) # -1 to omit categories from matrix

# Cluster based on euclidean distance
clust <- hclust(dist(t(m)))

# Set explicit y-axis limits
ggplot(d, aes(category, sp))+
  geom_tile(aes(fill = as.numeric(count)))+
  scale_fill_gradient(low = "white", high = "red") +
  scale_y_discrete(limits = colnames(m)[clust$order])

Created on 2021-06-24 by the reprex package (v1.0.0)
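
As a possible variation on the answer above, the cluster order can be stored in the factor levels of sp instead of being passed to scale_y_discrete(). This is only a sketch under a few assumptions that are not in the original post: it builds the matrix with base R tapply() rather than tidyr, it treats any sp/category combination missing from the data as a count of 0, and the names sp_order and p are purely illustrative.

```r
library(ggplot2)

# Same toy data as in the question
d <- data.frame(
  sp       = rep(paste0("sp", 1:5), each = 3),
  category = rep(c("a", "b", "c"), times = 5),
  count    = c(1, 2, 1, 1, 4, 2, 3, 1, 3, 1, 4, 5, 2, 5, 1)
)

# One row per sp, one column per category
m <- with(d, tapply(count, list(sp, category), sum))

# Assumption: a combination absent from the data means a count of 0
m[is.na(m)] <- 0

# Cluster the rows (sp) on Euclidean distance, as in the answer above
clust <- hclust(dist(m))

# Store the dendrogram order in the factor levels of sp
sp_order <- rownames(m)[clust$order]
d$sp <- factor(d$sp, levels = sp_order)

# ggplot2 lays out a discrete axis in factor-level order
p <- ggplot(d, aes(category, sp)) +
  geom_tile(aes(fill = count)) +
  scale_fill_gradient(low = "white", high = "red")

print(p)
```

Both approaches should give the same axis order. Keeping the order in the factor may be convenient when the same data frame feeds several plots, while scale_y_discrete(limits = ...) leaves the data untouched.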
