R hierarchical clustering: visualizing classifications without clustering on them

Question

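I have been working with several datasets using hierarchical clustering. In these datasets, there is a set of variables I want to cluster the data on, plus some other categorical variables that I do not want to cluster on but still want to visualize.

What I'd like to do is find a way to "add a tier" to the heatmap produced by the clustering algorithm, where I can see the binary classifications (red for 1, blue for 0) without actually clustering on that data. That way I can evaluate how well my classification responses are grouped together by the clustering.

Here is a simplified example: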

library("gplots")
set.seed(1)

## creating random data to input into the hierarchical clustering algorithm
data <- matrix(rexp(100, rate = 0.1), ncol = 10) 
colnames(data) <- c("var1", "var2", "var3", "var4", "var5", "var6", 
    "var7", "var8", "var9", "var10")

# these are the two classification labels for each data point (one value per row)
classification1 <- c(1, 1, 0, 1, 1, 0, 0, 0, 1, 1)

# I want to visualize how well the groups found by the clustering
# algorithm line up with these classifications, without
# clustering on the classifications themselves
classification2 <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 0)

par(mar = c(1, 4.5, 0.1, 0.1))
matrix = rbind(c(1, 2), c(3, 4), c(5, 6))
wid = c(1, 1)
hei = c(0.5, 10)
hclustfunc <- function(x) hclust(x, method = "complete")
distfunc <- function(x) dist(x, method = "euclidean")
my_palette <- colorRampPalette(c("yellow", "orange", "darkorange", 
    "red", "darkred"))(n = 1000)
# hclustfun and margins are spelled out in full rather than relying on
# partial argument matching ("hclust", "margin")
heatmap.2(as.matrix(data), dendrogram = "row", trace = "none", 
    margins = c(8, 9), hclustfun = hclustfunc, distfun = distfunc, 
    col = my_palette, key = FALSE, key.xlab = "", key.title = "Clustering Algorithm", 
    key.ylab = "", keysize = 1.25, density.info = "density", 
    lhei = hei)

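This produces a heatmap that gives me a lot of information. What I'd now like to do is append two more columns to the right of the heatmap that the clustering algorithm does not use for clustering.

These two columns would be the binary labels for "classification 1" and "classification 2" (a red cell for 1, a blue cell for 0). I just want to visualize how these classification responses group within the dendrogram.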

Answer 1

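If you only have one classification to add, you can simply use heatmap.2 with the RowSideColors option. If you have several classifications to add, however, use heatmap.plus. Its options differ slightly from those of heatmap and heatmap.2, but the important point for your problem is that its RowSideColors option takes a matrix.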

library(heatmap.plus)

# index 1 (label 0) -> blue, index 2 (label 1) -> red, as described in the question
class1_cols <- c('blue', 'red')[classification1 + 1]
class2_cols <- c('blue', 'red')[classification2 + 1]

anno <- data.frame(class1 = class1_cols, class2 = class2_cols)

heatmap.plus(as.matrix(data), col  = my_palette, 
     RowSideColors = as.matrix(anno))
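
For the single-classification case mentioned above, here is a minimal sketch with heatmap.2 and RowSideColors. It assumes the simulated data and classification1 from the question (re-created here so the block runs on its own); the ifelse color mapping and the inline hclustfun/distfun are illustrative choices, not part of the original answer.

```r
library(gplots)
set.seed(1)

# Same simulated data and first label vector as in the question
data <- matrix(rexp(100, rate = 0.1), ncol = 10)
colnames(data) <- paste0("var", 1:10)
classification1 <- c(1, 1, 0, 1, 1, 0, 0, 0, 1, 1)

# Map labels to colors: 1 -> red, 0 -> blue (one color per row of `data`)
row_cols <- ifelse(classification1 == 1, "red", "blue")

# The labels only color the side strip; they are never passed to
# dist() or hclust(), so they have no influence on the dendrogram
heatmap.2(data, dendrogram = "row", trace = "none",
    hclustfun = function(x) hclust(x, method = "complete"),
    distfun = function(x) dist(x, method = "euclidean"),
    RowSideColors = row_cols, key = FALSE)
```

Because heatmap.2 reorders the color strip together with the rows, runs of the same color next to a branch of the dendrogram suggest that the cluster lines up with that classification.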
