Heatmap of gene intensity values in R

Question

I have data like this:

Gene	HBEC-KT-01	HBEC-KT-02	HBEC-KT-03	HBEC-KT-04	HBEC-KT-05	Primarycells-02	Primarycells-03	Primarycells-04	Primarycells-05
BPIFB1	15726000000	15294000000	15294000000	14741000000	22427000000	87308000000	2.00E+11	1.04E+11	1.51E+11
LCN2	18040000000	26444000000	28869000000	30337000000	10966000000	62388000000	54007000000	56797000000	38414000000
C3	2.52E+11	2.26E+11	1.80E+11	1.80E+11	1.78E+11	46480000000	1.16E+11	69398000000	78766000000
MUC5AC	15647000	8353200	12617000	12221000	29908000	40893000000	79830000000	28130000000	69147000000
MUC5B	965190000	693910000	779970000	716110000	1479700000	38979000000	90175000000	41764000000	50535000000
ANXA2	14705000000	18721000000	21592000000	18904000000	22657000000	28163000000	24282000000	21708000000	16528000000

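I want to make a heatmap like the one shown below using R. I am following a paper, which states: "Heatmaps were generated using the 'pheatmap' package [76], where correlation clustering distance row was applied." This is their heatmap.

I want something like this. I am trying to make one in R by following tutorials, but I am new to R and know almost nothing about it.

Here is my code.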

library(pheatmap)
library(RColorBrewer)  # provides brewer.pal()

# Read the tab-delimited table, using the Gene column as row names
df <- read.delim("R.txt", header = TRUE, row.names = "Gene")
df_matrix <- data.matrix(df)

pheatmap(
  df_matrix,
  main = "Heatmap of Extracellular Genes",
  color = colorRampPalette(rev(brewer.pal(n = 10, name = "RdYlBu")))(10),
  cluster_cols = FALSE,
  show_rownames = FALSE,
  fontsize_col = 10,
  cellwidth = 40
)

This is what I got.

When I try to use clustering, I get an error.

pheatmap(
mat = df_matrix,
  scale = "row",
  cluster_column = F,
  show_rownames = TRUE,
  drop_levels = TRUE,
  fontsize = 5,
  clustering_method = "complete",
  main = "Hierachical Cluster Analysis"
)

Error in hclust(d, method = method) : 
NA/NaN/Inf in foreign function call (arg 10)

Can someone help me with the code?

Answer 1

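You can use scale() to normalize the data and get more even coloring. Here, the mean expression of each sample is set to 0, so genes expressed below the mean have negative z-scores: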

library(tidyverse)
library(pheatmap)

data <- tribble(
  ~Gene, ~`HBEC-KT-01`, ~`HBEC-KT-02`, ~`HBEC-KT-03`, ~`HBEC-KT-04`, ~`HBEC-KT-05`, ~`Primarycells-02`, ~`Primarycells-03`, ~`Primarycells-04`, ~`Primarycells-05`,
  "BPIFB1", 1.5726e+10, 1.5294e+10, 1.5294e+10, 1.4741e+10, 2.2427e+10, 8.7308e+10, 2e+11, 1.04e+11, 1.51e+11,
  "LCN2", 1.804e+10, 2.6444e+10, 2.8869e+10, 3.0337e+10, 1.0966e+10, 6.2388e+10, 5.4007e+10, 5.6797e+10, 3.8414e+10,
  "C3", 2.52e+11, 2.26e+11, 1.8e+11, 1.8e+11, 1.78e+11, 4.648e+10, 1.16e+11, 6.9398e+10, 7.8766e+10,
  "MUC5AC", 15647000, 8353200, 12617000, 12221000, 29908000, 4.0893e+10, 7.983e+10, 2.813e+10, 6.9147e+10,
  "MUC5B", 965190000, 693910000, 779970000, 716110000, 1479700000, 3.8979e+10, 9.0175e+10, 4.1764e+10, 5.0535e+10,
  "ANXA2", 1.4705e+10, 1.8721e+10, 2.1592e+10, 1.8904e+10, 2.2657e+10, 2.8163e+10, 2.4282e+10, 2.1708e+10, 1.6528e+10
)
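data %>%
  # z-score each sample (column): mean 0, standard deviation 1
  mutate(across(where(is.numeric), ~ as.numeric(scale(.x)))) %>%
  column_to_rownames("Gene") %>%
  pheatmap(
    scale = "row",          # pheatmap then z-scores each gene (row)
    cluster_cols = FALSE,   # keep the samples in their original order
    show_rownames = FALSE,
    show_colnames = TRUE,
    treeheight_col = 0,
    drop_levels = TRUE,
    fontsize = 5,
    clustering_method = "complete",
    main = "Hierarchical Cluster Analysis (z-score)"
  )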

Created on 2021-09-26 by the reprex package (v2.0.1)
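The hclust error in the question typically appears when scale = "row" produces NaN values, which happens for any gene whose intensity is identical in every sample (standard deviation 0) or that contains missing values. The six-row excerpt shows no such gene, so this is only an assumption about the full data set. The sketch below uses a hypothetical helper, keep_scalable_rows() (not part of pheatmap), to drop such rows on a made-up toy matrix, log-transforms the intensities, and then clusters rows by correlation distance through pheatmap's clustering_distance_rows argument, which seems to be what the paper describes.

```r
library(pheatmap)

# Hypothetical helper (not part of pheatmap): keep only rows that can be
# z-scored, i.e. all values finite and a non-zero standard deviation.
keep_scalable_rows <- function(mat) {
  finite_rows  <- apply(mat, 1, function(x) all(is.finite(x)))
  varying_rows <- apply(mat, 1, function(x) isTRUE(sd(x) > 0))
  mat[finite_rows & varying_rows, , drop = FALSE]
}

# Made-up toy matrix: g2 is constant and g3 has a missing value,
# so both would turn into NaN rows under scale = "row".
toy <- rbind(
  g1 = c(1, 2, 3, 4),
  g2 = c(5, 5, 5, 5),
  g3 = c(1, NA, 3, 4),
  g4 = c(4, 3, 1, 2),
  g5 = c(10, 20, 15, 40)
)
colnames(toy) <- paste0("S", 1:4)

clean <- keep_scalable_rows(toy)
rownames(clean)  # g2 and g3 have been dropped

pheatmap(
  log2(clean + 1),                          # compress the wide intensity range
  scale = "row",                            # z-score each gene
  clustering_distance_rows = "correlation", # row distance = 1 - Pearson correlation
  cluster_cols = FALSE                      # keep the sample order
)
```

With the real data, df_matrix from the question would take the place of toy. Whether to drop or impute the offending genes is a judgment call that depends on the experiment.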
