Heatmap of gene intensity values in R

I have data like this:

Gene HBEC-KT-01 HBEC-KT-02 HBEC-KT-03 HBEC-KT-04 HBEC-KT-05 Primarycells-02 Primarycells-03 Primarycells-04 Primarycells-05
BPIFB1 15726000000 15294000000 15294000000 14741000000 22427000000 87308000000 2.00E+11 1.04E+11 1.51E+11
LCN2 18040000000 26444000000 28869000000 30337000000 10966000000 62388000000 54007000000 56797000000 38414000000
C3 2.52E+11 2.26E+11 1.80E+11 1.80E+11 1.78E+11 46480000000 1.16E+11 69398000000 78766000000
MUC5AC 15647000 8353200 12617000 12221000 29908000 40893000000 79830000000 28130000000 69147000000
MUC5B 965190000 693910000 779970000 716110000 1479700000 38979000000 90175000000 41764000000 50535000000
ANXA2 14705000000 18721000000 21592000000 18904000000 22657000000 28163000000 24282000000 21708000000 16528000000

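I want to make a heatmap like the one shown below using R. I am following a paper, which states: "Heatmaps were generated using the 'pheatmap' package [76], where correlation clustering distance row was applied." This is their heatmap.

I want something like this. I am trying to make one in R by following tutorials, but I am new to R and know almost nothing about it.

Here is my code.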

library(pheatmap)
library(RColorBrewer)

# Read the tab-delimited table and use the Gene column as row names
df <- read.delim("R.txt", header = TRUE, row.names = "Gene")
df_matrix <- data.matrix(df)
pheatmap(df_matrix,
     main = "Heatmap of Extracellular Genes",
     color = colorRampPalette(rev(brewer.pal(n = 10, name = "RdYlBu")))(10),
     cluster_cols = FALSE,
     show_rownames = FALSE,
     fontsize_col = 10,
     cellwidth = 40
     )

This is what I get.

When I try to use clustering, I get an error.

pheatmap(
  mat = df_matrix,
  scale = "row",
  cluster_cols = FALSE,
  show_rownames = TRUE,
  drop_levels = TRUE,
  fontsize = 5,
  clustering_method = "complete",
  main = "Hierarchical Cluster Analysis"
)

Error in hclust(d, method = method) : 
NA/NaN/Inf in foreign function call (arg 10)

Can someone help me with the code?

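You can use scale to normalize the data, which gives more even coloring. Here, each sample (column) is first centered so that its mean expression is 0, and pheatmap then also scales each gene (row). Genes expressed below the mean get negative z-scores: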

library(tidyverse)
library(pheatmap)

data <- tribble(
  ~Gene, ~`HBEC-KT-01`, ~`HBEC-KT-02`, ~`HBEC-KT-03`, ~`HBEC-KT-04`, ~`HBEC-KT-05`, ~`Primarycells-03`, ~`Primarycells-04`, ~`Primarycells-05`,
  "BPIFB1", 1.5726e+10, 1.5294e+10, 1.5294e+10, 1.4741e+10, 2.2427e+10, 2e+11, 1.04e+11, 1.51e+11,
  "LCN2", 1.804e+10, 2.6444e+10, 2.8869e+10, 3.0337e+10, 1.0966e+10, 5.4007e+10, 5.6797e+10, 3.8414e+10,
  "C3", 2.52e+11, 2.26e+11, 1.8e+11, 1.8e+11, 1.78e+11, 1.16e+11, 6.9398e+10, 7.8766e+10,
  "MUC5AC", 15647000, 8353200, 12617000, 12221000, 29908000, 7.983e+10, 2.813e+10, 6.9147e+10,
  "MUC5B", 965190000, 693910000, 779970000, 716110000, 1479700000, 9.0175e+10, 4.1764e+10, 5.0535e+10,
  "ANXA2", 1.4705e+10, 1.8721e+10, 2.1592e+10, 1.8904e+10, 2.2657e+10, 2.4282e+10, 2.1708e+10, 1.6528e+10
)
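data %>%
  # z-score each sample (column); as.numeric() drops the matrix wrapper that scale() returns
  mutate(across(where(is.numeric), ~ as.numeric(scale(.x)))) %>%
  column_to_rownames("Gene") %>%
  pheatmap(
    scale = "row",          # additionally z-score each gene (row)
    cluster_cols = FALSE,   # the pheatmap argument is cluster_cols, not cluster_column
    show_rownames = FALSE,
    show_colnames = TRUE,
    treeheight_col = 0,
    drop_levels = TRUE,
    fontsize = 5,
    clustering_method = "complete",
    main = "Hierarchical Cluster Analysis (z-score)"
  )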
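
The paper quoted in the question says correlation distance was used for the rows. pheatmap supports this through its clustering_distance_rows argument. Below is a minimal sketch of that idea, not a reconstruction of the paper's exact method. It uses a subset of the values from the question (6 genes, 4 samples). The log10 transform with a pseudocount of 1 is my own assumption for taming the wide range of intensities.

```r
library(pheatmap)

# Subset of the intensities from the question: 6 genes x 4 samples.
mat <- matrix(
  c(1.5726e10, 1.5294e10, 2.00e11,   1.04e11,    # BPIFB1
    1.8040e10, 2.6444e10, 5.4007e10, 5.6797e10,  # LCN2
    2.52e11,   2.26e11,   1.16e11,   6.9398e10,  # C3
    1.5647e7,  8.3532e6,  7.983e10,  2.813e10,   # MUC5AC
    9.6519e8,  6.9391e8,  9.0175e10, 4.1764e10,  # MUC5B
    1.4705e10, 1.8721e10, 2.4282e10, 2.1708e10), # ANXA2
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("BPIFB1", "LCN2", "C3", "MUC5AC", "MUC5B", "ANXA2"),
    c("HBEC-KT-01", "HBEC-KT-02", "Primarycells-03", "Primarycells-04")
  )
)

# Assumption: log10 with a pseudocount of 1 to compress the dynamic range.
log_mat <- log10(mat + 1)

res <- pheatmap(
  log_mat,
  scale = "row",                             # per-gene z-scores
  clustering_distance_rows = "correlation",  # Pearson correlation distance between genes
  cluster_cols = FALSE,                      # keep the sample order as given
  main = "Row z-scores of log10 intensity"
)
```

With the full table you would pass your own matrix in place of mat. Whether to log-transform first depends on how the paper processed its intensities, which the quoted sentence does not say.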

Created on 2021-09-26 by the reprex package (v2.0.1)
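
As for the error itself: hclust reports "NA/NaN/Inf in foreign function call" when the distance matrix it receives contains non-finite values. One common way to end up there with scale = "row" is a gene whose values are identical across all samples, because dividing by a standard deviation of 0 gives NaN. Missing values in the input are also worth ruling out. I cannot tell from the excerpt whether either applies to your file, so here is a small base-R sketch for checking. The matrix m and its gene names are made up for illustration.

```r
# Hypothetical matrix: one constant row and one row containing NA.
m <- rbind(
  geneA = c(1, 5, 9, 2),
  geneB = c(3, 3, 3, 3),   # zero variance: row scaling gives NaN
  geneC = c(4, NA, 6, 8),  # missing value
  geneD = c(7, 2, 8, 1)
)

row_sd <- apply(m, 1, sd, na.rm = TRUE)  # per-gene standard deviation
has_na <- rowSums(!is.finite(m)) > 0     # any NA/NaN/Inf in the row
bad    <- has_na | row_sd == 0           # rows worth inspecting

rownames(m)[bad]                         # "geneB" "geneC"

# Keep only the rows that are safe to scale and cluster.
m_clean <- m[!bad, , drop = FALSE]
```

Running the same check on df_matrix should show which genes, if any, are responsible. Dropping them, as done here, is only one option.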