Overlapping two gene sets, finding the significance of their overlap, and plotting them

(Fig. 3a, b, Extended Data Fig. 3a, b and Supplementary Table 1). After 48 h, more than one-third of the transcriptome was differentially expressed (>5,000 genes; 405 genes encoding for proteins in the extracellular region, Gene Ontology (GO) accession 0005576), significantly overlapping with the gene expression changes of A375 tumours in vivo after 5 days of vemurafenib treatment (Fig. 3a, b and Extended Data Fig. 3c). Similar extensive gene expression changes were observed in Colo800 and UACC62 melanoma cells treated with vemurafenib and H3122 lung adenocarcinoma cells treated with crizotinib (Extended Data Fig. 3d). Despite different cell lineages, different oncogenic drivers, and different targeted therapies we observed a significant overlap between the secretome of melanoma and lung adenocarcinoma cells (P < 9.11 × 10−5)

paper

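I would like to produce something like Fig. f, which shows the intersection and the significance of the overlap. My code works up to the intersection part, but I don't know how to do the significance part.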

library(reshape2)
library(venneuler)
RNA_seq_cds <- read.csv("~/Downloads/RNA_seq_gene_set.txt", header=TRUE, sep="\t")
head(RNA_seq_cds)
ATAC_seq <- read.csv("~/Downloads/ATAC_seq_gene_set.txt", header=TRUE, sep="\t")
head(ATAC_seq)
RNA_seq <- RNA_seq_cds

# Column-bind data frames with different numbers of rows, padding the shorter ones with NA
cbindPad <- function(...) {
  args <- list(...)
  n <- sapply(args, nrow)
  mx <- max(n)
  pad <- function(x, mx) {
    if (nrow(x) < mx) {
      nms <- colnames(x)
      padTemp <- matrix(NA, mx - nrow(x), ncol(x))
      colnames(padTemp) <- nms
      if (ncol(x) == 0) {
        return(padTemp)
      } else {
        return(rbind(x, padTemp))
      }
    } else {
      return(x)
    }
  }
  rs <- lapply(args, pad, mx)
  return(do.call(cbind, rs))
}

dat <- cbindPad(ATAC_seq, RNA_seq)

vennfun <- function(x) { 
  x$id <- seq_len(nrow(x))  # add an id column (required by melt)
  xm <- melt(x, id.vars="id", na.rm=TRUE)  # long format with columns variable and value; NAs are dropped here
  xc <- dcast(xm, value~variable, fun.aggregate=length)  # count how often each value occurs per variable (1 or 0 if values are unique within a set)
  rownames(xc) <- xc$value  # use the value column as row names (required by venneuler)
  xc$value <- NULL  # remove the redundant value column
  xc  # return the new data frame
}

# Build the presence/absence table and fit the Euler diagram
VennDat <- vennfun(dat)
genes.venn <- venneuler(VennDat)
genes.venn$labels <- c("RNA", "\nATAC" )
# plot(genes.venn, cex =15, )
# See https://rstudio-pubs-static.s3.amazonaws.com/13301_6641d73cfac741a59c0a851feb99e98b.html
vd <- venneuler(VennDat)
vd$labels <- paste(genes.venn$labels, colSums(VennDat))
plot(vd, cex=10)
text(.3, .45, 
     bquote(bold("Common ="~.(as.character(sum(rowSums(VennDat) == 2))))), 
     col="red", cex=1)

LABS <- vd$labels

The code above produces the intersection plot.
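The counts shown in the diagram can be cross-checked with base R set operations, without padding or melting. A minimal sketch, assuming each gene set is available as a character vector; the gene names below are made up for illustration and are not taken from the data files:

```r
# Hypothetical gene identifiers; replace with the gene column of each file.
rna_genes  <- c("TP53", "MYC", "EGFR", "BRAF")
atac_genes <- c("MYC", "BRAF", "KRAS")

# Drop missing values and duplicates so every gene is counted once per set.
rna_genes  <- unique(na.omit(rna_genes))
atac_genes <- unique(na.omit(atac_genes))

common    <- intersect(rna_genes, atac_genes)  # genes present in both sets
only_rna  <- setdiff(rna_genes, atac_genes)    # genes found only in the RNA set
only_atac <- setdiff(atac_genes, rna_genes)    # genes found only in the ATAC set

counts <- c(RNA_only = length(only_rna),
            ATAC_only = length(only_atac),
            Common = length(common))
print(counts)
```

The Common entry should agree with sum(rowSums(VennDat) == 2) as long as gene identifiers are unique within each set.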

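Now for the significance part: how do I compute it between the two gene sets and display it as in the original figure?

The data I used to generate the plot above.

Any suggestions or help would be appreciated.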

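If you are asking how to place text below the figure, just use 'text' as you did before. It only takes some guessing of the x= and y= coordinates. xpd=TRUE allows you to draw in the margins.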

VennDat <- vennfun(dat)
vd <- venneuler(VennDat)
vd$labels <- paste(c("RNA", "ATAC"), colSums(VennDat))

plot(vd, cex=10, border=c(NA, 'red'), col=c('#6b65af', '#ad7261'))
text(x=.5, y=.5, sum(rowSums(VennDat) == 2), xpd=TRUE)
text(.5, .15, 'overlap\n', xpd=TRUE)
text(.5, .13, bquote(italic(p)*'< 9.11E-55'), xpd=TRUE)
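If the p-value is computed rather than copied from the paper, the label can be built from the number instead of a hard-coded string. A minimal sketch, where p_val is a placeholder value and splitting the scientific notation into mantissa and exponent is just one possible way to do it:

```r
# p_val is a placeholder; substitute the p-value you actually computed.
p_val <- 9.11e-55

# Split the scientific notation into mantissa and exponent.
parts <- strsplit(format(p_val, scientific = TRUE, digits = 3), "e")[[1]]
mantissa <- as.numeric(parts[1])
exponent <- as.integer(parts[2])

# Plotmath expression: italic p, "<", mantissa times 10 to the exponent.
p_label <- bquote(italic(p) < .(mantissa) %*% 10^.(exponent))

# After plot(vd, ...), draw it below the diagram:
# text(.5, .13, p_label, xpd = TRUE)
```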

I also adjusted some of the plot parameters. You can inspect the code of the plot method with:

venneuler:::plot.VennDiagram

If you want to know how the significance is calculated, you should post your question on Cross Validated.
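For orientation, one common choice for this kind of overlap (not necessarily what the paper used) is a hypergeometric test, which is equivalent to a one-sided Fisher's exact test. A minimal sketch with made-up gene IDs; universe_size is an assumption you have to choose yourself, for example the number of genes that could have been detected in both experiments, and the resulting p-value depends strongly on it:

```r
# Hypothetical gene sets; replace with the real identifier vectors.
set_a <- c("G1", "G2", "G3", "G4", "G5", "G6")
set_b <- c("G4", "G5", "G6", "G7", "G8")

# ASSUMPTION: total number of genes that could appear in either set.
universe_size <- 20

k <- length(intersect(set_a, set_b))  # observed overlap
m <- length(unique(set_a))            # size of set A
q <- length(unique(set_b))            # size of set B
n <- universe_size - m                # genes in the universe not in set A

# Probability of an overlap of k or more when drawing q genes at random.
p_hyper <- phyper(k - 1, m, n, q, lower.tail = FALSE)

# Same test as a 2x2 contingency table with Fisher's exact test.
tab <- matrix(c(k, m - k, q - k, universe_size - m - q + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(c(hypergeometric = p_hyper, fisher = p_fisher))
```

Both calls should return the same p-value, which can then be placed on the plot with text() as shown above. Whether this test and this choice of universe are appropriate for the data is a statistical question.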