R: geom_logo for multiple genes from tibble

I'm not very familiar with the ggseqlogo package, so I would appreciate any kind of help.

I have prepared the following tibble:

test <- tibble( gene = c("A", "B", "A", "C", "B"),
                seq = c("AAAAAAAAAAAAAAAAAAAA",
                        "GGGGGGGGGGGGGGGG",
                        "AAAAAATAAAAATAAAAAAA",
                        "AGTCGTCATGCATCAATCCCAATGGTGCA",
                        "GGGGGGGCCGGGGGGG") ) 

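I would like to produce a sequence logo for each gene, grouped by gene name. All sequences belonging to the same gene have the same length.

So far I have tried this: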

ggplot() + 
 geom_logo(data = test$gene) +
 facet_grid(rows = ~ gene)

But this is the best I have managed to get so far:

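This answer may come late, but it still applies.

ggseqlogo has a faceting option, but it requires unique gene names and sequences of equal length. I worked around this by looping over the rows and storing each plot in a list.

That list of plots can then be arranged with plot_grid from the cowplot package.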

library(tidyverse)
library(cowplot)
library(ggseqlogo)


test <- tibble( gene = c("A", "B", "A", "C", "B"),
                seq = c("AAAAAAAAAAAAAAAAAAAA",
                        "GGGGGGGGGGGGGGGG",
                        "AAAAAATAAAAATAAAAAAA",
                        "AGTCGTCATGCATCAATCCCAATGGTGCA",
                        "GGGGGGGCCGGGGGGG") ) 

# Initialize list to store plots in
plot_list <- list()

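# Loop through all rows (one sequence per row) and
# store the resulting plots in plot_list
for (i in seq_len(nrow(test))) {
  plot_list[[i]] <- ggplot() + 
    # Pass the sequence as a plain character vector
    geom_logo(data = test$seq[i], seq_type = "dna") +
    # Use the gene name of this row as the plot title
    ggtitle(test$gene[i])
}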

# cowplot can arrange the list of plots in a grid (here: two columns)
plot_grid(plotlist =  plot_list, ncol = 2)
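
The loop above draws one logo per row, so genes A and B each appear twice. If the goal is a single logo per gene, one possible sketch is to group the sequences by gene with base R's split() and pass the resulting named list to ggseqlogo(), which draws one facet per list element. The name seqs_by_gene is only illustrative, and this assumes that all sequences within a gene share the same length, as stated in the question.

```r
library(tibble)
library(ggseqlogo)

test <- tibble(gene = c("A", "B", "A", "C", "B"),
               seq = c("AAAAAAAAAAAAAAAAAAAA",
                       "GGGGGGGGGGGGGGGG",
                       "AAAAAATAAAAATAAAAAAA",
                       "AGTCGTCATGCATCAATCCCAATGGTGCA",
                       "GGGGGGGCCGGGGGGG"))

# Group the sequences by gene: a named list with one
# character vector per gene (names are the gene names)
seqs_by_gene <- split(test$seq, test$gene)

# Sanity check (assumption from the question): within each gene,
# all sequences must have the same length
same_length <- vapply(seqs_by_gene,
                      function(s) length(unique(nchar(s))) == 1,
                      logical(1))
stopifnot(all(same_length))

# A named list is drawn as one facet per element;
# ncol controls the layout of the facets
p <- ggseqlogo(seqs_by_gene, seq_type = "dna", ncol = 2)
p
```

With this approach cowplot is not needed, since the faceting is handled by ggseqlogo itself. The loop version remains useful when each panel needs its own title or theme.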