Density plot for multiple groups in ggplot

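I have looked at example1, How to overlay density plots in R?, and Overlapped density plots in ggplot2 to learn how to make density plots. I can draw the density plot with the code from the second link, but how can I make such a plot in ggplot or plotly? I have gone through all the examples and still could not solve my problem. I have a toy data frame of gene expression (leukemia data description) whose columns refer to 2 groups of individuals.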

leukemia_big <- read.csv("http://web.stanford.edu/~hastie/CASI_files/DATA/leukemia_big.csv")

df <- data.frame(class= ifelse(grepl("^ALL", colnames(leukemia_big),
                 fixed = FALSE), "ALL", "AML"), row.names = colnames(leukemia_big))

plot(density(as.matrix(leukemia_big[,df$class=="ALL"])), 
     lwd=2, col="red")
lines(density(as.matrix(leukemia_big[,df$class=="AML"])), 
      lwd=2, col="darkgreen")

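ggplot needs data in tidy format, also known as a long-format data frame. The following example does this reshaping. Be careful, though: in the dataset provided, the values have almost the same distribution for both patient types, so when you plot the ALL and AML patients the curves overlap and you cannot see a difference.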

library(tidyverse)

leukemia_big %>%
  as_tibble() %>% # Optional: makes df a tibble, which makes debugging easier (as_data_frame() is deprecated)
  gather(key = patient, value = value, 1:72) %>% # Transforms the wide df into a tidy (long) df
  mutate(type = gsub('[.].*$', '', patient)) %>% # Creates a variable with the type of patient, e.g. "ALL.1" becomes "ALL"
  ggplot(aes(x = value, fill = type)) +
  geom_density(alpha = 0.5)
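To confirm the overlap numerically rather than by eye, one option is to summarise the values per patient type after reshaping. This is only a sketch and not part of the original answer: `toy` is a small simulated stand-in for `leukemia_big` (same column naming pattern, made-up values) so that the snippet runs without downloading anything, and the summary column names are arbitrary. It uses `pivot_longer()`, the newer tidyr counterpart of `gather()`.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for leukemia_big: 4 patient columns, 100 simulated "genes"
set.seed(1)
toy <- data.frame(ALL = rnorm(100), ALL.1 = rnorm(100),
                  AML = rnorm(100), AML.1 = rnorm(100))

type_summary <- toy %>%
  # Wide to long: one row per (patient, value) pair
  pivot_longer(cols = everything(), names_to = "patient", values_to = "value") %>%
  # Strip the ".1", ".2", ... suffix so that "ALL.1" becomes "ALL"
  mutate(type = gsub("[.].*$", "", patient)) %>%
  group_by(type) %>%
  summarise(n_values = n(),
            mean_value = mean(value),
            median_value = median(value),
            sd_value = sd(value))

print(type_summary)
```

If the per-type means, medians, and standard deviations come out close to each other when `toy` is replaced by `leukemia_big`, nearly coincident density curves are what you would expect.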

In the second example, I add 1 unit to the value variable for all AML patients, to demonstrate the overlap problem visually.

leukemia_big %>%
  as_tibble() %>% # Optional: makes df a tibble, which makes debugging easier (as_data_frame() is deprecated)
  gather(key = patient, value = value, 1:72) %>% # Transforms the wide df into a tidy (long) df
  mutate(type = gsub('[.].*$', '', patient)) %>% # Creates a variable with the type of patient, e.g. "ALL.1" becomes "ALL"
  mutate(value2 = if_else(condition = type == "ALL", true = value, false = value + 1)) %>% # Shifts the AML values by 1 so the two curves no longer coincide
  ggplot(aes(x = value2, fill = type)) +
  geom_density(alpha = 0.5)
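Since the question also asks about plotly, one common route is to store the ggplot object and pass it to `ggplotly()` from the plotly package. The sketch below assumes plotly is installed. It again uses a simulated `toy` data frame as a hypothetical stand-in for `leukemia_big`, with the AML columns drawn around a mean of 1 purely so that the two curves are distinguishable.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(plotly)

# Hypothetical stand-in for leukemia_big (simulated values, same column naming)
set.seed(1)
toy <- data.frame(ALL = rnorm(100), ALL.1 = rnorm(100),
                  AML = rnorm(100, mean = 1), AML.1 = rnorm(100, mean = 1))

p <- toy %>%
  # Wide to long: one row per (patient, value) pair
  pivot_longer(cols = everything(), names_to = "patient", values_to = "value") %>%
  # Derive the patient type from the column name
  mutate(type = gsub("[.].*$", "", patient)) %>%
  ggplot(aes(x = value, fill = type)) +
  geom_density(alpha = 0.5)

# Convert the static ggplot into an interactive plotly widget
interactive_plot <- ggplotly(p)
interactive_plot
```

Keeping the ggplot in `p` means the same object can be printed as a static figure or converted for interactive use.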