How to show whiskers and points on violin plots?
I have a data frame df with the following data. I want to plot the logCPM expression of a gene between two groups, A and B.
Samples Type GeneA
Sample1 B 14.82995162
Sample2 B 12.90512275
Sample3 B 9.196524783
Sample4 A 19.42866012
Sample5 A 19.70386922
Sample6 A 16.22906914
Sample7 A 12.48966785
Sample8 B 15.53280377
Sample9 A 9.345795955
Sample10 B 9.196524783
Sample11 B 9.196524783
Sample12 B 9.196524783
Sample13 A 9.434355615
Sample14 A 15.27604692
Sample15 A 18.90867329
Sample16 B 11.71503095
Sample17 B 13.7632545
Sample18 A 9.793864295
Sample19 B 9.196524783
Sample20 A 14.52562066
Sample21 A 13.85116605
Sample22 A 9.958492229
Sample23 A 17.57075876
Sample24 B 13.04499079
Sample25 B 15.33577937
Sample26 A 13.95849295
Sample27 B 9.196524783
Sample28 A 18.20524388
Sample29 B 17.7058873
Sample30 B 14.0199393
Sample31 A 16.21499069
Sample32 A 14.171432
Sample33 B 9.196524783
Sample34 B 9.196524783
Sample35 B 15.16648035
Sample36 B 12.9435081
Sample37 B 13.81971106
Sample38 B 15.82901231
I tried to make a violin plot using ggviolin.
library("ggpubr")
pdf("eg.pdf", width = 5, height = 5)
p <- ggviolin(df, x = "Type", y = "GeneA", fill = "Type",
              color = "Type", palette = c("#00AFBB", "#FC4E07"),
              add = "boxplot", add.params = list(fill = "white"),
              order = c("A", "B"),
              ylab = "GeneA (logCPM)", xlab = "Groups")
ggpar(p, ylim = c(5, 25))
dev.off()
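I got a violin plot like this.
1) I don't see any whiskers or points on the violins.
2) Is there a way to show which point is which sample, for example by giving points different colors? (I am interested in Sample10 and would like to give that point a different color so I can see its expression.)
Thanks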
May I suggest using an elephant/raincloud plot or a hybrid boxplot instead?
From the blog post linked above:
Violin plots mirror the data density in a totally uninteresting/uninformative way, simply repeating the same exact information for the sake of visual aesthetic.
In raincloud plot, we get basically everything we need: eyeballed statistical inference, assessment of data distributions (useful to check assumptions), and the raw data itself showing outliers and underlying patterns.
library(tidyverse)
library(ggrepel)
# `txt` is assumed to hold the table from the question as a single string
df <- read_table2(txt)
# create new variables for coloring & labeling the `Sample10` point
df <- df %>%
  mutate(colSel = ifelse(Samples == 'Sample10', '#10', 'dummy'),
         labSel = ifelse(Samples == 'Sample10', '#10', ''))
# create summary statistics per group
sumld <- df %>%
  group_by(Type) %>%
  summarise(
    mean = mean(GeneA, na.rm = TRUE),
    median = median(GeneA, na.rm = TRUE),
    sd = sd(GeneA, na.rm = TRUE),
    N = n(),
    ci = 1.96 * sd/sqrt(N),  # half-width of a normal-approximation 95% CI
    lower95 = mean - ci,
    upper95 = mean + ci,
    lower = mean - sd,
    upper = mean + sd) %>%
  ungroup()
sumld
#> # A tibble: 2 x 10
#>   Type   mean median    sd     N    ci lower95 upper95 lower upper
#>   <chr> <dbl>  <dbl> <dbl> <int> <dbl>   <dbl>   <dbl> <dbl> <dbl>
#> 1 A      14.7   14.5  3.54    17  1.68    13.0    16.3 11.1   18.2
#> 2 B      12.4   12.9  2.85    21  1.22    11.2    13.6  9.54  15.2
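As a quick check on the `ci` column, the half-width for group A can be recomputed by hand from the rounded `sd` and `N` printed above. The sketch below also computes a t-based half-width for comparison. That alternative is not part of the original answer; it is only an assumption about what one might prefer with groups this small. The variable names are illustrative.

```r
# Values for group A, taken from the printed summary table (rounded)
sd_A <- 3.54
n_A  <- 17

# Normal-approximation half-width, as used in the summarise() call
ci_norm <- 1.96 * sd_A / sqrt(n_A)
ci_norm  # about 1.68, matching the `ci` column for group A

# t-based half-width (an alternative, not what the answer uses);
# qt() gives the 97.5% quantile of the t distribution with N - 1 df
ci_t <- qt(0.975, df = n_A - 1) * sd_A / sqrt(n_A)
ci_t     # somewhat wider than ci_norm for a sample of this size
```

Either value can be plugged into `lower95` and `upper95` in the same way. The t-based interval is wider, and the gap shrinks as N grows.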
Raincloud plot
## get geom_flat_violin function
## https://gist.github.com/benmarwick/b7dc863d53e0eabc272f4aad909773d2
## mirror: https://pastebin.com/J9AzSxtF
devtools::source_gist("2a1bb0133ff568cbe28d", filename = "geom_flat_violin.R")
# fixed seed so points and their labels receive the same jitter
pos <- position_jitter(width = 0.15, seed = 1)
p0 <- ggplot(data = df, aes(x = Type, y = GeneA, fill = Type)) +
  geom_flat_violin(position = position_nudge(x = .2, y = 0), alpha = .8) +
  guides(fill = "none", color = "none") +  # hide both legends
  scale_color_brewer(palette = "Dark2") +
  scale_fill_brewer(palette = "Dark2") +
  theme_classic()
# raincloud plot
p1 <- p0 +
  geom_point(aes(color = Type),
             position = pos, size = 3, alpha = 0.8) +
  geom_boxplot(width = .1, show.legend = FALSE, outlier.shape = NA, alpha = 0.5)
p1
# coloring Sample10
p0 +
  geom_point(aes(color = colSel),
             position = pos, size = 3, alpha = 0.8) +
  geom_text_repel(aes(label = labSel),
                  point.padding = 0.25,
                  direction = 'y',
                  position = pos) +
  geom_boxplot(width = .1, show.legend = FALSE, outlier.shape = NA, alpha = 0.5) +
  scale_color_manual(values = c('dummy' = 'grey50', '#10' = 'red'))
# errorbar (mean with 95% CI) instead of boxplot
p0 +
  geom_point(aes(color = colSel),
             position = pos, size = 3, alpha = 0.8) +
  geom_point(data = sumld, aes(x = Type, y = mean),
             position = position_nudge(x = 0.3), size = 3.5) +
  geom_text_repel(aes(label = labSel),
                  point.padding = 0.25,
                  direction = 'y',
                  position = pos) +
  geom_errorbar(data = sumld, aes(ymin = lower95, ymax = upper95, y = mean),
                position = position_nudge(x = 0.3), width = 0) +
  guides(fill = "none", color = "none") +
  scale_color_manual(values = c('dummy' = 'grey50', '#10' = 'red')) +
  scale_fill_brewer(palette = "Dark2") +
  theme_classic()
Hybrid boxplot using geom_boxjitter() from the ggpol package
library(ggpol)
half_box <- ggplot(df) +
  geom_boxjitter(aes(x = Type, y = GeneA, fill = Type, color = Type),
                 jitter.shape = 21, jitter.color = NA,
                 jitter.height = 0, jitter.width = 0.04,
                 outlier.color = NA, errorbar.draw = TRUE) +
  scale_color_brewer(palette = "Dark2") +
  scale_fill_brewer(palette = "Dark2") +
  theme_classic()
half_box
Bonus: you can also replace geom_point() with geom_quasirandom() from the ggbeeswarm package. Here is an example.
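A minimal sketch of that swap, assuming the ggplot2 and ggbeeswarm packages are installed. The small data frame `df_small` is a hypothetical stand-in built from a few rows of the question's table so the snippet runs on its own. The `width`, `size` and `alpha` values are illustrative choices, not taken from the original answer.

```r
library(ggplot2)
library(ggbeeswarm)

# A few rows copied from the question's table (hypothetical subset)
df_small <- data.frame(
  Samples = c("Sample1", "Sample2", "Sample4", "Sample5", "Sample9", "Sample10"),
  Type    = c("B", "B", "A", "A", "A", "B"),
  GeneA   = c(14.82995162, 12.90512275, 19.42866012,
              19.70386922, 9.345795955, 9.196524783)
)

# geom_quasirandom() offsets points with a quasirandom sequence scaled by
# the local density, rather than the uniform random offsets of jitter
p_bee <- ggplot(df_small, aes(x = Type, y = GeneA, color = Type)) +
  geom_quasirandom(width = 0.15, size = 3, alpha = 0.8) +
  scale_color_brewer(palette = "Dark2") +
  theme_classic()
p_bee
```

In the raincloud code above, the same layer would take the place of the geom_point(..., position = pos) call. Note that labels added with geom_text_repel(position = pos) would then no longer share the points' offsets.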
Created on 2018-10-03 by the reprex package (v0.2.1.9000)