Adding average lines grouped by column in R
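Hi, I have data like this:
There are 38 columns in total. The Treatment column contains 10 treatment types, and the Date column runs from 25 to 29 (the sample code below has only 2 treatment types, but the real data has 10):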
df <- structure(
  list(
    Christensenellaceae = c(
      0, 0.009910731, 0.010131195, 0.009679938, 0.01147601,
      0.010484508, 0.008641566, 0.010017172, 0.010741488,
      0.1, 0.2, 0.3, 0.4
    ),
    Date = c(25, 25, 25, 25, 25, 27, 27, 27, 27, 27, 27, 27, 27),
    Treatment = c(
      "Original Sample",
      "Original Sample",
      "Original Sample",
      "Original Sample",
      "Original Sample",
      "Treatment 1",
      "Treatment 1",
      "Treatment 1",
      "Treatment 1",
      "Treatment 2",
      "Treatment 2",
      "Treatment 2",
      "Treatment 2"
    )
  ),
  class = "data.frame",
  # 13 rows, matching the length of each column above
  row.names = c(NA, -13L)
)
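What I want is to create 2 plots for each column: one for the original sample and one for all treatment types (1-10 in the real data, 1-2 in this example), and to add a line showing the mean of the observations for each treatment type. The treatment panel should have 10 mean lines in total (2 here). Unfortunately, I can't figure out how to add lines grouped by treatment type.
This is my code, which draws a single line across all treatment types. How can I add lines grouped by treatment type instead?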
df_3 %>%
  pivot_longer(-treatment) %>%
  mutate(plot = ifelse(str_detect(treatment, "Original"),
                       "Original sample",
                       "Treatment"),
         # the backslash must be doubled inside an R string
         treatment = str_extract(treatment, "\\d+$")) %>%
  group_by(name) %>%
  group_split() %>%
  map(~ .x %>%
        ggplot(aes(x = factor(treatment), y = value, color = factor(name))) +
        geom_point() +
        # group = 1 puts every treatment in one group, hence a single line
        stat_summary(aes(y = value, group = 1), fun.y = mean,
                     colour = "red", geom = "line", group = 1) +
        facet_wrap(~plot, scales = "free_x") +
        labs(x = "Treatment", y = "Value", color = "Taxa") +
        guides(x = guide_axis(angle = 90)) +
        theme_bw())
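As you can see, there is only one mean line, whereas I need one per treatment type (10 in total, 2 here). Is there a way to edit my code to make this work? Thanks :)
I also tried this code, but it doesn't seem to work for me: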
df %>%
  pivot_longer(-c(Treatment, Date), names_to = "taxon") %>%
  mutate(
    type = Treatment %>%
      str_detect("Original") %>%
      ifelse("Original", "Treatment"),
    treatment_nr = Treatment %>%
      str_extract("(?<=Treatment )[0-9]+")
  ) %>%
  ggplot(aes(Date, value, color = treatment_nr)) +
  geom_point() +
  stat_summary(geom = "point", fun.y = "mean", size = 3, shape = 24) +
  geom_line() +
  facet_grid(type ~ taxon, scales = "free_y")
#> Warning: `fun.y` is deprecated. Use `fun` instead.
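Your data as posted was not well-formed, and it does not match your original example code (e.g. Treatment rather than treatment). In any case, I'll generate some data here to illustrate a solution based on the data shown in your image.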
library(tidyverse)
set.seed(1)
df <-
  data.frame(
    Christensenellaceae = runif(105),
    treatment = rep(c("Original Sample_25",
                      paste0("Treatment", 1:10, "_", 27),
                      paste0("Treatment", 1:10, "_", 28)),
                    each = 5)
  )
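Because you draw the mean as a line, it is connected across the whole x-axis. I took a fairly lazy approach: compute the mean before plotting and draw it as a segment. Depending on how your ten treatments look, you can change the width of the mean line by changing avg_line_length.
Because the segment has extra x-axis values (e.g. 0.65, 1.35), the x-axis would include those extra values by default. I created labels and breaks to fix this, using the intermediate data frame lab_df. I left the label for the original sample blank. You could use color/linetype to also show the line as 'Mean' in the legend.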
avg_line_length <- 0.35
p <-
  df %>%
  pivot_longer(-treatment) %>%
  mutate(plot = ifelse(str_detect(treatment, "Original"),
                       "Original sample",
                       "Treatment"),
         # first run of digits: the treatment number (25 for the original sample)
         treatment = as.numeric(str_extract(treatment, "\\d+")),
         treatment_label = ifelse(plot %in% "Original sample", "", treatment)) %>%
  # save the intermediate data for the axis breaks and labels
  {. ->> lab_df} %>%
  # group by taxon as well, so means are not pooled across columns
  group_by(name, treatment) %>%
  mutate(avg = mean(value),
         xstart = treatment - avg_line_length,
         xend = treatment + avg_line_length) %>%
  ungroup() %>%
  group_by(name) %>%
  group_split() %>%
  map(~ .x %>%
        ggplot() +
        geom_point(aes(x = treatment, y = value, color = name)) +
        geom_segment(aes(x = xstart, xend = xend, y = avg, yend = avg, color = name)) +
        scale_x_continuous(breaks = lab_df$treatment, labels = lab_df$treatment_label) +
        facet_wrap(~plot, scales = "free_x") +
        labs(x = "Treatment", y = "Value", color = "Taxa") +
        guides(x = guide_axis(angle = 90)) +
        theme_bw())
p
#> [[1]]
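As a rough sketch of the legend suggestion above: mapping linetype to a constant string inside aes() makes ggplot2 create a legend entry for the segments. The toy data frame and the names toy, means and p_legend are made up for illustration and are not part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: two treatments, three observations each
toy <- data.frame(
  treatment = rep(c(1, 2), each = 3),
  value     = c(0.1, 0.2, 0.3, 0.5, 0.6, 0.7)
)

avg_line_length <- 0.35

# One row per treatment with its mean and the segment end points
means <- toy %>%
  group_by(treatment) %>%
  summarise(avg = mean(value), .groups = "drop") %>%
  mutate(xstart = treatment - avg_line_length,
         xend   = treatment + avg_line_length)

p_legend <- ggplot(toy, aes(x = treatment, y = value)) +
  geom_point() +
  # linetype is mapped to the constant "Mean", so it gets a legend entry
  geom_segment(data = means,
               aes(x = xstart, xend = xend, y = avg, yend = avg,
                   linetype = "Mean"),
               colour = "red") +
  scale_x_continuous(breaks = means$treatment) +
  labs(x = "Treatment", y = "Value", linetype = NULL) +
  theme_bw()
```

Summarising to one row per treatment, rather than repeating the mean on every observation row, also means each segment is drawn only once.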
If you don't want a mean line for the original sample, just add an extra ifelse.
p2 <-
  df %>%
  pivot_longer(-treatment) %>%
  mutate(plot = ifelse(str_detect(treatment, "Original"),
                       "Original sample",
                       "Treatment"),
         treatment = as.numeric(str_extract(treatment, "\\d+")),
         treatment_label = ifelse(plot %in% "Original sample", "", treatment)) %>%
  {. ->> lab_df} %>%
  # group by taxon as well, so means are not pooled across columns
  group_by(name, treatment) %>%
  # NA for the original sample, so no segment is drawn there
  mutate(avg = ifelse(plot %in% "Original sample", NA, mean(value)),
         xstart = treatment - avg_line_length,
         xend = treatment + avg_line_length) %>%
  ungroup() %>%
  group_by(name) %>%
  group_split() %>%
  map(~ .x %>%
        ggplot() +
        geom_point(aes(x = treatment, y = value, color = factor(name))) +
        geom_segment(aes(x = xstart, xend = xend, y = avg, yend = avg), colour = "red") +
        scale_x_continuous(breaks = lab_df$treatment, labels = lab_df$treatment_label) +
        facet_wrap(~plot, scales = "free_x") +
        labs(x = "Treatment", y = "Value", color = "Taxa") +
        guides(x = guide_axis(angle = 90)) +
        theme_bw())
p2
#> [[1]]
#> Warning: Removed 5 rows containing missing values (geom_segment).
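An alternative that avoids pre-computing the means, offered as a sketch rather than as part of the original answer: stat_summary() can draw one short horizontal bar per x value when fun, fun.min and fun.max are all set to mean and the geom is "errorbar". This assumes ggplot2 3.3.0 or later (where fun replaced fun.y) and a discrete x axis. The toy data and the name p_bar are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data: two treatments, three observations each
toy <- data.frame(
  treatment = factor(rep(c("1", "2"), each = 3)),
  value     = c(0.1, 0.2, 0.3, 0.5, 0.6, 0.7)
)

p_bar <- ggplot(toy, aes(x = treatment, y = value)) +
  geom_point() +
  # ymin and ymax both equal the mean, so the error bar collapses
  # to a single horizontal line per treatment
  stat_summary(fun = mean, fun.min = mean, fun.max = mean,
               geom = "errorbar", width = 0.7, colour = "red") +
  labs(x = "Treatment", y = "Value") +
  theme_bw()
```

Because x is a factor here, ggplot2 summarises each treatment separately without an explicit group aesthetic, and no custom breaks or labels are needed.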
It's a bit messy, but hopefully it solves your problem.