How to add the number of valid observations of each group at each timepoint to my line chart
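I have the evolution of the mean values for two groups. Because the number of valid observations changes from one time point to the next, I would like to add the number of valid values for each group at each time point to the graph. The aim is to let the reader see that the means over time are not calculated on the same number of individuals.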
library(dplyr)
library(tidyr)    # pivot_longer()
library(ggplot2)

mydata <- data.frame(
  ID = 1:10,
  groupe = c(rep("A", 5), rep("B", 5)),
  value1 = c(50, 49, 47, 46, 44, 39, 37, 36, 30, 30),
  value2 = c(43, 40, 42, 36, 25, 37, 36, 35, 30, 28),
  value3 = c(32, 30, 38, 32, NA, 34, 36, 32, 27, NA),
  value4 = c(24, 25, 30, NA, NA, 30, 32, 28, NA, 28),
  value5 = c(24, 22, NA, NA, NA, 25, 27, NA, NA, NA)
)

# Mean of each value column per group, ignoring missing values
mydata2 <- mydata %>%
  group_by(groupe) %>%
  summarise(mean_value1 = mean(value1),
            mean_value2 = mean(value2),
            mean_value3 = mean(value3, na.rm = TRUE),
            mean_value4 = mean(value4, na.rm = TRUE),
            mean_value5 = mean(value5, na.rm = TRUE))

# Long format: one row per group and time point
mydata2Lg <- mydata2 %>%
  pivot_longer(cols = mean_value1:mean_value5,
               names_to = "time", values_to = "mean",
               names_prefix = "mean_value")

mydata2Lg$groupe <- as.factor(mydata2Lg$groupe)

ggplot(mydata2Lg, aes(x = time, y = mean, group = groupe, color = groupe)) +
  geom_line(aes(linetype = groupe), size = 1) +
  geom_point(aes(shape = groupe))
Sorry if I have not explained clearly what I want. I hope you see what I mean.
[Figure: the graph, not reproduced here]
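The typical way to show the uncertainty caused by differing sample sizes is to use error bars or a ribbon to indicate the standard error. This gives a good visual intuition for the uncertainty introduced by both the spread of the data and the sample size. However, you can also add count labels. You just need to summarise your data appropriately.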
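As a minimal sketch of that summary step on its own, assuming the same toy data as in the question: reshape to long format first, then count the non-missing values per group and time point. The names `counts`, `n_valid` and `mean_value` are illustrative only and do not come from the question.

library(dplyr)
library(tidyr)

# Same toy data as in the question
mydata <- data.frame(
  ID = 1:10,
  groupe = c(rep("A", 5), rep("B", 5)),
  value1 = c(50, 49, 47, 46, 44, 39, 37, 36, 30, 30),
  value2 = c(43, 40, 42, 36, 25, 37, 36, 35, 30, 28),
  value3 = c(32, 30, 38, 32, NA, 34, 36, 32, 27, NA),
  value4 = c(24, 25, 30, NA, NA, 30, 32, 28, NA, 28),
  value5 = c(24, 22, NA, NA, NA, 25, 27, NA, NA, NA)
)

# One row per group and time point, with the number of non-missing values
counts <- mydata %>%
  pivot_longer(value1:value5, names_to = "time", names_prefix = "value") %>%
  group_by(groupe, time) %>%
  summarise(n_valid = sum(!is.na(value)),          # valid observations
            mean_value = mean(value, na.rm = TRUE),
            .groups = "drop")

counts

Because the counts sit in the same table as the means, they can be mapped straight to a label aesthetic in the plot.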
For completeness, here is your data with a standard error ribbon and labels for the number of samples at each time point:
library(tidyverse)

mydata %>%
  pivot_longer(value1:value5) %>%
  group_by(groupe, name) %>%
  # Valid observations, mean and standard deviation per group and time point
  summarize(count = sum(!is.na(value)),
            mean = mean(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE),
            .groups = "drop") %>%
  # Extract the time index from the column name; standard error = sd / sqrt(n)
  mutate(time = as.numeric(gsub("\\D", "", name)),
         upper = mean + sd / sqrt(count),
         lower = mean - sd / sqrt(count)) %>%
  ggplot(aes(time, mean, color = groupe)) +
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = groupe),
              color = NA, alpha = 0.2) +
  geom_point() +
  geom_line() +
  # Nudge labels up for group A and down for group B so they do not overlap
  geom_label(aes(label = paste0("n = ", count),
                 y = mean + ifelse(groupe == "A", 1, -1)),
             key_glyph = draw_key_blank) +
  scale_color_manual(values = c("orangered3", "deepskyblue4")) +
  scale_fill_manual(values = c("orangered3", "deepskyblue4")) +
  labs(title = 'Mean values for each group over time \u00B1 standard error',
       subtitle = expression(italic("Labels show sample size at each point"))) +
  theme_light(base_size = 16)
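To make the ribbon arithmetic concrete, here is a small hand check for a single cell, group A at time point 3, using base R only. The variable names are illustrative; the values are copied from the question's data.

# value3 for group A (IDs 1 to 5); the last observation is missing
x <- c(32, 30, 38, 32, NA)

n  <- sum(!is.na(x))                  # number of valid observations
m  <- mean(x, na.rm = TRUE)           # group mean
se <- sd(x, na.rm = TRUE) / sqrt(n)   # standard error of the mean

# Ribbon limits for this point
c(n = n, mean = m, lower = m - se, upper = m + se)

Note that sd() returns NA when only one valid value remains, so a point with n = 1 would have no ribbon; that case does not occur in this data.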