Plotting the average of multiple time series objects and illustrating the error from that plot

Consider the dat created here:

set.seed(123)
ID = factor(letters[seq(6)])
time = c(100, 102, 120, 105, 109, 130)
dat <- data.frame(ID = rep(ID,time), Time = sequence(time))
dat$group <- rep(c("GroupA","GroupB"), c(322,344))
dat$values <- sample(100, nrow(dat), TRUE)

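dat contains time series data for 6 individuals (6 IDs) belonging to 2 groups (GroupA and GroupB). Suppose we expect the time series within each group to have similar properties. Also note that the length of the time series differs between individuals. We basically want to create an "average" time series plot for each group, which I did like this: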

library(dplyr)
library(ggplot2)
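dat %>% 
  group_by(ID) %>%
  mutate(maxtime = max(Time)) %>%      # length of each individual's series
  group_by(group) %>%
  mutate(maxtime = min(maxtime)) %>%   # shortest series within each group
  filter(Time <= maxtime) %>%          # keep only time points covered by every ID in the group
  group_by(group, Time) %>%
  summarize(values = mean(values)) %>%
  ggplot(aes(Time, values, colour = group))+ 
  geom_line()+
  facet_wrap(.~group)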
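
The cut-off for each group's average is meant to be the length of its shortest series. As a quick check, those lengths can be read off dat directly. This is a minimal base R sketch; the names lens and shortest are introduced here for illustration only.

```r
# Recreate the example data so the check is self-contained
set.seed(123)
ID <- factor(letters[seq(6)])
time <- c(100, 102, 120, 105, 109, 130)
dat <- data.frame(ID = rep(ID, time), Time = sequence(time))
dat$group <- rep(c("GroupA", "GroupB"), c(322, 344))
dat$values <- sample(100, nrow(dat), TRUE)

# Length of each individual's series, together with its group
lens <- aggregate(Time ~ group + ID, data = dat, FUN = max)

# Shortest series per group: 100 for GroupA, 105 for GroupB
shortest <- aggregate(Time ~ group, data = lens, FUN = min)
shortest
```

Beyond these time points, fewer than three individuals contribute to each group mean.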

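How can we do the same thing, but add the raw series of each individual behind the "average" line, to illustrate the error associated with each "average"? Note that I built the "average" plot using the length of the ID with the shortest time series in each group, but when adding the raw series I would like to see them in full if possible (so some will be longer than others).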

Maybe you are looking for a combined plot like this:

library(dplyr)
library(ggplot2)
library(patchwork)
# Mean per group and time point, truncated to the shortest series in each group
G1 <- dat %>% 
  group_by(ID) %>%
  mutate(maxtime = max(Time)) %>%
  group_by(group) %>%
  mutate(maxtime = min(maxtime)) %>%
  filter(Time <= maxtime) %>%
  group_by(group, Time) %>%
  summarize(values = mean(values)) %>%
  ggplot(aes(Time, values, colour = group))+ 
  geom_line()+
  facet_wrap(.~group)+
  ylab('Mean')
# Raw values at full length; group = ID draws one line per individual
G2 <- dat %>% 
  ggplot(aes(Time, values, colour = group, group = ID))+ 
  geom_line()+
  facet_wrap(.~group)+
  ylab('Real Values')
# Compose plots: raw values on top, means below, with a shared legend
G3 <- G2/G1+plot_layout(guides = "collect")
G3

With a second geom_line, you can draw the "raw" data in the background, e.g. as grey lines.

set.seed(123)
ID = factor(letters[seq(6)])
time = c(100, 102, 120, 105, 109, 130)
dat <- data.frame(ID = rep(ID,time), Time = sequence(time))
dat$group <- rep(c("GroupA","GroupB"), c(322,344))
dat$values <- sample(100, nrow(dat), TRUE)

library(dplyr)
library(ggplot2)
# Group means, truncated to the shortest series in each group
d <- dat %>% 
  group_by(ID) %>%
  mutate(maxtime = max(Time)) %>%
  group_by(group) %>%
  mutate(maxtime = min(maxtime)) %>%
  filter(Time <= maxtime) %>%
  group_by(group, Time) %>%
  summarize(values = mean(values))
#> `summarise()` regrouping output by 'group' (override with `.groups` argument)

ggplot()+ 
  geom_line(data = dat, aes(Time, values, group = ID), color = "grey80", alpha = .7) +
  geom_line(data = d, aes(Time, values, colour = group)) +
  facet_wrap(.~group)
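
Another way to illustrate the error, not taken from the answers above, is to draw a band of one standard deviation around each group mean. The following is only a sketch under that assumption: the names d_err, avg, sdev, n and p_err are made up for illustration, and the mean is truncated to the shortest series in each group so that every time point averages the same number of individuals.

```r
library(dplyr)
library(ggplot2)

# Recreate the example data so the sketch is self-contained
set.seed(123)
ID <- factor(letters[seq(6)])
time <- c(100, 102, 120, 105, 109, 130)
dat <- data.frame(ID = rep(ID, time), Time = sequence(time))
dat$group <- rep(c("GroupA", "GroupB"), c(322, 344))
dat$values <- sample(100, nrow(dat), TRUE)

# Mean and standard deviation per group and time point,
# restricted to the time points covered by every ID in the group
d_err <- dat %>%
  group_by(ID) %>%
  mutate(maxtime = max(Time)) %>%
  group_by(group) %>%
  filter(Time <= min(maxtime)) %>%
  group_by(group, Time) %>%
  summarise(avg = mean(values), sdev = sd(values), n = n(), .groups = "drop")

# Shaded band of mean +/- 1 SD behind the mean line, one panel per group
p_err <- ggplot(d_err, aes(Time, avg)) +
  geom_ribbon(aes(ymin = avg - sdev, ymax = avg + sdev, fill = group), alpha = 0.3) +
  geom_line(aes(colour = group)) +
  facet_wrap(. ~ group) +
  ylab("Mean +/- SD")
p_err
```

With only three individuals per group the standard deviation is a rough estimate, so the grey raw lines in the plot above may be the more honest picture of the spread.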