ggplot2 - Make continuous plots with data that have missing values

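Suppose I have several data.frames, each with 2 columns: month (from 1 to 12) and var, which can be any random variable.

I want to plot all months (January through December) on the x-axis.

The problem is that some data.frames do not have observations for every month: some are complete, some have gaps, and some are truncated.

How can I plot these data so that all months are shown?

Here is a code example: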

####
library(ggplot2)

set.seed(69)

### Create sample data
df_1 = data.frame(month = c(1:5), var = rnorm(5)) # 7 months are missing
df_2 = data.frame(month = c(1:12), var = rnorm(12)) # year is complete with 12 months
df_3 = data.frame(month = c(1:3, 8:12), var = rnorm(8)) # gap of 4 months
df_4 = data.frame(month = c(1:2, 5, 10:12), var = rnorm(6)) # gaps of 2 and 4 months


## Make list of data
df_lst = list(df_1, df_2, df_3, df_4)

### Plot
plot_lst = list()

for (i in seq_along(df_lst)) {
    plot_lst[[i]] = ggplot(data=df_lst[[i]], aes(x=month, y=var)) +
        geom_line(size=2) +
        scale_x_discrete(limits=c("Jan","Feb","Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")) +
        labs(title = '') +
        xlab('Months') +
        ylab('Var')
}

p_grid = cowplot::plot_grid(plotlist = plot_lst, ncol = 1)
print(cowplot::plot_grid(p_grid,
                         ncol = 1, rel_heights = c(1, 0.05)))

####

Result:

Any suggestions?

The simplest solution, with no need to reshape the data:

ggplot(df_4, aes(as.factor(month), var)) + geom_col() +
    scale_x_discrete(limits = as.character(1:12))

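If you want a line plot instead (still fairly simple):

# fun.y was deprecated in ggplot2 3.3.0; fun is its replacement
ggplot(df_4, aes(as.factor(month), var, group = 1)) +
    geom_point(stat = "summary", fun = sum, size = 3) +
    stat_summary(fun = sum, geom = "line") +
    scale_x_discrete(limits = as.character(1:12))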
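
A variant of the same idea, offered only as a sketch: keep month numeric and fix a continuous axis to 1..12, so no conversion to a factor is needed. It assumes month abbreviations are wanted as tick labels (taken from R's built-in month.abb) and reuses df_4 from the question.

```r
library(ggplot2)

set.seed(69)
df_4 <- data.frame(month = c(1:2, 5, 10:12), var = rnorm(6))

# Keep month numeric; pin the axis to 1..12 and label the ticks Jan..Dec
p <- ggplot(df_4, aes(month, var)) +
    geom_line() +
    geom_point(size = 3) +
    scale_x_continuous(breaks = 1:12, labels = month.abb, limits = c(1, 12)) +
    xlab("Months") +
    ylab("Var")
print(p)
```

As with the factor version, the line is drawn straight across the missing months, so the gaps are visible only through the absent points.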

Otherwise you will have to either

  • impute the data, or
  • fill in the missing months as you see fit, similar to this answer
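
The two options above can be sketched as follows. This is one possible approach, not necessarily what the linked answer does: it assumes missing months should be filled with NA so that the line breaks at each gap, and it shows linear interpolation with base R's approx() as one simple form of imputation. The names full and var_interp are made up for this illustration.

```r
library(ggplot2)

set.seed(69)
df_4 <- data.frame(month = c(1:2, 5, 10:12), var = rnorm(6))

# Fill: left-join onto a full 1..12 grid; absent months get var = NA
full <- merge(data.frame(month = 1:12), df_4, by = "month", all.x = TRUE)

# Impute (alternative): linear interpolation between observed months
full$var_interp <- approx(df_4$month, df_4$var, xout = 1:12)$y

# geom_line() breaks the line at NA values, so the gaps stay visible;
# na.rm = TRUE only silences the missing-value warning
p <- ggplot(full, aes(month, var)) +
    geom_line(na.rm = TRUE) +
    geom_point(size = 3, na.rm = TRUE) +
    scale_x_continuous(breaks = 1:12, labels = month.abb, limits = c(1, 12)) +
    xlab("Months") +
    ylab("Var")
print(p)
```

Plotting var_interp instead of var gives an unbroken line. Note that approx() does not extrapolate by default, so a truncated frame such as df_1 would still have NA after its last observed month.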