Creating qqPlots based off of two lists in ggplot

Question

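I have two lists, each containing four data frames. The data frames in the first list ("loc_list_OBS") have only two columns, "Year" and "Mean_Precip", while the data frames in the second list ("loc_list_Model") have 33 columns: a "Year" column followed by the mean precipitation values for 32 different models.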

So the data frames in loc_list_OBS look like this, with the data continuing through 2005:

Year     Mean_Precip
1965    799.1309
1966    748.0239
1967    619.7572
1968    799.9263
1969    680.9194
1970    766.2304
1971    599.5365
1972    717.8912
1973    739.4901
1974    707.1130
...     ....
2005    ....

The data frames in loc_list_Model look like this, but with 32 model columns in total and the data also running through 2005:

Year   Model 1      Model 2      Model 3    ...... Model 32
1965    714.1101    686.5888    1048.4274
1966    1018.0095    766.9161     514.2700
1967    756.7066    902.2542     906.2877
1968    906.9675    919.5234     647.6630
1969    767.4008    861.1275     700.2612
1970    876.1538    738.8370     664.3342
1971    781.5092    801.2387     743.8965
1972    876.3522    819.4323     675.3022
1973    626.9468    927.0774     696.1884
1974    752.4084    824.7682     835.1566
....    .....       .....         .....
2005    .....       .....         .....

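Each data frame represents a geographic location. Both lists cover the same four locations; one list holds the observed values and the other holds the model values for the same time period.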

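I want to create qqplots that compare the quantiles of the observed values with the quantiles of each model, for each location. I would also like all the qqplots for a location on one pdf. I have already written code that compares the modelled data to a standard normal distribution and creates the four pdfs described above. That code is: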

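library(tidyr)
library(ggplot2)

# Loop over the location names so each title and file name gets its location
# (names(q) on a data frame would return its column names instead)
for (loc in names(loc_list)) {
  # Reshape to long format: one row per Year/model pair
  qq_combine_plot <- gather(loc_list[[loc]], condition, measurement, 2:33, 
                            factor_key = TRUE)
  p <- ggplot(qq_combine_plot, aes(sample = measurement)) +
    facet_wrap(~ condition, scales = "free") +
    stat_qq() +
    stat_qq_line() +
    ggtitle(paste("qqplot for Mean Yearly Precip \n NE 2020-59 RCP45", loc)) +
    theme(plot.title = element_text(hjust = 0.5)) +
    labs(y = "Mean Yearly Precip (mm)")
  ggsave(filename = paste0("qq_NE_59_s45_", loc, ".pdf"), plot = p,
         device = "pdf", height = 14, width = 14)
}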

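I was able to create qqplots that compare the quantiles from the two lists above, but I don't know how to do this with ggplot and still get the same pdf output, where the plots are combined and carry the appropriate model titles. The code I used for this is: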

myfun <- function(x, y) {
  OBS_Data <- x$Mean_Precip
  # Loop over the model columns (column 1 is Year)
  for (i in 2:dim(y)[2]) {
    Model_Data <- y[, i]
    qqplot(x = OBS_Data, y = Model_Data,
           ylab = "Model Quantile Values",
           xlab = "Observed Quantile Values")
  }
}

t.stat <- mapply(FUN = myfun,x=loc_list_OBS,y=loc_list_Model,SIMPLIFY = FALSE)

Can anyone help me with this?

Answer 1

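If I understood correctly, you want to compare the data from the first list with the data from the second list, and then build a ggplot2 plot similar to qqplot() for all the models. You then want a separate plot for each location and to save those plots (with 4 locations, you should end up with four pages in the pdf). In that case I would suggest the following approach using a loop. The steps you included are useful. In order to compare the two data frames you have to join them after the gather() step. The qqplot() values can be computed directly, and I included that in the code. This solution uses tidyverse functions, so check that you have it installed. The final output will be a pdf, but I also create a list (List) where the plots are stored before being printed. Here is the code, using dummy lists (built from df1 and df2, which are at the end of the post) based on what you shared: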

library(tidyverse)
#Code for data
#Data 1
List1 <- list(u1=df1,u2=df1,u3=df1,u4=df1)
#Data 2
List2 <- list(u1=df2,u2=df2,u3=df2,u4=df2)

Now the setup to reach the desired output:

#Create an empty list to save the plots
List <- list()
#Loop over the locations; List1 and List2 have the same length and order
for(i in seq_along(List1))
{
  x <- List1[[i]]
  y <- List2[[i]]
  #Location name, used in the plot title
  textchain <- names(List1[i])
  #First reshape the model data to long format
  qq_combine_plot <- gather(y, condition, measurement, 2:dim(y)[2], 
                            factor_key = TRUE)
  #Now merge with the observed values (Mean_Precip) by Year
  qqmer <- qq_combine_plot %>% left_join(x, by = "Year")
  #Now compute the qqplot values for each model
  r1 <- qqmer %>%
    group_by(condition) %>% 
    nest() %>% 
    mutate(qq = map(.x = data, ~as.data.frame(qqplot(x = .$Mean_Precip,
                                                     y = .$measurement, plot.it = FALSE)))) %>% 
    unnest(qq) 
  #Prepare plot
  G <- r1 %>%
    ggplot(aes(x = x, y = y)) + 
    geom_point() +
    facet_wrap(~condition,scales = 'free')+
    theme_bw()+theme(panel.grid = element_blank())+
    ylab("Model Quantile Values")+xlab("Observed Quantile Values")+
    ggtitle(paste0("qqplot for Mean Yearly Precip and modelled values between ",textchain," data"))
  #Assign to list
  List[[i]] <- G
}

The loop takes the data from the two lists, repeats the steps to build each plot, and stores the plots in List.
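To make the qqplot() step inside the loop more concrete, here is a minimal base R sketch using the first ten years of Mean_Precip and Model.1 from the dummy data at the end of this post. The names obs, mod and qq are made up for this illustration. With plot.it = FALSE, qqplot() draws nothing and returns a list with components x and y, which as.data.frame() turns into the coordinates that ggplot later plots:

```r
# Observed values and one model column, copied from df1 and df2 (1965-1974)
obs <- c(799.1309, 748.0239, 619.7572, 799.9263, 680.9194,
         766.2304, 599.5365, 717.8912, 739.4901, 707.1130)
mod <- c(714.1101, 1018.0095, 756.7066, 906.9675, 767.4008,
         876.1538, 781.5092, 876.3522, 626.9468, 752.4084)

# plot.it = FALSE returns the quantile pairs instead of drawing them
qq <- as.data.frame(qqplot(x = obs, y = mod, plot.it = FALSE))

# For inputs of equal length, the columns are simply both vectors sorted
head(qq, 3)
```

Because both series cover the same years, the x and y columns have the same length and no interpolation is involved; qqplot() only interpolates when the two inputs differ in length.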

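Finally, we use another loop to print the plots to a pdf. The title of each plot shows the location, taken from the names of your list. In this case I set the dummy names u1,...,u4: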

#Export to pdf
pdf('Example.pdf',width = 14)
for(i in c(1:length(List)))
{
  plot(List[[i]])
}
dev.off()
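If you would rather have one pdf per location, as in the question, a possible variation is to call ggsave() once per plot. The sketch below is only an illustration under some assumptions: plot_list is a hypothetical named stand-in for the List built in the loop (which is filled by position, so in practice the names would come from names(List1)), the "qq_" file prefix is made up, and tempdir() is a placeholder for your own output folder.

```r
library(ggplot2)

# Hypothetical stand-in for the list of plots built in the loop
dummy <- data.frame(x = 1:5, y = c(2, 4, 5, 7, 9))
plot_list <- list(
  u1 = ggplot(dummy, aes(x = x, y = y)) + geom_point() + ggtitle("u1"),
  u2 = ggplot(dummy, aes(x = x, y = y)) + geom_point() + ggtitle("u2")
)

# Assumption: replace tempdir() with the folder where you want the files
out_dir <- tempdir()
out_files <- file.path(out_dir, paste0("qq_", names(plot_list), ".pdf"))

# Write one pdf per location, named after the list element
for (i in seq_along(plot_list)) {
  ggsave(filename = out_files[i], plot = plot_list[[i]],
         device = "pdf", width = 14, height = 14)
}
```

Passing the plot explicitly through the plot argument avoids relying on whichever plot happened to be created last.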

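The final output will be a pdf written to the working directory (or to whatever path you pass to pdf()). Also note facet_wrap(): you can adjust the number of rows and columns of panels in the plot with its nrow and ncol arguments.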
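As a small sketch of the facet_wrap() layout arguments, the example below uses a made-up long-format data frame (toy, with three hypothetical model labels) rather than the real data, and stacks the panels in a single column:

```r
library(ggplot2)

# Hypothetical long-format data: three "models" with five points each
toy <- data.frame(
  condition = rep(c("Model.1", "Model.2", "Model.3"), each = 5),
  x = rep(1:5, times = 3),
  y = c(1:5, (1:5) * 2, (1:5) + 3)
)

# ncol = 1 stacks the three panels vertically; nrow limits rows the same way
p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~condition, scales = "free", ncol = 1)
```

With 32 models, something like ncol = 8 would give a 4 x 8 grid; the best value depends on the page size you pass to pdf().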

Some data used:

#Data 1
df1 <- structure(list(Year = c(1965L, 1966L, 1967L, 1968L, 1969L, 1970L, 
1971L, 1972L, 1973L, 1974L, 2005L), Mean_Precip = c(799.1309, 
748.0239, 619.7572, 799.9263, 680.9194, 766.2304, 599.5365, 717.8912, 
739.4901, 707.113, 707.113)), class = "data.frame", row.names = c(NA, 
-11L))

#Data 2
df2 <- structure(list(Year = c(1965L, 1966L, 1967L, 1968L, 1969L, 1970L, 
1971L, 1972L, 1973L, 1974L, 2005L), Model.1 = c(714.1101, 1018.0095, 
756.7066, 906.9675, 767.4008, 876.1538, 781.5092, 876.3522, 626.9468, 
752.4084, 752.4084), Model.2 = c(686.5888, 766.9161, 902.2542, 
919.5234, 861.1275, 738.837, 801.2387, 819.4323, 927.0774, 824.7682, 
824.7682), Model.3 = c(1048.4274, 514.27, 906.2877, 647.663, 
700.2612, 664.3342, 743.8965, 675.3022, 696.1884, 835.1566, 835.1566
)), class = "data.frame", row.names = c(NA, -11L))

statistics

r

ggplot2

quantile