How do I combine multiple data sources in ggplot using split and sapply?

This question follows on from a previous question answered by @Rui Barradas and @Duck, but I need some more help. The previous question is linked here:

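Basically, I need to combine 3 datasets into one plot with a secondary y axis. All datasets need to be split by SITENAME and facet-wrapped by Sampling.Year. I am using split and sapply. The facet-wrapped plot looks like this: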

However, I am now trying to add the other two data sources to the plot, so that it looks something like this:

But I am struggling to add the other two data sources and have them split by SITENAME. Here is my code so far...

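The plot format, written as a function to be applied to the split list (ideally 'df' would be added as a geom_line on a secondary y axis, and 'FF_start_dates' would be added as vertical dashed lines):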

SITENAME_plot <- function(AllDates_TPAF){
  ggplot(AllDates_TPAF, aes(DATE, Daily.Ave.PAF)) +
    geom_point(aes(colour = Risk), size = 3) +
    scale_colour_manual(values = c("Very Low" = "dark green", "Low" = "light green",
                                   "Moderate" = "yellow", "High" = "orange", "Very High" = "red"),
                        drop = FALSE) +
    labs(x = "Month", y = "Total PAF (% affected)") +
    scale_x_date(date_breaks = "1 month", labels = scales::date_format("%B")) +
    facet_wrap(~Sampling.Year, ncol = 1, scales = "free") +
    scale_y_continuous(limits = c(0, 100), sec.axis = sec_axis(~., name = "Water level (m)")) +
    # theme_bw() comes first: a complete theme added later would reset the theme() tweaks
    theme_bw() +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
          legend.text = element_text(size = 15),
          axis.text = element_text(size = 15),
          axis.title = element_text(size = 15, face = "bold")) +
    guides(color = guide_legend(reverse = TRUE)) +
    ggtitle(unique(AllDates_TPAF$SITENAME))
}

The plot-writing function:

SITENAME_plot_write <- function(name, g, dir = "N:/abc/"){
  # Use a .png extension so the file name matches the png() device below
  flname <- file.path(dir, paste0(name, ".png"))
  png(filename = flname, width = 1500, height = 1000)
  print(g)
  dev.off()
  flname
}

Applying the functions to the list split by SITENAME:

sp1 <- split(AllDates_TPAF, AllDates_TPAF$SITENAME)
gg_list <- sapply(sp1, SITENAME_plot, simplify = FALSE)
# SITENAME_plot_write() already closes its own device, so no extra dev.off() is needed
mapply(SITENAME_plot_write, names(gg_list), gg_list, MoreArgs = list(dir = getwd()))

I have uploaded a sample of all 3 datasets here: Sample Data

Apologies for not using gsub, but there is too much data and I couldn't get it to work properly.

Thanks in advance for any help you can give, even if it is just pointing me towards some kind of web tutorial.

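You can try the following code. I used the data you shared. Pay attention to the names of all the datasets. Ideally, the key columns DATE and Sampling.Year should be present in every data frame before splitting. Also, the variable Risk is not present in the sample data, so I added an example variable with the same name. Here is the code; I added a function for the plot you want: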

library(tidyverse)
library(readxl)
#Data
df1 <- read_excel('Sample data.xlsx',1)
#Create a placeholder Risk variable (5 * 67 + 1 values, so this assumes df1 has 336 rows)
df1$Risk <- c(rep(c("Very Low","Low","Moderate","High","Very High"),67),"Very High")
#Other data
df2 <- read_excel('Sample data.xlsx',2)
df3 <- read_excel('Sample data.xlsx',3)
#Split 1
L1 <- split(df1,df1$SITENAME)
L2 <- split(df2,df2$SITENAME)
L3 <- split(df3,df3$`Site Name`)
#Function to create plots
myplot <- function(x,y,z)
{
  #Merge x and y
  #Drop duplicated DATE/Sampling.Year rows in y and its SITENAME column to avoid a clash
  y <- y[!duplicated(paste(y$DATE,y$Sampling.Year)),]
  y$SITENAME <- NULL
  xy <- merge(x,y,by = c('Sampling.Year','DATE'),all.x = TRUE)
  #Format to dates
  xy$DATE <- as.Date(xy$DATE)
  #Scale factor (na.rm = TRUE because the left join can leave Height as NA)
  scaleFactor <- max(xy$Daily.Ave.PAF, na.rm = TRUE) / max(xy$Height, na.rm = TRUE)
  #Rename for consistency in names
  names(z)[4] <- 'DATE'
  #Format date
  z$DATE <- as.Date(z$DATE)
  #Plot
  G <- ggplot(xy, aes(DATE, Daily.Ave.PAF)) +
    geom_point(aes(colour = Risk), size = 3) +
    scale_colour_manual(values=c("Very Low" = "dark green","Low" = "light green", 
                                 "Moderate" = "yellow", "High" = "orange", "Very High" = "red"), drop = FALSE) +
    scale_x_date(breaks = "1 month", labels = scales::date_format("%b %Y")) +
    geom_line(aes(x=DATE,y=Height*scaleFactor))+
    scale_y_continuous(name="Total PAF (% affected)", sec.axis=sec_axis(~./scaleFactor, name="Water level (m)"))+
    labs(x = "Month") +
    geom_vline(data = z,aes(xintercept = DATE),linetype="dashed")+
    facet_wrap(~Sampling.Year, ncol = 1, scales = "free")+
    #theme_bw() comes first: a complete theme added later would reset the theme() tweaks
    theme_bw() +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
          legend.text = element_text(size = 15),
          axis.text = element_text(size = 15),
          axis.title = element_text(size = 15, face = "bold")) +
    guides(color = guide_legend(reverse = TRUE)) +
    ggtitle(unique(xy$SITENAME))
  return(G)
}
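#Create a list of plots (x, y and z are paired by position,
#so L1, L2 and L3 must hold the same sites in the same order)
Lplots <- mapply(FUN = myplot, x = L1, y = L2, z = L3, SIMPLIFY = FALSE)
#Now format names
vnames <- paste0(names(Lplots), '.png')
#Save one file per site; constant arguments go in MoreArgs
mapply(ggsave, filename = vnames, plot = Lplots, MoreArgs = list(width = 30, units = 'cm'))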
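
One thing to watch with the mapply call that builds the plots: the three split lists are matched by position, not by site name. If a site is missing from one dataset, the lists can drift out of alignment. A minimal sketch of one way to guard against that, using made-up toy data frames (`a`, `b`, `d` and their values are hypothetical, not your data), is to index every list by the names they have in common before mapping over them:

```r
# Toy data frames standing in for the three real datasets (hypothetical values)
a <- data.frame(SITENAME = c("A", "A", "B", "C"), v = 1:4)
b <- data.frame(SITENAME = c("B", "A", "C"), w = 1:3)
d <- data.frame(SITENAME = c("C", "A"), u = 1:2)  # site "B" is missing here

La <- split(a, a$SITENAME)
Lb <- split(b, b$SITENAME)
Ld <- split(d, d$SITENAME)

# Keep only the sites that are present in all three lists
common <- Reduce(intersect, list(names(La), names(Lb), names(Ld)))

# Index every list by the same names so Map() pairs matching sites
rows_per_site <- Map(
  function(x, y, z) c(x = nrow(x), y = nrow(y), z = nrow(z)),
  La[common], Lb[common], Ld[common]
)
```

With the real data you would pass `L1[common]`, `L2[common]` and `L3[common]` to `myplot` in the same way; sites missing from any dataset are then skipped rather than mismatched.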

You will end up with plots like this saved in your directory:

Some dashed lines do not appear in the plots because the corresponding dates are not present in the data you provided.
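
In case the secondary axis part of the function is unclear, here is the same idea in isolation. ggplot2 only draws one y scale, so the water level is multiplied by a scale factor to plot it in the units of the primary axis, and `sec_axis` divides by the same factor to label the right-hand axis in metres. This is only a sketch on made-up numbers (`toy` and its values are hypothetical):

```r
library(ggplot2)

# Hypothetical toy data: a percentage (0-100) and a water level in metres
toy <- data.frame(
  DATE = as.Date("2020-01-01") + 0:9,
  Daily.Ave.PAF = c(5, 20, 35, 50, 40, 60, 80, 70, 30, 10),
  Height = c(0.2, 0.4, 0.5, 0.9, 1.0, 1.4, 2.0, 1.6, 0.8, 0.3)
)

# Ratio that maps the largest Height onto the largest primary-axis value
scaleFactor <- max(toy$Daily.Ave.PAF, na.rm = TRUE) / max(toy$Height, na.rm = TRUE)

p <- ggplot(toy, aes(DATE, Daily.Ave.PAF)) +
  geom_point() +
  # Draw Height in primary-axis units ...
  geom_line(aes(y = Height * scaleFactor)) +
  # ... and label the secondary axis by undoing the same transformation
  scale_y_continuous(
    name = "Total PAF (% affected)",
    sec.axis = sec_axis(~ . / scaleFactor, name = "Water level (m)")
  )
```

Because the factor is computed inside `myplot`, each site gets its own scaling, so the right-hand axis is not directly comparable between sites unless you fix the factor yourself.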