Yearly comparison timeseries ggplot2 R

My df:

> head(merged)
        Date patch     prod workmix_pct jobcounts travel FWIHweeklyAvg              month year
1 2013-03-29  BVG1 2.932208         100      9480   30.7      1.627024              March 2013
2 2013-03-29 BVG11 2.769156          10       968   34.3      4.475714              March 2013
3 2013-03-29 BVG12 2.857344          16      1551   33.8      3.098571              March 2013
4 2013-03-29 BVG13 2.870111          13      1267   29.1      1.361429              March 2013
5 2013-03-29 BVG14 3.011260          17      1625   28.1      1.550000              March 2013
6 2013-03-29 BVG15 3.236246          21      1946   24.9      1.392857              March 2013

I am trying to plot a yearly comparison of the prod column. I have data from March 2013 to March 2015.

This is what I have tried:

library(ggplot2)
library(scales)  # provides date_format()

ggplot(data = merged, aes(Date, prod)) + # dataframe
  geom_line(data = merged[merged$patch %in% c("BVG1"), ], aes(y = prod, colour = "red"), lwd = 1.3) + # select BVG1
  geom_smooth() +
  scale_x_date(labels = date_format("%b-%Y"), breaks = "1 month") + # how many breaks and Date format
  ylab("Actual Prod") +
  ggtitle("Scotland's Overall Performance Financial Year\n2013/14 Vs 2014/15") +
  theme(axis.title.y = element_text(size = 25, vjust = 0.3, face = "bold", color = "red"),
        axis.text.y = element_text(size = 25, color = "blue"),
        plot.title = element_text(lineheight = .8, face = "bold", color = "red", size = 45, vjust = 1),
        legend.text = element_text(size = 35)) +
  theme(legend.position = "none")

This gives me this plot:

Now I want to plot 2013 vs 2014, then 2014 vs 2015, and finally 2013 vs 2015.

This is what I have tried:

ggplot(data = merged, aes(Date)) + # dataframe
  geom_line(data = merged[merged$year == 2013, ], aes(y = prod, colour = "red"), lwd = 1.3) + # select 2013
  geom_line(data = merged[merged$year == 2014, ], aes(y = prod, colour = "blue"), lwd = 1.3) + # select 2014
  scale_x_date(labels = date_format("%b-%Y"), breaks = "1 month") + # how many breaks and Date format
  ylab("Actual Prod") +
  ggtitle("Scotland's Overall Performance Financial Year\n2013/14 Vs 2014/15") +
  theme(axis.title.y = element_text(size = 25, vjust = 0.3, face = "bold", color = "red"),
        axis.text.y = element_text(size = 25, color = "blue"),
        plot.title = element_text(lineheight = .8, face = "bold", color = "red", size = 45, vjust = 1),
        legend.text = element_text(size = 35)) +
  theme(legend.position = "none")

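This is what I get:

It would be good to have something like the following:

and:

but in a monthly view rather than a weekly view.

Any help or ideas would be greatly appreciated.

Many thanks

Update

Based on Ruthger Righart's answer, I did the following: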

library(dplyr)

mergedYearonYearProdMeans = merged %>%
                                group_by(year,month) %>%
                                mutate(MonthlyAve = mean(prod))
ordered.months <- factor(mergedYearonYearProdMeans$month, as.character(mergedYearonYearProdMeans$month))

ggplot(data=mergedYearonYearProdMeans,aes(ordered.months,MonthlyAve,group=year,shape=year,color=year)) + #dataframe 
  geom_line()+ 
  scale_color_manual(values = c("red","blue","green"))

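My chart does not start from January. Also, for 2015 the prod line should only cover January, February and March; it should not show a flat green line for the other months, as shown below.

Data preparation is usually the most important part for this kind of plot. Looking at your data, I guess you need to compute the mean "prod" value as a function of year and month. This step can be done with the ddply function from the plyr package. A simple data example to see how it works: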

library(plyr)

dat<-data.frame(year=c("2012","2012","2012", "2012","2012","2012"), month=c("Jan", "Jan", "Jan", "Feb", "Feb", "Feb"), prod=as.numeric(c("2.00", "1.00", "3.00", "0.50", "1.50", "2.00")))

newdat<-ddply(dat, .(year, month), summarize, prod = mean(prod)) 
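
As a cross-check that does not depend on plyr, the same per-group means can be sketched with base R's aggregate(). This is only an illustration on the toy data above; the name newdat_base is made up here and is not part of the original answer.

```r
# Same toy data as above, written with numeric prod values directly
dat <- data.frame(
  year  = rep("2012", 6),
  month = rep(c("Jan", "Feb"), each = 3),
  prod  = c(2.00, 1.00, 3.00, 0.50, 1.50, 2.00)
)

# One row per year/month combination, holding the mean of prod
# (newdat_base is a hypothetical name used only for this sketch)
newdat_base <- aggregate(prod ~ year + month, data = dat, FUN = mean)
print(newdat_base)
```

Either way the result has one row per year and month, which is the shape the plotting step needs.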

After this step, newdat should hold the mean "prod" value for each year and month, in a format that ggplot can plot directly. I created a new simplified data example in the same format:

# year needs one entry per month (12 per year), and prod must be numeric for a continuous y-axis
df <- data.frame(year = rep(c("2012", "2013"), each = 12),
                 month = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"),
                 prod = c(0.33, 0.24, 0.36, 0.22, 0.31, 0.28, 0.39, 0.25, 0.23, 0.22, 0.46, 0.52, 0.61, 0.18, 0.59, 0.55, 0.13, 0.14, 0.56, 0.42, 0.41, 0.48, 0.59, 0.65))

A vector should be created to get the months in the correct order on the x-axis (otherwise ggplot orders the months alphabetically):

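# levels must be unique; current R versions raise an error on duplicated factor levels
ordmonth <- factor(df$month, levels = unique(as.character(df$month)))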
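
To see why the explicit levels matter, here is a small sketch comparing the default level order with a calendar order. The vector months_seen is made up for the illustration; month.abb is the constant of abbreviated English month names built into R.

```r
# Hypothetical month labels, in the order they might appear in the data
months_seen <- c("Mar", "Apr", "Jan", "Feb")

# Default: levels are the sorted unique values, i.e. alphabetical
alpha <- levels(factor(months_seen))

# Explicit levels: calendar order, starting with Jan
cal <- levels(factor(months_seen, levels = month.abb))

print(alpha)
print(cal)
```

ggplot lays out a discrete axis in level order, so the second form puts Jan first regardless of which month the data starts in.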

library(ggplot2)

p<-ggplot(data=df, aes(x=ordmonth, y=prod, group=year, shape=year, color=year))+geom_line()
p<-p+scale_color_manual(values = c("red", "blue"))
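
Applied to the dplyr attempt in the question's update, the same idea might look like the sketch below. It rests on several assumptions: the merged data frame is replaced by a made-up weekly stand-in with random prod values, month is taken to hold full English month names as in the head() output, and a dplyr version with the .groups argument (1.0 or later) is assumed. The helper plot_years is hypothetical. The two changes from the update are summarise() instead of mutate(), so each year and month contributes exactly one point, and factor levels fixed to month.name, so the axis starts at January instead of the first month seen in the data.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for `merged`: one row per week, March 2013 to March 2015
set.seed(1)
dates <- seq(as.Date("2013-03-29"), as.Date("2015-03-27"), by = "week")
merged <- data.frame(
  Date  = dates,
  prod  = runif(length(dates), 2.5, 3.5),               # random placeholder values
  month = month.name[as.integer(format(dates, "%m"))],   # "March", "April", ...
  year  = as.integer(format(dates, "%Y"))
)

# One row per year/month, months in calendar order, year as a discrete variable
monthly <- merged %>%
  group_by(year, month) %>%
  summarise(MonthlyAve = mean(prod), .groups = "drop") %>%
  mutate(month = factor(month, levels = month.name),
         year  = factor(year))

# Hypothetical helper: compare any subset of years on a shared January-December axis
plot_years <- function(years) {
  ggplot(filter(monthly, year %in% years),
         aes(x = month, y = MonthlyAve, group = year, colour = year)) +
    geom_line() +
    scale_colour_manual(values = c("2013" = "red", "2014" = "blue", "2015" = "green")) +
    ylab("Actual Prod")
}

p_13_14 <- plot_years(c("2013", "2014"))
p_14_15 <- plot_years(c("2014", "2015"))
p_13_15 <- plot_years(c("2013", "2015"))
```

Because the summarised data has rows for 2015 only in January, February and March, the 2015 line should stop after March rather than extend across the remaining months.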