R: count days of exceedance per year
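My goal is to count the days of exceedance per year for each column of a data frame. I want to do this both with one fixed threshold for the whole data frame and with a different threshold for each column. For one fixed threshold, I found a solution using count with aggregate and another using the plyr package with ddply and colwise. But I could not figure out how to use a different threshold for each column.
Approach for one fixed value: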
# create example data
date <- seq(as.Date("1961/1/1"), as.Date("1963/12/31"), "days") # create dates
date <- date[(format.Date(as.Date(date), "%m %d") !="02 29")] # delete leap days
TempX <- rep(airquality$Temp, length.out=length(date))
TempY <- rep(rev(airquality$Temp), length.out=length(date))
df <- data.frame(date, TempX, TempY)
# This approach works fine for one specific value using aggregate.
library(plyr)
dyear <- as.numeric(format(df$date, "%Y")) # year vector
fa80 <- function (fT) {cft <- count(fT>=80); return(cft[2,2])}; # function for counting days of exceedance
aggregate(df[,-1], list(year=dyear), fa80) # use aggregate to apply function to dataframe
# Another approach using ddply with colwise, which works fine for one specific value.
fd80 <- function (fT) {cft <- count(fT>=80); cft[2,2]}; # function to count days of exceedance
ddply(cbind(df[,-1], dyear), .(dyear), colwise(fd80)) # use ddply to apply function colwise to dataframe
To use a specific value for each column, I tried passing a second argument to the function, but this did not work.
# pass second argument to function
Oc <- c(80,85) # values
fo80 <- function (fT,fR) {cft <- count(fT>=fR); return(cft[2,2])}; # function for counting days of exceedance
aggregate(df[,-1], list(year=dyear), fo80, fR=Oc) # use aggregate to apply function to dataframe
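I tried using apply.yearly, but it didn't work with count. I want to avoid using a loop, because it is slow and I have many data frames with more than 100 columns and long time series to process.
In addition, the approach must also work for subsets of the data frame.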
# subset of dataframe
dfmay <- df[(format.Date(as.Date(df$date),"%m")=="05"),] # subset dataframe - only May
dyearmay <- as.numeric(format(dfmay$date, "%Y")) # year vector
aggregate(dfmay[,-1],list(year=dyearmay),fa80) # use aggregate to apply function to dataframe
I am out of ideas on how to solve this. Any help would be appreciated.
You could try something like this:
#set the target temperature for each column
targets<-c(80,80)
dyear <- as.numeric(format(df$date, "%Y"))
#for each row of the data, check if the temp is at or above the target limit of its column
#this will return a matrix of TRUE/FALSE with one column per temperature series
exceedance<-t(apply(df[,-1],1,function(x){x>=targets}))
#aggregate by year and sum
aggregate(exceedance,list(year=dyear),sum)
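The attempt in the question fails because aggregate hands the whole fR vector to every column, so fT >= fR recycles 80 and 85 alternately down the rows instead of applying one threshold per column. For data frames with 100+ columns, applying a function to every row can also get slow. A possible alternative, sketched below, compares each column against its own threshold with sweep() and sums the flags per year with rowsum(). The helper name count_exceedance and the thresholds c(80, 85) are only illustrative assumptions, and the sketch assumes the first column holds the dates and the data contain no NA values.

```r
# Example data as in the question (airquality ships with base R)
date <- seq(as.Date("1961/1/1"), as.Date("1963/12/31"), "days")
date <- date[format(date, "%m %d") != "02 29"]  # drop leap days
TempX <- rep(airquality$Temp, length.out = length(date))
TempY <- rep(rev(airquality$Temp), length.out = length(date))
df <- data.frame(date, TempX, TempY)

# Hypothetical helper (not from the original answer): counts, per year,
# the days on which each column is at or above its own threshold.
# Assumes column 1 is the date and all other columns are numeric.
count_exceedance <- function(data, targets) {
  values <- as.matrix(data[, -1, drop = FALSE])
  stopifnot(length(targets) == ncol(values))
  # sweep() compares column j against targets[j]; result is a logical matrix
  exceeded <- sweep(values, 2, targets, FUN = ">=")
  year <- as.numeric(format(data$date, "%Y"))
  # rowsum() needs numeric input, so turn TRUE/FALSE into 1/0 first
  counts <- rowsum(exceeded * 1L, group = year)
  data.frame(year = as.numeric(rownames(counts)), counts, row.names = NULL)
}

# Whole data frame, one threshold per column
count_exceedance(df, c(80, 85))

# The same call works on a subset, here May only
count_exceedance(df[format(df$date, "%m") == "05", ], c(80, 85))
```

Because the year vector is derived inside the helper from the rows it receives, a subset needs no separately maintained year vector.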