How can I create dates in Year/Semester format in R?
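I would like to aggregate zoo data in R by periods of two, four, or six months. As far as I can tell, there are only two built-in options for this kind of date handling:
a) as.yearmon
=> groups daily data by month
b) as.yearqtr
=> groups daily data into fixed 3-month periods (Jan-Mar, Apr-Jun, Jul-Sep, and Oct-Dec).
A minimal example: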
library(zoo)
# creating a vector of Dates
dt = as.Date(c("2001-01-01","2001-01-02","2001-04-01","2001-05-01","2001-07-01","2001-10-01"),
"%Y-%m-%d")
# the original dates
dt
[1] "2001-01-01" "2001-01-02" "2001-04-01" "2001-05-01" "2001-07-01" "2001-10-01"
# conversion to monthly data
as.yearmon(dt)
[1] "jan 2001" "jan 2001" "abr 2001" "mai 2001" "jul 2001" "out 2001"
# conversion to quarterly data
as.yearqtr(dt)
[1] "2001 Q1" "2001 Q1" "2001 Q2" "2001 Q2" "2001 Q3" "2001 Q4"
set.seed(0)
# irregular time series
daily_db = zoo(matrix(rnorm(3 * length(dt)),
nrow = length(dt),
ncol = 3),
order.by = dt)
daily_db
2001-01-01 1.2629543 -0.928567035 -1.1476570
2001-01-02 -0.3262334 -0.294720447 -0.2894616
2001-04-01 1.3297993 -0.005767173 -0.2992151
2001-05-01 1.2724293 2.404653389 -0.4115108
2001-07-01 0.4146414 0.763593461 0.2522234
2001-10-01 -1.5399500 -0.799009249 -0.8919211
# data aggregated by month
aggregate(daily_db,as.yearmon,sum)
V1 V2 V3
jan 2001 0.9367209 -1.223287482 -1.4371186
abr 2001 1.3297993 -0.005767173 -0.2992151
mai 2001 1.2724293 2.404653389 -0.4115108
jul 2001 0.4146414 0.763593461 0.2522234
out 2001 -1.5399500 -0.799009249 -0.8919211
# data aggregated by quarter
aggregate(daily_db,as.yearqtr,sum)
V1 V2 V3
2001 Q1 0.9367209 -1.2232875 -1.4371186
2001 Q2 2.6022286 2.3988862 -0.7107260
2001 Q3 0.4146414 0.7635935 0.2522234
2001 Q4 -1.5399500 -0.7990092 -0.8919211
I would like to define a function like this:
as.yearperiod = function(x, period = 6) {...} # convert dates to semesters
to be used like this:
# data aggregated by semester
aggregate(daily_db, as.yearperiod, period = 6, sum)
and I would expect a result like this:
V1 V2 V3
2001 S1 3.538950 1.175599 -2.147845
2001 S2 -1.125309 -0.035416 -0.639698
I suggest using the lubridate package to deal with custom date intervals. Your task can be easily accomplished with floor_date, as follows:
six_m_interval <- lubridate::floor_date( dt , "6 months" )
# [1] "2001-01-01" "2001-01-01" "2001-01-01" "2001-01-01" "2001-07-01" "2001-07-01"
aggregate( daily_db , six_m_interval , sum )
# V1 V2 V3
# 2001-01-01 3.538950 1.17559873 -2.1478445
# 2001-07-01 -1.125309 -0.03541579 -0.6396977
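To make the flooring step explicit, here is a minimal base-R sketch of the same idea for any period length that divides 12. The helper name floor_to_period is hypothetical (it is not part of lubridate or zoo); it only illustrates the arithmetic of mapping each date to the first day of its n-month period.

```r
# Hypothetical helper (not from lubridate or zoo): floor each date to the
# first day of the n-month period that contains it.
floor_to_period <- function(x, period = 6) {
  stopifnot(inherits(x, "Date"), 12 %% period == 0)
  lt <- as.POSIXlt(x)
  year <- lt$year + 1900L                            # POSIXlt years count from 1900
  first_month <- (lt$mon %/% period) * period + 1    # lt$mon is 0-based
  as.Date(sprintf("%04d-%02d-01", year, as.integer(first_month)))
}

dt <- as.Date(c("2001-01-01", "2001-01-02", "2001-04-01",
                "2001-05-01", "2001-07-01", "2001-10-01"))

floor_to_period(dt, period = 6)
# [1] "2001-01-01" "2001-01-01" "2001-01-01" "2001-01-01" "2001-07-01" "2001-07-01"

floor_to_period(dt, period = 4)
# [1] "2001-01-01" "2001-01-01" "2001-01-01" "2001-05-01" "2001-05-01" "2001-09-01"
```

The resulting vector can be used as the grouping index in aggregate in the same way as six_m_interval above.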
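Date2period
Date2period takes a "Date" object and returns a character string representing the period (semester, etc.), depending on the argument period, which should be a divisor of 12. Internally it converts to yearmon, then extracts the year and the cycle, i.e. the month number, and builds the required string from them.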
Date2period <- function(x, period = 6, sep = " S") {
ym <- as.yearmon(x)
paste(as.integer(ym), (cycle(ym) - 1) %/% period + 1, sep = sep)
}
Testing the above:
library(zoo)
# inputs
period <- 6
dt <- as.Date(c("2001-01-01","2001-04-01","2001-07-01","2001-10-01"))
Date2period(dt)
## [1] "2001 S1" "2001 S1" "2001 S2" "2001 S2"
aggregate(daily_db, Date2period, sum)
## V1 V2 V3
## 2001 S1 3.538950 1.17559873 -2.1478445
## 2001 S2 -1.125309 -0.03541579 -0.6396977
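Since the question also asks about two- and four-month periods, here is a small sketch of the same labelling idea for other period lengths. It is a base-R variant written only for illustration: the name year_period and the " B" / " T" separators are assumptions, not part of zoo.

```r
# Hypothetical base-R variant of the labelling idea: year plus the 1-based
# index of the n-month period within that year.
year_period <- function(x, period = 6, sep = " S") {
  stopifnot(inherits(x, "Date"), 12 %% period == 0)
  lt <- as.POSIXlt(x)
  # lt$mon is 0-based, so integer division gives a 0-based period index
  paste(lt$year + 1900L, lt$mon %/% period + 1, sep = sep)
}

dt <- as.Date(c("2001-01-01", "2001-01-02", "2001-04-01",
                "2001-05-01", "2001-07-01", "2001-10-01"))

year_period(dt, period = 2, sep = " B")   # two-month periods
# [1] "2001 B1" "2001 B1" "2001 B2" "2001 B3" "2001 B4" "2001 B5"

year_period(dt, period = 4, sep = " T")   # four-month periods
# [1] "2001 T1" "2001 T1" "2001 T1" "2001 T2" "2001 T2" "2001 T3"
```

Assuming zoo is loaded and daily_db is defined as in the question, such a helper could be passed to aggregate through an anonymous function, e.g. aggregate(daily_db, function(x) year_period(x, 4, " T"), sum). Note that the labels are character strings and therefore sort alphabetically; that is harmless for periods of two months or more (at most six per year), but would misorder "10" before "2" with period = 1.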
period2yearmon, period2Date
Here are additional conversion functions, going in the other direction:
period2yearmon <- function(x, period = 6) {
year <- as.numeric(sub("\\D.*", "", x)) # digits before the first non-digit
cyc <- as.numeric(sub(".*\\D", "", x)) # digits after the last non-digit
as.yearmon(year + period * (cyc - 1) / 12)
}
period2Date <- function(x, period = 6) as.Date(period2yearmon(x, period))
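Below are some tests of these functions. Since converting from Date to period and back to Date gives the date at the start of the period containing the input date, we show the effect in an aggregate at the end.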
# create a period string
d <- Date2period(dt)
d
## [1] "2001 S1" "2001 S1" "2001 S2" "2001 S2"
period2yearmon(d)
## [1] "Jan 2001" "Jan 2001" "Jul 2001" "Jul 2001"
period2Date(d)
## [1] "2001-01-01" "2001-01-01" "2001-07-01" "2001-07-01"
aggregate(daily_db, function(x) period2Date(Date2period(x)), sum)
## V1 V2 V3
## 2001-01-01 3.538950 1.17559873 -2.1478445
## 2001-07-01 -1.125309 -0.03541579 -0.6396977
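This could be made more sophisticated by creating an S3 class along the lines of yearmon, but that is not really needed for the purpose shown in the question.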
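For readers curious what such an S3 approach might look like, here is a very rough base-R sketch. The class name yearperiod, the constructor as_yearperiod, and the " P" label are all hypothetical; a usable class would need further methods (subsetting, comparison, conversion back to Date, and so on).

```r
# Hypothetical S3 class "yearperiod" (not part of zoo): stores year + fraction,
# similar in spirit to yearmon, with the period length kept as an attribute.
as_yearperiod <- function(x, period = 6) {
  stopifnot(inherits(x, "Date"), 12 %% period == 0)
  lt <- as.POSIXlt(x)
  idx <- lt$mon %/% period                 # 0-based period index within the year
  structure(lt$year + 1900 + idx * period / 12,
            period = period, class = "yearperiod")
}

# Turn the stored number back into a "<year> P<index>" label.
format.yearperiod <- function(x, ...) {
  period <- attr(x, "period")
  v <- as.numeric(unclass(x))
  year <- floor(v)
  idx <- round((v - year) * 12 / period) + 1
  paste0(year, " P", idx)
}

print.yearperiod <- function(x, ...) {
  print(format(x), ...)
  invisible(x)
}

dt <- as.Date(c("2001-01-01", "2001-04-01", "2001-07-01", "2001-10-01"))
as_yearperiod(dt, period = 6)
# [1] "2001 P1" "2001 P1" "2001 P2" "2001 P2"
```

Because the underlying values are numeric, they sort chronologically, which is one advantage over plain character labels.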