Filtering a time series by week number, conditional on a list of dates (R)

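I have a data frame containing a time series sampled every 30 minutes (for 2016). I need to create a subset containing every Wednesday at 10:30:00 (if that week contains no holiday falling on Sunday through Wednesday), and every Thursday at 11:00:00 (if that week does contain a holiday falling on Sunday through Wednesday). This reproduces the release schedule of the EIA weekly petroleum report. I do not want to use xts.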

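I know how to subset by day of the week and time of day. What I don't know is how to subset conditional on whether the week contains a date that appears in a list of dates. How can I do this?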

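The code below creates a subset by day of the week and time of day, without filtering by holidays. It also includes the list of holiday dates to be used as the filter.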

#Make time sequence every 30mins with Time & DayWk columns 
Calendar30mn <- as.data.frame(seq(as.POSIXlt("2016/1/1 00:00:00"), as.POSIXlt("2016/12/31 23:59:59"), by="30 mins"))
colnames(Calendar30mn) <- "DateTime"
Calendar30mn$Time <- strftime(Calendar30mn$DateTime, format="%H:%M:%S")
Calendar30mn$DayWk <- weekdays(Calendar30mn$DateTime)

#List of US Federal holidays falling on Sunday/Monday/Tuesday/Wednesday
FedHolidaysSuntoWed <- structure(c(16818, 16846, 16951, 16986, 17049, 17161, 17084), class = "Date")  

-----

#Subset for Wednesday 10:30:00
EIAOildates1 <- subset (Calendar30mn, Time == "10:30:00" & DayWk == "Wednesday")

#Subset for Thursday 11:00:00
EIAOildates2 <- subset (Calendar30mn, Time == "11:00:00" & DayWk == "Thursday")

#Bind subsets (not sorted yet)
EIAOildates <- rbind(EIAOildates1, EIAOildates2)

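The code above produces EIAOildates1, a subset containing every Wednesday at 10:30:00. I want that subset to include a Wednesday 10:30:00 only if no day of that week appears in FedHolidaysSuntoWed. EIAOildates2 should be the reverse: a Thursday 11:00:00 only if some day of that week does appear in the list.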
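For readers puzzled by the numbers in FedHolidaysSuntoWed: R stores Date values as days since 1970-01-01, so the vector can be decoded with a quick check. This is only a sketch; `holiday_check` is a name made up for illustration, and `%u` is used instead of `weekdays()` so the result does not depend on the locale.

```r
# The numbers are days since 1970-01-01, which is how R stores Date values.
FedHolidaysSuntoWed <- structure(c(16818, 16846, 16951, 16986, 17049, 17161, 17084),
                                 class = "Date")

# ISO date plus weekday number (%u: 1 = Monday ... 7 = Sunday).
# holiday_check is a hypothetical helper name, not part of the original code.
holiday_check <- data.frame(
  Date = format(FedHolidaysSuntoWed, "%Y-%m-%d"),
  Wday = as.integer(format(FedHolidaysSuntoWed, "%u"))
)
print(holiday_check)
```

Every entry should show a weekday number of 7, 1, 2 or 3 (Sunday through Wednesday) if the list is what its name says.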

Here is the answer:

library(lubridate)

#Make time sequence every 30mins with Time & DayWk & WkNumber columns 
Calendar30mn <- as.data.frame(seq(as.POSIXlt("2016/1/1 00:00:00"), as.POSIXlt("2016/12/31 23:59:59"), by="30 mins"))
colnames(Calendar30mn) <- "DateTime"
Calendar30mn$Time <- strftime(Calendar30mn$DateTime, format="%H:%M:%S")
Calendar30mn$DayWk <- weekdays(Calendar30mn$DateTime)
Calendar30mn$WkNumber <- week(Calendar30mn$DateTime)

#List of US Federal holidays falling on Sunday/Monday/Tuesday/Wednesday & Corresponding WkNumber
FedHolidaysSuntoWed <- structure(c(16818, 16846, 16951, 16986, 17049, 17161, 17084), class = "Date")  
FedHolidaysSuntoWedWkNumber <- week(FedHolidaysSuntoWed)

#Subset for Wednesday 10:30:00
EIAOildates1 <- subset (Calendar30mn, Time == "10:30:00" & DayWk == "Wednesday"
                        & !(Calendar30mn$WkNumber %in% FedHolidaysSuntoWedWkNumber))

#Subset for Thursday 11:00:00
EIAOildates2 <- subset (Calendar30mn, Time == "11:00:00" & DayWk == "Thursday"
                        & (Calendar30mn$WkNumber %in% FedHolidaysSuntoWedWkNumber))

#Bind and sort subsets  
EIAOildates <- rbind(EIAOildates1, EIAOildates2)
EIAOildates <- EIAOildates[order(EIAOildates$DateTime), ]
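One caveat worth knowing: lubridate's `week()` counts complete seven-day periods since January 1st, so its "weeks" start on whatever weekday January 1st falls on. In 2016 that is a Friday, which keeps a Sunday-to-Wednesday holiday and the following Wednesday and Thursday in the same week number, but that may not hold in other years. A possible variant, sketched below with base R only, keys on `%U` (week of the year with Sunday as the first day). The names `calendar`, `WkSun`, `holiday_weeks` and `releases` are made up for this sketch, and `tz = "UTC"` is an assumption used only to keep the example reproducible.

```r
# Sketch: Sunday-based week key from base R instead of lubridate::week().
# Assumption: tz = "UTC" is chosen only so the example is reproducible.
calendar <- data.frame(
  DateTime = seq(as.POSIXct("2016-01-01 00:00:00", tz = "UTC"),
                 as.POSIXct("2016-12-31 23:30:00", tz = "UTC"),
                 by = "30 mins")
)
calendar$Time  <- format(calendar$DateTime, "%H:%M:%S")
calendar$Wday  <- as.POSIXlt(calendar$DateTime)$wday  # 0 = Sunday ... 6 = Saturday
calendar$WkSun <- format(calendar$DateTime, "%U")     # week of year, weeks start on Sunday

# Same holiday list as in the question, mapped to the same week key.
holidays <- structure(c(16818, 16846, 16951, 16986, 17049, 17161, 17084),
                      class = "Date")
holiday_weeks <- format(holidays, "%U")

is_holiday_week <- calendar$WkSun %in% holiday_weeks

# Wednesday 10:30 in normal weeks, Thursday 11:00 in holiday weeks.
wed <- calendar[calendar$Time == "10:30:00" & calendar$Wday == 3 & !is_holiday_week, ]
thu <- calendar[calendar$Time == "11:00:00" & calendar$Wday == 4 &  is_holiday_week, ]

# Bind and sort chronologically.
releases <- rbind(wed, thu)
releases <- releases[order(releases$DateTime), ]
```

Note that `%U` restarts every year, so for data spanning several years the key would need to be combined with the year (for example `format(x, "%Y-%U")`).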

Here is a sample of the EIAOildates output:

                 DateTime     Time     DayWk WkNumber
262   2016-01-06 10:30:00 10:30:00 Wednesday        1
598   2016-01-13 10:30:00 10:30:00 Wednesday        2
983   2016-01-21 11:00:00 11:00:00  Thursday        3
1270  2016-01-27 10:30:00 10:30:00 Wednesday        4
1606  2016-02-03 10:30:00 10:30:00 Wednesday        5
1942  2016-02-10 10:30:00 10:30:00 Wednesday        6
2327  2016-02-18 11:00:00 11:00:00  Thursday        7
16726 2016-12-14 10:30:00 10:30:00 Wednesday       50
17062 2016-12-21 10:30:00 10:30:00 Wednesday       51
17447 2016-12-29 11:00:00 11:00:00  Thursday       52
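The sample above can be cross-checked without relying on week numbers at all. The sketch below takes the dates from the sample rows, finds the Sunday that starts each week, and tests whether any holiday falls between that Sunday and the Wednesday. `sample_out`, `week_start` and `has_holiday` are names invented for this check.

```r
# Dates copied from the sample output rows.
sample_out <- as.Date(c("2016-01-06", "2016-01-13", "2016-01-21", "2016-01-27",
                        "2016-02-03", "2016-02-10", "2016-02-18", "2016-12-14",
                        "2016-12-21", "2016-12-29"))
holidays <- structure(c(16818, 16846, 16951, 16986, 17049, 17161, 17084),
                      class = "Date")

wday <- as.POSIXlt(sample_out)$wday   # 3 = Wednesday, 4 = Thursday
week_start <- sample_out - wday       # the Sunday that starts each week

# TRUE when a holiday falls on Sunday..Wednesday of that week.
has_holiday <- vapply(seq_along(sample_out), function(i) {
  any(holidays >= week_start[i] & holidays <= week_start[i] + 3)
}, logical(1))

print(data.frame(Date = sample_out, Wday = wday, HolidaySunToWed = has_holiday))
```

If the filter worked as intended, every Thursday row (Wday 4) has a holiday in its Sunday-to-Wednesday window and every Wednesday row (Wday 3) does not.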