Subsetting only the workdays from a time-series

Question

I want to subset only the workdays of the time series taylor from the forecast package.

help(taylor)

Half-hourly electricity demand in England and Wales from Monday 5 June 2000 to 
Sunday 27 August 2000. Discussed in Taylor (2003), and kindly provided by James
W Taylor. Units: Megawatts

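But the times of the time series are not dates. They are numbers starting at 1 that count weeks since the start of the series (the frequency is 336 = 7*24*2 half-hours per week):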

time(head(taylor))  
Time Series:
Start = c(1, 1) 
End = c(1, 6) 
Frequency = 336 
[1] 1.000000 1.002976 1.005952 1.008929 1.011905 1.014881
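As a quick sanity check on those values: with frequency 336, consecutive observations are 1/336 apart, so the first few time values can be reproduced with plain arithmetic. This is only a small sketch in base R; the names freq and idx are made up for illustration.

```r
# 336 half-hour observations make up one unit of the time index (one week)
freq <- 7 * 24 * 2

# Time values of the first six observations, starting at 1
idx <- 1 + (0:5) / freq
print(round(idx, 6))
```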

How can I convert these to dates, extract only the workday samples, and create a new time series with a frequency of 5*24*2 (instead of the original 7*24*2)?

Answer 1

We can create a half-hourly sequence of date-times whose start and end dates are those given in the description of taylor.

data("taylor", package="forecast")

dates <- seq(as.POSIXct("2000-06-05"), as.POSIXct("2000-08-28"), "30 min")
dates <- dates[-length(dates)]  # exclude "2000-08-28 00:00:00"

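Now, using substr(), we can exclude the weekdays() that start with "S" (this relies on English day names and may not work in other locales) and create a new "ts" object with start, end, and frequency values. Note that end must be c(12, 240), the last half-hour of week 12; with end=12 the series would be cut off after the first observation of week 12 (2641 values instead of 2880).

taylor2 <- ts(taylor[substr(weekdays(dates), 1, 1) != "S"], start=c(1, 1), end=c(12, 240), frequency=240)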
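Because the substr(weekdays(...)) test depends on English day names, a locale-independent variant might look like the sketch below. It uses as.POSIXlt()$wday (0 = Sunday, 6 = Saturday) and fixes the time zone to UTC, which is an assumption not made in the answer above. A synthetic vector x stands in for taylor so the snippet runs without the forecast package; x, is_workday and x_wk are illustrative names.

```r
# Half-hourly timestamps covering the same 12 weeks as taylor (UTC assumed)
dates <- seq(as.POSIXct("2000-06-05", tz = "UTC"), by = "30 min",
             length.out = 12 * 336)
x <- seq_along(dates)                      # synthetic stand-in for taylor

# wday is numeric (0 = Sunday ... 6 = Saturday), so no day names are involved
is_workday <- as.POSIXlt(dates)$wday %in% 1:5

# end = c(12, 240) marks the last half-hour of week 12
x_wk <- ts(x[is_workday], start = c(1, 1), end = c(12, 240), frequency = 240)
print(length(x_wk))
```

With the real data, x would simply be replaced by taylor.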

However, it is better to create an "msts" object with the forecast package, so that both the daily and the weekly seasonality are kept.

library(forecast)
taylor3 <- msts(taylor[!substr(weekdays(dates), 1, 1) == "S"], seasonal.periods=c(24*2, 24*2*5))

Check

op <- par(mfrow=c(3, 1))
plot(taylor)
plot(taylor2)
plot(taylor3)
par(op)

Answer 2

You might consider converting the time series to an xts object for easier data manipulation. For example, we can use .indexwday:

Extracting weekdays from an xts object

library(xts)

## load data
data(taylor, package = "forecast")

## convert to xts
taylor_xts <- xts(
    x = taylor,
    order.by = seq(from = as.POSIXct("2000-06-05"), length = length(taylor), by = "30 min")
)

## extract weekdays
taylor_wk <- taylor_xts[.indexwday(taylor_xts) %in% 1:5]

head(taylor_wk); tail(taylor_wk)
#>                      [,1]
#> 2000-06-05 00:00:00 22262
#> 2000-06-05 00:30:00 21756
#> 2000-06-05 01:00:00 22247
#> 2000-06-05 01:30:00 22759
#> 2000-06-05 02:00:00 22549
#> 2000-06-05 02:30:00 22313
#>                      [,1]
#> 2000-08-25 21:00:00 33064
#> 2000-08-25 21:30:00 31953
#> 2000-08-25 22:00:00 30548
#> 2000-08-25 22:30:00 29236
#> 2000-08-25 23:00:00 27623
#> 2000-08-25 23:30:00 26063

Or, if we only want to extract data from office hours (workdays from 9 a.m. to 6 p.m.):

## extract office hours
taylor_offh <- taylor_xts[.indexwday(taylor_xts) %in% 1:5 & .indexhour(taylor_xts) >= 9 & .indexhour(taylor_xts) < 18]

head(taylor_offh); tail(taylor_offh)
#>                      [,1]
#> 2000-06-05 09:00:00 36834
#> 2000-06-05 09:30:00 37296
#> 2000-06-05 10:00:00 37338
#> 2000-06-05 10:30:00 37608
#> 2000-06-05 11:00:00 37692
#> 2000-06-05 11:30:00 37944
#>                      [,1]
#> 2000-08-25 15:00:00 35067
#> 2000-08-25 15:30:00 34928
#> 2000-08-25 16:00:00 34738
#> 2000-08-25 16:30:00 35004
#> 2000-08-25 17:00:00 34748
#> 2000-08-25 17:30:00 34090

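Note: plotting the subsetted time series with plot.xts shows date-times on the x-axis and therefore includes gaps for the weekends (the series is no longer regularly spaced). To plot the data as one contiguous series, use plot.default (or plot.ts after converting back to a ts object).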

## plot time-series along time
plot(taylor_wk)

## plot time-series along index 
plot.default(taylor_wk, type = "l")    ## equivalently `plot(coredata(taylor_wk), type = "l")`
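To go back to a regular ts with a weekly period of 5*24*2, the values can be pulled out with coredata() and re-wrapped. The sketch below uses a synthetic xts series (x_xts, a made-up name) in place of taylor_xts and assumes UTC timestamps, so it runs without the forecast package.

```r
library(xts)

# Synthetic half-hourly series over 12 weeks, starting Monday 2000-06-05 (UTC assumed)
idx <- seq(as.POSIXct("2000-06-05", tz = "UTC"), by = "30 min",
           length.out = 12 * 336)
x_xts <- xts(seq_along(idx), order.by = idx)

# Keep Monday to Friday, then drop the date index and rebuild a regular ts
x_wk <- x_xts[.indexwday(x_xts) %in% 1:5]
x_ts <- ts(as.numeric(coredata(x_wk)), start = c(1, 1), frequency = 5 * 24 * 2)

print(frequency(x_ts))
print(length(x_ts))
```

Calling plot(x_ts) on the result then draws the workdays back to back, without the weekend gaps.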
