Extending time series

I start with streamflow data at 15-minute intervals, which looks like this:

structure(list(t = structure(c(1136062800, 1136063700, 1136064600, 
1136065500, 1136066400, 1136067300, 1136068200, 1136069100, 1136070000, 
1136070900, 1136071800, 1136072700, 1136073600, 1136074500, 1136075400, 
1136076300, 1136077200, 1136078100, 1136079000, 1136079900, 1136080800, 
1136081700, 1136082600, 1136083500, 1136084400, 1136085300, 1136086200, 
1136087100, 1136088000, 1136088900, 1136089800, 1136090700), class = c("POSIXct", 
"POSIXt"), tzone = "EST"), flow = c(23, 31, 42, 59, 59, 59, 50, 
48, 37, 33, 31, 31, 30, 30, 27, 27, 30, 31, 33, 37, 38, 42, 42, 
48, 48, 46, 42, 38, 37, 35, 33, 35)), .Names = c("t", "flow"), row.names = 35003:35034, class = "data.frame")

I used this code to cut the data into a time series of 2-hour averages:

data <- data.frame(t=streamflowDateTime,flow=streamflow)
data2hr <- data
data2hr$time <- cut(data2hr$t,breaks="2 hours")
smoothedData <- aggregate(flow~time,data2hr,mean)

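Now I want to extend the 'smoothedData' time series into an hourly time series. The new series should keep the smoothedData values at the 2-hour marks, and between those existing values I want hourly averages computed from the original data. Help!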

You can use the function approxfun to get this result.

First, let's convert the time column in smoothedData back to POSIXct (cut() returns a factor):

smoothedData$time <- as.POSIXct(smoothedData$time)
#                  time   flow
# 1 2005-12-31 16:00:00 46.375
# 2 2005-12-31 18:00:00 30.750
# 3 2005-12-31 20:00:00 37.625
# 4 2005-12-31 22:00:00 39.250
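
One detail worth checking at this step: the labels produced by cut() are plain text with no time zone attached, so as.POSIXct() without a tz argument reads them in the session's time zone. A minimal sketch of a more explicit conversion follows. The labels are copied from the output above, and tz = "EST" is an assumption taken from the tzone attribute of the question's data; the variable names are only for illustration.

```r
# Bin labels as produced by cut(); values copied from the output above.
labels <- factor(c("2005-12-31 16:00:00", "2005-12-31 18:00:00",
                   "2005-12-31 20:00:00", "2005-12-31 22:00:00"))

# Go through character explicitly and name the time zone.
# tz = "EST" is an assumption based on the question's data.
times <- as.POSIXct(as.character(labels),
                    format = "%Y-%m-%d %H:%M:%S", tz = "EST")

# Spacing between consecutive bin starts, in seconds.
step_secs <- diff(as.numeric(times))
```

If the session already runs in the same time zone as the data, the shorter call in the answer gives the same instants.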

Then, create a time series at hourly intervals:

data1hr <- data.frame(time=seq(min(smoothedData$time),max(smoothedData$time),by='1 hour'))

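...build a linear interpolation function over the smoothedData points (halfway between two 2-hour points it returns the average of the two neighbouring values):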

interp.time <- approxfun(x=smoothedData$time,y=smoothedData$flow)

Finally, apply this function to the hourly time series:

data1hr$flow <- interp.time(data1hr$time)
#                  time    flow
# 1 2005-12-31 16:00:00 46.3750
# 2 2005-12-31 17:00:00 38.5625
# 3 2005-12-31 18:00:00 30.7500
# 4 2005-12-31 19:00:00 34.1875
# 5 2005-12-31 20:00:00 37.6250
# 6 2005-12-31 21:00:00 38.4375
# 7 2005-12-31 22:00:00 39.2500
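
The steps above can be condensed into a self-contained sketch that runs without the original data. It uses approx() in place of approxfun() (same linear interpolation, evaluated directly), rebuilds the four 2-hour means from the values printed above, and converts times to numeric explicitly. The names two_hr and hourly are only for illustration, and tz = "EST" is an assumption carried over from the question's data.

```r
# 2-hour means copied from the smoothedData output above.
two_hr <- data.frame(
  time = as.POSIXct(1136062800 + 7200 * (0:3),
                    origin = "1970-01-01", tz = "EST"),
  flow = c(46.375, 30.750, 37.625, 39.250)
)

# Hourly grid spanning the same range.
hourly <- data.frame(
  time = seq(min(two_hr$time), max(two_hr$time), by = "1 hour")
)

# Linear interpolation on the numeric time axis.
hourly$flow <- approx(x = as.numeric(two_hr$time),
                      y = two_hr$flow,
                      xout = as.numeric(hourly$time))$y

# At odd hours the result is the midpoint of the two neighbouring 2-hour means.
midpoints <- (head(two_hr$flow, -1) + tail(two_hr$flow, -1)) / 2
```

Note that the odd-hour values here are derived only from the 2-hour means; they are not averages of the raw 15-minute readings within that hour.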

EDIT

Following the OP's comment,

I feel like i explained my issue in a way more complicated than it has to be. Basically, at the even number hour timestamps, i.e. 00:00:00, 02:00:00 I want 2 hourly averages, but at odd number hour timestamps 01:00:00, 03:00:00 I want hourly averages

I'd like to propose another way to solve the problem:

data <- structure(list(t = structure(c(1136062800, 1136063700, 1136064600,
1136065500, 1136066400, 1136067300, 1136068200, 1136069100, 1136070000,
1136070900, 1136071800, 1136072700, 1136073600, 1136074500, 1136075400,
1136076300, 1136077200, 1136078100, 1136079000, 1136079900, 1136080800,
1136081700, 1136082600, 1136083500, 1136084400, 1136085300, 1136086200,
1136087100, 1136088000, 1136088900, 1136089800, 1136090700), class = c("POSIXct",
"POSIXt"), tzone = "EST"), flow = c(23, 31, 42, 59, 59, 59, 50,
48, 37, 33, 31, 31, 30, 30, 27, 27, 30, 31, 33, 37, 38, 42, 42,
48, 48, 46, 42, 38, 37, 35, 33, 35)), .Names = c("t", "flow"), row.names = 35003:35034, class = "data.frame")

data2hr <- data
data2hr$time <- cut(data2hr$t,breaks="2 hours")
smoothedData2hr <- aggregate(flow~time,data2hr,mean)
#                 time   flow
# 1 2005-12-31 16:00:00 46.375
# 2 2005-12-31 18:00:00 30.750
# 3 2005-12-31 20:00:00 37.625
# 4 2005-12-31 22:00:00 39.250

data1hr <- data
data1hr$time <- cut(data1hr$t,breaks="1 hour")
smoothedData1hr <- aggregate(flow~time,data1hr,mean)
#                  time  flow
# 1 2005-12-31 16:00:00 38.75
# 2 2005-12-31 17:00:00 54.00
# 3 2005-12-31 18:00:00 33.00
# 4 2005-12-31 19:00:00 28.50
# 5 2005-12-31 20:00:00 32.75
# 6 2005-12-31 21:00:00 42.50
# 7 2005-12-31 22:00:00 43.50
# 8 2005-12-31 23:00:00 35.00

result <- smoothedData1hr
result$flow[match(smoothedData2hr$time,result$time)] <- smoothedData2hr$flow
#                  time   flow
# 1 2005-12-31 16:00:00 46.375
# 2 2005-12-31 17:00:00 54.000
# 3 2005-12-31 18:00:00 30.750
# 4 2005-12-31 19:00:00 28.500
# 5 2005-12-31 20:00:00 37.625
# 6 2005-12-31 21:00:00 42.500
# 7 2005-12-31 22:00:00 39.250
# 8 2005-12-31 23:00:00 35.000
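
As a possible variant of the same idea, the two aggregates can be combined with merge() instead of match(), which keeps both averages side by side so it is easy to see which value ended up at each hour. This is only a sketch: mean_by is a hypothetical helper introduced here, the timestamps are rebuilt from the epoch seconds in the question, and tz = "EST" is taken from the question's data.

```r
# Raw 15-minute data from the question, rebuilt from epoch seconds.
raw <- data.frame(
  t = as.POSIXct(1136062800 + 900 * (0:31),
                 origin = "1970-01-01", tz = "EST"),
  flow = c(23, 31, 42, 59, 59, 59, 50, 48, 37, 33, 31, 31, 30, 30, 27, 27,
           30, 31, 33, 37, 38, 42, 42, 48, 48, 46, 42, 38, 37, 35, 33, 35)
)

# Hypothetical helper: mean flow per time bin of the given width.
mean_by <- function(df, width) {
  df$bin <- cut(df$t, breaks = width)
  out <- aggregate(flow ~ bin, data = df, FUN = mean)
  out$bin <- as.character(out$bin)  # plain text keys for merging
  out
}

h1 <- mean_by(raw, "1 hour")
h2 <- mean_by(raw, "2 hours")

# Left join: every hourly bin, plus the 2-hour mean where one starts there.
both <- merge(h1, h2, by = "bin", all.x = TRUE,
              suffixes = c(".1hr", ".2hr"))

# Use the 2-hour mean where available, otherwise the hourly mean.
both$flow <- ifelse(is.na(both$flow.2hr), both$flow.1hr, both$flow.2hr)
```

The flow column of both should match the result printed above, while flow.1hr and flow.2hr remain available for checking.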