Finding anchors and slope for segmented time series

I have the following time series:

Lines <- "Hour,PF
0,14/01/2015 00:00,0.305
1,14/01/2015 01:00,0.306
2,14/01/2015 02:00,0.307
3,14/01/2015 03:00,0.3081
4,14/01/2015 04:00,0.3091
5,14/01/2015 05:00,0.3101
6,14/01/2015 06:00,0.3111
7,14/01/2015 07:00,0.3122
8,14/01/2015 08:00,0.455
9,14/01/2015 09:00,0.7103
10,14/01/2015 10:00,0.9656
11,14/01/2015 11:00,1
12,14/01/2015 12:00,0.9738
13,14/01/2015 13:00,0.9476
14,14/01/2015 14:00,0.9213
15,14/01/2015 15:00,0.8951
16,14/01/2015 16:00,0.8689
17,14/01/2015 17:00,0.8427
18,14/01/2015 18:00,0.6956
19,14/01/2015 19:00,0.6006
20,14/01/2015 20:00,0.5056
21,14/01/2015 21:00,0.4106
22,14/01/2015 22:00,0.3157
23,14/01/2015 23:00,0.3157"

library(zoo)
library(strucchange)

z <- read.zoo(text = Lines, tz = "", format = "%d/%m/%Y %H:%M", sep = ",")

bp <- breakpoints(z ~ 1, h = 2)

plot(z)
abline(v = time(z)[bp$breakpoints])
fit <- zoo(fitted(bp), time(z))
lines(fit, col = "blue", lty = 2, lwd = 2)
levs <- fit[bp$breakpoints + 0:1]
a <- diff(levs) / diff(as.numeric(time(levs)) / 3600)
DF <- fortify.zoo(a)

I get the following DF:

> DF
                Index             a
1 2015-01-14 10:00:00  2.061000e-01
2 2015-01-14 17:00:00 -9.516197e-17
3 2015-01-14 21:00:00 -1.448854e-01

I tried changing the formula in breakpoints to get a linear model with slope and intercept:

bp <- breakpoints(z ~ Lines$PF, h = 2)

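That did not work. As the final result I would like the start and end of each segment, the slope (a, which I already have) and the intercept, plus the left point (anchor) and the right point of each segment. Something like the following (example only, the numbers are not real):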

> DF
    Start Segment       End Segment             Slope          Intercept   Anchor Beginning Anchor End
1 2015-01-14 10:00:00  2015-01-14 08:00:00     2.061000e-01    8.123            0.50        0.30
2 2015-01-14 08:00:00  2015-01-14 17:00:00    -9.516197e-17    9.456            0.70        0.40
3 2015-01-14 17:00:00  2015-01-14 23:00:00    -1.448854e-01    2.9009           0.60        0.90

Well, you already have the breakpoints, e.g.

(breaks <- data.frame(
  start = index(z[c(1, bp$breakpoints+1)]),
  end = c(index(z[bp$breakpoints]), index(z[length(z)]))
))
#                 start                 end
# 1 2015-01-14 00:00:00 2015-01-14 07:00:00
# 2 2015-01-14 08:00:00 2015-01-14 09:00:00
# 3 2015-01-14 10:00:00 2015-01-14 17:00:00
# 4 2015-01-14 18:00:00 2015-01-14 20:00:00
# 5 2015-01-14 21:00:00 2015-01-14 23:00:00
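# Fit a separate linear model to each segment.
# index(z) is POSIXct, so each slope is in PF units per second.
fits <- lapply(seq_len(nrow(breaks)), function(x) {
  idx <- index(z) >= breaks[x, 1] & index(z) <= breaks[x, 2]
  lm(z[idx] ~ index(z[idx]))
})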
sapply(fits, coefficients)
#                        [,1]          [,2]          [,3]          [,4]          [,5]
# (Intercept)   -4.048094e+02 -1.007876e+05  8.358223e+03  3.750603e+04  1.873346e+04
# index(z[idx])  2.850529e-07  7.091667e-05 -5.880291e-06 -2.638889e-05 -1.318056e-05

The last step is to combine all the data you need into the format you want.
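A minimal sketch of that last step, in base R so it runs on its own. It rests on a few assumptions: the PF values are copied from the question, the segment boundaries are copied from the `breaks` table above rather than recomputed with `breakpoints()`, hours since midnight stand in for the POSIXct index, and the "anchors" are taken to be the fitted values at the first and last point of each segment. The helper `segment_row` and the column names are hypothetical.

```r
# PF values from the question; hour 0 is 2015-01-14 00:00.
hour <- 0:23
pf <- c(0.305, 0.306, 0.307, 0.3081, 0.3091, 0.3101, 0.3111, 0.3122,
        0.455, 0.7103, 0.9656, 1, 0.9738, 0.9476, 0.9213, 0.8951,
        0.8689, 0.8427, 0.6956, 0.6006, 0.5056, 0.4106, 0.3157, 0.3157)

# Last observation of each segment, read off the `breaks` table
# (assumed to match bp$breakpoints).
last_obs <- c(8, 10, 18, 21)
start <- c(1, last_obs + 1)        # first observation of each segment
end   <- c(last_obs, length(pf))   # last observation of each segment

# Hypothetical helper: fit one segment, return a one-row data frame.
segment_row <- function(s, e) {
  seg <- data.frame(hour = hour[s:e], pf = pf[s:e])
  fit <- lm(pf ~ hour, data = seg)
  fitted_vals <- unname(fitted(fit))
  data.frame(
    start_hour   = hour[s],
    end_hour     = hour[e],
    slope        = unname(coef(fit)["hour"]),         # PF units per hour
    intercept    = unname(coef(fit)["(Intercept)"]),  # fitted PF at hour 0
    anchor_begin = fitted_vals[1],                    # fitted value at segment start
    anchor_end   = fitted_vals[length(fitted_vals)]   # fitted value at segment end
  )
}

DF <- do.call(rbind, Map(segment_row, start, end))
DF
```

Because the regressor here is hours rather than seconds since the Unix epoch, the slopes are the per-second slopes above multiplied by 3600, and the intercepts are the fitted PF at midnight instead of the very large values that come from regressing on the raw POSIXct index. If observed values are wanted as anchors instead of fitted ones, `pf[s]` and `pf[e]` could be used in place of the fitted values.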