R: How to format data before applying cross-correlation
I want to cross-correlate two time series datasets.
The datasets look like this:
> head(link420)
time diff420
1: 2018-01-01 08:00:00 18.50
2: 2018-01-01 08:05:00 0.00
3: 2018-01-01 08:10:00 -4.25
4: 2018-01-01 08:15:00 4.25
5: 2018-01-01 08:20:00 -8.50
6: 2018-01-01 08:25:00 47.00
> head(link423)
time diff423
1: 2018-01-01 08:00:00 0.000000
2: 2018-01-01 08:05:00 1.700000
3: 2018-01-01 08:10:00 -22.818182
4: 2018-01-01 08:15:00 23.272727
5: 2018-01-01 08:20:00 4.160839
6: 2018-01-01 08:25:00 -9.337607
The structure of the two datasets is:
> str(link420)
Classes ‘data.table’ and 'data.frame': 31 obs. of 2 variables:
$ time : POSIXct, format: "2018-01-01 08:00:00" "2018-01-01 08:05:00" "2018-01-01 08:10:00" "2018-01-01 08:15:00" ...
$ diff420: num 18.5 0 -4.25 4.25 -8.5 47 -20 4 -5 -27 ...
- attr(*, ".internal.selfref")=<externalptr>
> str(link423)
Classes ‘data.table’ and 'data.frame': 31 obs. of 2 variables:
$ time : POSIXct, format: "2018-01-01 08:00:00" "2018-01-01 08:05:00" "2018-01-01 08:10:00" "2018-01-01 08:15:00" ...
$ diff423: num 0 1.7 -22.82 23.27 4.16 ...
- attr(*, ".internal.selfref")=<externalptr>
How should I change the format of these data?
When I try
ccf(link420, link423)
it returns an error:
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
So I tried
link420<-xts(x = link420, order.by = link420$time)
link420<-link420[,c(2)]
link423<-xts(x = link423, order.by = link423$time)
link423<-link423[,c(2)]
but it still gives an error:
ccf(link420, link423)
Error in ccf(link420, link423) : univariate time series only
I want to find out over which time period (at 5-minute intervals) these two datasets show correlation. Can I get some help?
You might find it easier to use tsibble objects, like this:
library(tidyverse)
library(lubridate)
library(tsibble)
library(feasts)
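# Simulated stand-ins for the two series: 100 observations at 5-minute steps
link420 <- tibble(
  time = seq(as.POSIXct("2018-01-01 08:00:00"), length.out = 100, by = "5 min"),
  diff420 = rnorm(100)
) %>%
  as_tsibble(index = time)
link423 <- tibble(
  time = seq(as.POSIXct("2018-01-01 08:00:00"), length.out = 100, by = "5 min"),
  diff423 = rnorm(100)
) %>%
  as_tsibble(index = time)
# Join on the shared timestamps, then compute the cross-correlation at each lag
inner_join(link420, link423, by = "time") %>%
  CCF(diff420, diff423)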
#> # A tsibble: 33 x 2 [5m]
#> lag ccf
#> <lag> <dbl>
#> 1 -80m 0.0648
#> 2 -75m -0.0651
#> 3 -70m -0.0316
#> 4 -65m 0.0679
#> 5 -60m 0.0635
#> 6 -55m -0.158
#> 7 -50m 0.0444
#> 8 -45m 0.0497
#> 9 -40m 0.0267
#> 10 -35m -0.0503
#> # … with 23 more rows
Created on 2020-11-02 by the reprex package (v0.3.0)
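For comparison, here is a minimal base R sketch of the same idea. Both errors in the question come from passing whole tables (or an xts object built from the whole table, timestamps included) to ccf(), which expects two univariate numeric series. The sketch assumes the two tables share their timestamps and rebuilds only the six rows shown by head() in the question as plain data frames, so the numbers it prints are illustrative, not a result for the full 31 observations. The lag.max = 3 setting is likewise an arbitrary choice for such a short series.

```r
# Hypothetical stand-ins for the question's tables, built from the
# six rows printed by head(); the real tables have 31 rows each.
time <- seq(as.POSIXct("2018-01-01 08:00:00", tz = "UTC"),
            by = "5 min", length.out = 6)
link420 <- data.frame(time = time,
                      diff420 = c(18.5, 0, -4.25, 4.25, -8.5, 47))
link423 <- data.frame(time = time,
                      diff423 = c(0, 1.7, -22.818182, 23.272727,
                                  4.160839, -9.337607))

# Align the two series on their shared timestamps
# (merge() sorts by the key column by default).
both <- merge(link420, link423, by = "time")

# Pass the numeric columns only, not the tables themselves.
res <- ccf(both$diff420, both$diff423, lag.max = 3, plot = FALSE)

# One row per lag; each lag unit is one 5-minute step.
data.frame(lag_minutes = as.vector(res$lag) * 5,
           ccf = as.vector(res$acf))
```

The value at lag k estimates the correlation between diff420 at time t + k and diff423 at time t, so the lags with the largest absolute values show how many 5-minute steps apart the two series line up best. If you prefer the tsibble route with your real data, as_tsibble(link420, index = time) should convert the existing table directly, though I have not run that against your data.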