R: How to format data before applying cross-correlation

I want to compute the cross-correlation between two time series datasets. The datasets look like this:

> head(link420)
                  time diff420
1: 2018-01-01 08:00:00   18.50
2: 2018-01-01 08:05:00    0.00
3: 2018-01-01 08:10:00   -4.25
4: 2018-01-01 08:15:00    4.25
5: 2018-01-01 08:20:00   -8.50
6: 2018-01-01 08:25:00   47.00


> head(link423)
                  time    diff423
1: 2018-01-01 08:00:00   0.000000
2: 2018-01-01 08:05:00   1.700000
3: 2018-01-01 08:10:00 -22.818182
4: 2018-01-01 08:15:00  23.272727
5: 2018-01-01 08:20:00   4.160839
6: 2018-01-01 08:25:00  -9.337607

The structure of the two datasets is:

> str(link420)
Classes ‘data.table’ and 'data.frame':  31 obs. of  2 variables:
 $ time   : POSIXct, format: "2018-01-01 08:00:00" "2018-01-01 08:05:00" "2018-01-01 08:10:00" "2018-01-01 08:15:00" ...
 $ diff420: num  18.5 0 -4.25 4.25 -8.5 47 -20 4 -5 -27 ...
 - attr(*, ".internal.selfref")=<externalptr> 


> str(link423)
Classes ‘data.table’ and 'data.frame':  31 obs. of  2 variables:
 $ time   : POSIXct, format: "2018-01-01 08:00:00" "2018-01-01 08:05:00" "2018-01-01 08:10:00" "2018-01-01 08:15:00" ...
 $ diff423: num  0 1.7 -22.82 23.27 4.16 ...
 - attr(*, ".internal.selfref")=<externalptr> 

How should I change the format of these data?

When I try
ccf(link420,link433)

it returns an error:

Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent

So I tried converting both to xts:

link420<-xts(x = link420, order.by = link420$time)
link420<-link420[,c(2)]

link423<-xts(x = link423, order.by = link423$time)
link423<-link423[,c(2)]


but it still gives an error:

ccf(link420,link433)
Error in ccf(link420, link433) : univariate time series only

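I want to find out at which time offsets (in 5-minute steps) the two datasets are correlated. Can I get some help?

You may find it easier to use tsibble objects, like this: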

library(tidyverse)
library(lubridate)
library(tsibble)
library(feasts)

link420 <- tibble(
    time = seq(as.POSIXct("2018-01-01 08:00:00"), length=100, by="5 min"),
    diff420 = rnorm(100)
  ) %>%
  as_tsibble(index=time)
link423 <- tibble(
    time = seq(as.POSIXct("2018-01-01 08:00:00"),length=100, by="5 min"),
    diff423 = rnorm(100)
  ) %>%
  as_tsibble(index=time)

inner_join(link420, link423, by = "time") %>%
  CCF(diff420, diff423)
#> # A tsibble: 33 x 2 [5m]
#>      lag     ccf
#>    <lag>   <dbl>
#>  1  -80m  0.0648
#>  2  -75m -0.0651
#>  3  -70m -0.0316
#>  4  -65m  0.0679
#>  5  -60m  0.0635
#>  6  -55m -0.158 
#>  7  -50m  0.0444
#>  8  -45m  0.0497
#>  9  -40m  0.0267
#> 10  -35m -0.0503
#> # … with 23 more rows

Created on 2020-11-02 by the reprex package (v0.3.0)
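
If you would rather stay in base R, the errors in the question most likely come from passing whole tables to `ccf()`: it expects two univariate series, and a data.table or an xts object (which is a matrix underneath) does not qualify. Below is a minimal sketch, assuming both tables share the same 5-minute timestamps. The data are simulated stand-ins for the real measurements, and the names `both`, `res`, `out` and `bound` are only illustrative.

```r
# Simulated stand-ins for the question's tables (random values, not the real data)
set.seed(1)
time <- seq(as.POSIXct("2018-01-01 08:00:00", tz = "UTC"),
            by = "5 min", length.out = 31)
link420 <- data.frame(time = time, diff420 = rnorm(31))
link423 <- data.frame(time = time, diff423 = rnorm(31))

# Align the two series on their shared timestamps
both <- merge(link420, link423, by = "time")

# Pass plain numeric vectors, not the tables themselves
res <- ccf(both$diff420, both$diff423, lag.max = 10, plot = FALSE)

# Lags come back in units of one observation; one observation is 5 minutes here
out <- data.frame(lag_min = as.vector(res$lag) * 5,
                  ccf     = as.vector(res$acf))

# Approximate 95% bound under the assumption of no cross-correlation,
# the same rule of thumb used for the dashed lines in the default ccf() plot
bound <- qnorm(0.975) / sqrt(nrow(both))
out[abs(out$ccf) > bound, ]
```

Per the `ccf()` documentation, the value at lag k estimates the correlation between `x[t+k]` and `y[t]`, so here a negative `lag_min` pairs earlier values of `diff420` with later values of `diff423`. With random data like this, any lag that crosses the bound is just noise; with the real series, the rows that survive the filter are the 5-minute offsets worth a closer look.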