Struggling to regularize a time series using tsibble

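I'm struggling to regularize a time series using the tsibble package. The documentation suggests this can be done with index_by() + summarise(), but I'm clearly missing some detail. Here is what I tried: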

library(tidyverse)
library(lubridate)
library(tsibble)

# example data set
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
volume <- c(210749, 287555, 378965, 308935)
n <- c(5, 7, 10, 8)
tbl <- tibble(date, fish, volume, n)
tsbl <- tsibble(tbl, index = date, regular = FALSE)
  
# regularize the tsibble (ie time series)
tsbl %>% 
  index_by(date, unit = "day") %>% # unit value "day" is intuitive but incorrect?
  mutate(week = isoweek(date)) %>% # add (numeric) week column
  summarise(date = date,
            fish = sum(fish),
            volume = sum(volume),
            n = sum(n), 
            cpue = fish/volume) # calculate catch per unit effort

TIA!

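With so little information about what you are actually trying to do, I will have to guess.

Perhaps you want daily data in which every day is explicitly included. In that case, do the following: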

library(tidyverse)
library(lubridate)
library(tsibble)

# example data set
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
volume <- c(210749, 287555, 378965, 308935)
n <- c(5, 7, 10, 8)
tbl <- tibble(date, fish, volume, n)
tsbl <- tsibble(tbl, index = date, regular = TRUE) %>%
  fill_gaps()
tsbl
#> # A tsibble: 15 x 4 [1D]
#>    date        fish volume     n
#>    <date>     <dbl>  <dbl> <dbl>
#>  1 1976-05-18   203 210749     5
#>  2 1976-05-19   282 287555     7
#>  3 1976-05-20    NA     NA    NA
#>  4 1976-05-21    NA     NA    NA
#>  5 1976-05-22    NA     NA    NA
#>  6 1976-05-23    NA     NA    NA
#>  7 1976-05-24   301 378965    10
#>  8 1976-05-25    NA     NA    NA
#>  9 1976-05-26    NA     NA    NA
#> 10 1976-05-27    NA     NA    NA
#> 11 1976-05-28    NA     NA    NA
#> 12 1976-05-29    NA     NA    NA
#> 13 1976-05-30    NA     NA    NA
#> 14 1976-05-31    NA     NA    NA
#> 15 1976-06-01    89 308935     8

Created on 2022-05-20 by the reprex package (v2.0.1)
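Before filling, it can help to see where the implicit gaps are. The following is a minimal sketch using the same four observations as above. `count_gaps()` and `fill_gaps()` are tsibble functions; the object names `daily`, `gaps` and `filled` are only illustrative.

```r
library(tsibble)

# Same four observations as in the example above
daily <- as_tsibble(
  data.frame(
    date   = as.Date(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01")),
    fish   = c(203, 282, 301, 89),
    volume = c(210749, 287555, 378965, 308935),
    n      = c(5, 7, 10, 8)
  ),
  index = date
)

# One row per run of missing days: first day, last day, and length of the run
gaps <- count_gaps(daily)
gaps

# Turn the implicit gaps into explicit rows; measured columns become NA
filled <- fill_gaps(daily)
filled
```

If the number of rows in `filled` equals the number of observed days plus the total of `gaps$.n`, every missing day has been made explicit.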

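I'm not sure what you are trying to achieve with the summarise, but perhaps you want to create weekly data from these daily data. In that case, do the following: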

tsbl %>% 
  mutate(week = isoweek(date)) %>% # add (numeric) week column
  index_by(week) %>%
  summarise(fish = sum(fish, na.rm=TRUE),
            volume = sum(volume, na.rm=TRUE),
            n = sum(n, na.rm=TRUE), 
            cpue = fish/volume) # calculate catch per unit effort
#> # A tsibble: 3 x 5 [1]
#>    week  fish volume     n     cpue
#>   <dbl> <dbl>  <dbl> <dbl>    <dbl>
#> 1    21   485 498304    12 0.000973
#> 2    22   301 378965    10 0.000794
#> 3    23    89 308935     8 0.000288

Created on 2022-05-20 by the reprex package (v2.0.1)
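One caveat with a numeric `isoweek()` index is that it drops the year, so in a series spanning several years, week 21 of each year would be summed together. A possible alternative, sketched here under the assumption that weekly totals are what is wanted, is to let `index_by()` build a `yearweek()` index directly from the dates. The object names `tbl` and `weekly` are illustrative, and the tsibble is created with the default `regular = TRUE`.

```r
library(tsibble)
library(dplyr)

# Same four observations as in the question
tbl <- data.frame(
  date   = as.Date(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01")),
  fish   = c(203, 282, 301, 89),
  volume = c(210749, 287555, 378965, 308935),
  n      = c(5, 7, 10, 8)
)

weekly <- tbl %>%
  as_tsibble(index = date) %>%
  index_by(week = ~ yearweek(.)) %>%   # year-aware weekly index built from date
  summarise(
    fish   = sum(fish),
    volume = sum(volume),
    n      = sum(n)
  ) %>%
  mutate(cpue = fish / volume)         # catch per unit effort, per week

weekly
```

This aggregates the observed rows directly, so `fill_gaps()` is not needed first. If some weeks had no observations at all, calling `fill_gaps()` on the weekly result would make them explicit.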