Struggling to regularize a time series using tsibble
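I am struggling to regularize a time series using the tsibble package. The documentation suggests this can be done with index_by() + summarise(), but I am clearly missing some detail. Here is what I tried: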
library(tidyverse)
library(lubridate)
library(tsibble)
# example data set
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
volume <- c(210749, 287555, 378965, 308935)
n <- c(5, 7, 10, 8)
tbl <- tibble(date, fish, volume, n)
tsbl <- tsibble(tbl, index = date, regular = FALSE)
# regularize the tsibble (ie time series)
tsbl %>%
  index_by(date, unit = "day") %>% # unit value "day" is intuitive but incorrect?
  mutate(week = isoweek(date)) %>% # add (numeric) week column
  summarise(date = date,
            fish = sum(fish),
            volume = sum(volume),
            n = sum(n),
            cpue = fish/volume) # calculate catch per unit effort
TIA!
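You have given so little information about what you are actually trying to do that I will have to guess.
Perhaps you want daily data in which every day is explicitly included. In that case, build the tsibble with regular = TRUE and call fill_gaps(), which adds a row of NA values for each missing day: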
library(tidyverse)
library(lubridate)
library(tsibble)
# example data set
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
volume <- c(210749, 287555, 378965, 308935)
n <- c(5, 7, 10, 8)
tbl <- tibble(date, fish, volume, n)
tsbl <- tsibble(tbl, index = date, regular = TRUE) %>%
  fill_gaps()
tsbl
#> # A tsibble: 15 x 4 [1D]
#>    date        fish volume     n
#>    <date>     <dbl>  <dbl> <dbl>
#>  1 1976-05-18   203 210749     5
#>  2 1976-05-19   282 287555     7
#>  3 1976-05-20    NA     NA    NA
#>  4 1976-05-21    NA     NA    NA
#>  5 1976-05-22    NA     NA    NA
#>  6 1976-05-23    NA     NA    NA
#>  7 1976-05-24   301 378965    10
#>  8 1976-05-25    NA     NA    NA
#>  9 1976-05-26    NA     NA    NA
#> 10 1976-05-27    NA     NA    NA
#> 11 1976-05-28    NA     NA    NA
#> 12 1976-05-29    NA     NA    NA
#> 13 1976-05-30    NA     NA    NA
#> 14 1976-05-31    NA     NA    NA
#> 15 1976-06-01    89 308935     8
Created on 2022-05-20 by the reprex package (v2.0.1)
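Before filling, it can help to see where the implicit gaps actually are. The following is only a minimal sketch, assuming the same example data (rebuilt here so it runs on its own). has_gaps(), count_gaps() and fill_gaps() are tsibble functions; the names gaps and filled are just illustrative.

```r
library(tsibble)

# Same example data as in the question, rebuilt so this sketch is standalone
tbl <- tibble::tibble(
  date   = as.Date(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01")),
  fish   = c(203, 282, 301, 89),
  volume = c(210749, 287555, 378965, 308935),
  n      = c(5, 7, 10, 8)
)

# regular = TRUE is the default, so the daily interval is inferred from the dates
tsbl <- as_tsibble(tbl, index = date)

# Does the series have implicit gaps (days with no row at all)?
has_gaps(tsbl)

# One row per run of missing days, with its start, end and length
gaps <- count_gaps(tsbl)
gaps

# Turn the implicit gaps into explicit rows of NA
filled <- fill_gaps(tsbl)
nrow(filled)
```

Whether the NA rows should later be replaced by some value depends on what a missing day means in your data, so this sketch leaves them as NA.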
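I'm not sure what you want to achieve with the summarise step, but perhaps you want to build weekly data from this daily data. In that case, do the following: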
tsbl %>%
  mutate(week = isoweek(date)) %>% # add (numeric) week column
  index_by(week) %>%
  summarise(fish = sum(fish, na.rm = TRUE),
            volume = sum(volume, na.rm = TRUE),
            n = sum(n, na.rm = TRUE),
            cpue = fish/volume) # calculate catch per unit effort
#> # A tsibble: 3 x 5 [1]
#>    week  fish volume     n     cpue
#>   <dbl> <dbl>  <dbl> <dbl>    <dbl>
#> 1    21   485 498304    12 0.000973
#> 2    22   301 378965    10 0.000794
#> 3    23    89 308935     8 0.000288
Created on 2022-05-20 by the reprex package (v2.0.1)
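One caveat with isoweek(): it returns a bare week number, so if the data spanned more than one year, the same week number from different years would be summed together. A possible alternative, sketched here under the assumption that ISO weeks are what you want, is to index by tsibble's yearweek(), which keeps the year and gives a weekly index. The name weekly is just illustrative.

```r
library(dplyr)
library(tsibble)

# Same example data as in the question, rebuilt so this sketch is standalone
tsbl <- tibble(
  date   = as.Date(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01")),
  fish   = c(203, 282, 301, 89),
  volume = c(210749, 287555, 378965, 308935),
  n      = c(5, 7, 10, 8)
) %>%
  as_tsibble(index = date)

weekly <- tsbl %>%
  # group the daily index into year-week periods; `.` stands for the index
  index_by(week = ~ yearweek(.)) %>%
  summarise(
    fish   = sum(fish, na.rm = TRUE),
    volume = sum(volume, na.rm = TRUE),
    n      = sum(n, na.rm = TRUE),
    cpue   = fish / volume  # uses the weekly sums computed just above
  )

weekly
```

Because the sums are taken per week, there is no need to call fill_gaps() on the daily data first; any weeks with no observations at all could still be made explicit afterwards with fill_gaps() on the weekly result.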