tsibble -- how do you get around implicit gaps when there are none?

I'm new to the tsibble package. I have monthly data that I need to use with the fable package. Here are the problems I'm running into:

library(dplyr)
library(fable)
library(lubridate)
library(tsibble)

test <- data.frame(
   YearMonth = c(20160101, 20160201, 20160301, 20160401, 20160501, 20160601,
                 20160701, 20160801, 20160901, 20161001, 20161101, 20161201),
      Claims = c(13032647, 1668005, 24473616, 13640769, 17891432, 11596556,
                 23176360, 7885872, 11948461, 16194792, 4971310, 18032363),
     Revenue = c(12603367, 18733242, 5862766, 3861877, 15407158, 24534258,
                 15633646, 13720258, 24944078, 13375742, 4537475, 22988443)
)

test_ts <- test %>% 
  mutate(YearMonth = ymd(YearMonth)) %>% 
  as_tsibble(
    index = YearMonth,
    regular = FALSE       #because it picks up gaps when I set it to TRUE
    )

# Are there any gaps?
has_gaps(test_ts, .full = TRUE)

model_new <- test_ts %>%
  model(snaive = SNAIVE(Claims))
#> Warning messages:
#> 1: 1 error encountered for snaive
#> [1] .data contains implicit gaps in time. You should check your data and convert implicit gaps into explicit missing values using `tsibble::fill_gaps()` if required.

Any help would be appreciated.

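It looks as though as_tsibble can't correctly recognise the interval of the YearMonth column because it is a Date class object. That this can be a problem is tucked away in the 'Index' section of the help page: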

For a tbl_ts of regular interval, a choice of index representation has to be made. For example, a monthly data should correspond to time index created by yearmonth or zoo::yearmon, instead of Date or POSIXct.

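As that excerpt suggests, you can solve this with yearmonth(). It does need a little string manipulation first, to turn the values into a format that parses correctly.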

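test_ts <- test %>%
  # Turn e.g. 20160101 into "2016-01": the backreference must be written
  # as "\\1" inside an R string
  mutate(YearMonth = gsub("(.{2})01$", "-\\1", YearMonth) %>%
           yearmonth()
         ) %>%
  as_tsibble(
    index = YearMonth
  )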

Now the model should run without errors! I'm not sure why the has_gaps() test says everything is fine in your example...
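To see where the gaps come from, it can help to compare the same data under the different index choices. The following is only a sketch, not a definitive diagnosis: the object names `irregular_ts`, `regular_date_ts` and `monthly_ts` are made up for illustration, and only the Claims column is kept for brevity. It assumes that a regular Date index is given a one-day interval, which is consistent with the question's remark that gaps appear when `regular = TRUE`.

```r
library(dplyr)
library(tsibble)

test <- data.frame(
  YearMonth = c(20160101, 20160201, 20160301, 20160401, 20160501, 20160601,
                20160701, 20160801, 20160901, 20161001, 20161101, 20161201),
  Claims = c(13032647, 1668005, 24473616, 13640769, 17891432, 11596556,
             23176360, 7885872, 11948461, 16194792, 4971310, 18032363)
)

# Date index, irregular: this is the setup from the question
irregular_ts <- test %>%
  mutate(YearMonth = lubridate::ymd(YearMonth)) %>%
  as_tsibble(index = YearMonth, regular = FALSE)

# Date index, regular: the interval is inferred from the spacing of the dates
regular_date_ts <- test %>%
  mutate(YearMonth = lubridate::ymd(YearMonth)) %>%
  as_tsibble(index = YearMonth)

# yearmonth index, built with the same string manipulation as above
monthly_ts <- test %>%
  mutate(YearMonth = yearmonth(gsub("(.{2})01$", "-\\1", YearMonth))) %>%
  as_tsibble(index = YearMonth)

# Compare the inferred intervals and the gap checks
interval(regular_date_ts)
interval(monthly_ts)
has_gaps(irregular_ts)
has_gaps(regular_date_ts)
has_gaps(monthly_ts)
```

If the assumption holds, only the regular Date version reports gaps, because the first of each month is 28 to 31 days away from the next one.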

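You have a daily index, but you want a monthly index. The simplest way is to use the tsibble::yearmonth() function, but you need to convert the dates to character first.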

library(dplyr)
library(tsibble)

test <- data.frame(
  YearMonth = c(20160101, 20160201, 20160301, 20160401, 20160501, 20160601,
    20160701, 20160801, 20160901, 20161001, 20161101, 20161201),
  Claims = c(13032647, 1668005, 24473616, 13640769, 17891432, 11596556,
    23176360, 7885872, 11948461, 16194792, 4971310, 18032363),
  Revenue = c(12603367, 18733242, 5862766, 3861877, 15407158, 24534258,
    15633646, 13720258, 24944078, 13375742, 4537475, 22988443)
)

test_ts <- test %>%
  mutate(YearMonth = yearmonth(as.character(YearMonth))) %>%
  as_tsibble(index = YearMonth)
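
With a monthly index in place, the model call from the question can be retried. This is a minimal sketch that reuses the question's `SNAIVE(Claims)` specification unchanged, with only the Claims column kept for brevity. The data cover a single year, so a seasonal naive model has very little to work with, and the result should be read with that in mind.

```r
library(dplyr)
library(tsibble)
library(fable)

test <- data.frame(
  YearMonth = c(20160101, 20160201, 20160301, 20160401, 20160501, 20160601,
                20160701, 20160801, 20160901, 20161001, 20161101, 20161201),
  Claims = c(13032647, 1668005, 24473616, 13640769, 17891432, 11596556,
             23176360, 7885872, 11948461, 16194792, 4971310, 18032363)
)

# Monthly index via yearmonth(), as in the answer above
test_ts <- test %>%
  mutate(YearMonth = yearmonth(as.character(YearMonth))) %>%
  as_tsibble(index = YearMonth)

# Same model specification as in the question
model_new <- test_ts %>%
  model(snaive = SNAIVE(Claims))

model_new
```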