How to create a time series with tibbletime with parameterized dates?

I want to create a time series with tibbletime for specific dates. I have:

Data_Start<-"2015-09-07 01:55:00 UTC"
Data_End<-"2015-09-10 01:59:00 UTC"

I'd like to create a time series sampled every minute, like this:

create_series(2015-09-07 + 01:55:00 ~ 2015-09-10 + 01:59:00,1~M)

The argument should be a time_formula, described on page 17 of https://cran.r-project.org/web/packages/tibbletime/tibbletime.pdf

This works, but I can't pass the dates in as variables, like this:

create_series(Data_Start~Data_End,1~M)

I've tried different ways of converting the strings, but nothing has worked so far :(

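I used the forecast package to create a time series with multiple seasonalities, at one-minute frequency between the times above. The seasonal periods to use depend on your requirements and on the length of the data.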

library(forecast)
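# Set the time zone explicitly; a "UTC" suffix inside the string is not parsed
Data_Start <- as.POSIXct("2015-09-07 01:55:00", tz = "UTC")
Data_End   <- as.POSIXct("2015-09-10 01:59:00", tz = "UTC")

# One timestamp per minute, paired with random placeholder values
tt <- seq.POSIXt(Data_Start, Data_End, by = "min")
df <- data.frame(tt = tt,
                 val = sample(1:40, length(tt), replace = TRUE),
                 stringsAsFactors = FALSE)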

# Seasonality Hourly, Daily
mts = msts(df$val,seasonal.periods = c(60,1440),start = Data_Start)
# Seasonality Hourly, Daily, Weekly
mts = msts(df$val,seasonal.periods = c(60,1440,10080),start = Data_Start)

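Author of tibbletime here. An issue about this was raised recently on GitHub. The solution is to pre-build the formula with rlang::new_formula(). If you use POSIXct dates, you also need a special helper function to handle adding the + to the formula.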

Here is the helper:

# Time formula creator
# Can pass character, Date, POSIXct
create_time_formula <- function(lhs, rhs) {

  if(!inherits(lhs, c("character", "Date", "POSIXct"))) {
    stop("LHS must be a character or date")
  }
  if(!inherits(rhs, c("character", "Date", "POSIXct"))) {
    stop("RHS must be a character or date")
  }

  if(inherits(lhs, "Date")) {
    lhs <- as.character(lhs)
  } else if (inherits(lhs, "POSIXct")) {
    lhs <- gsub(" ", " + ", lhs)
  }

  if(inherits(rhs, "Date")) {
    rhs <- as.character(rhs)
  } else if (inherits(rhs, "POSIXct")) {
    rhs <- gsub(" ", " + ", rhs)
  }

  rlang::new_formula(lhs, rhs)
}
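
To make the POSIXct branch of the helper more concrete, here is a minimal base R sketch of the string it is meant to build for one side of the formula. It is an illustration rather than part of the helper itself: the variable names are hypothetical, and it calls format() with an explicit pattern (an assumption on my part, so the result does not depend on how as.character() renders date-times) before inserting the + between the date and the time.

```r
# Illustration only: base R, no tibbletime required.
# `x`, `stamp` and `side` are hypothetical names, not part of the helper above.
x <- as.POSIXct("2015-09-07 01:55:00", tz = "UTC")

# Render the date-time with an explicit pattern so the text is predictable
stamp <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Insert " + " between the date part and the time part
side <- gsub(" ", " + ", stamp, fixed = TRUE)

print(side)
#> [1] "2015-09-07 + 01:55:00"
```

This matches the 2015-09-07 + 01:55:00 shape used on each side of the time formula in the question.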

Use the helper function with date-time (POSIXct) versions of your start and end dates:

Data_Start<- as.POSIXct("2015-09-07 01:55:00")
Data_End  <- as.POSIXct("2015-09-10 01:59:00")

time_formula <- create_time_formula(Data_Start, Data_End)

create_series(time_formula, 1~M, tz = "UTC")

Which produces:

# A time tibble: 4,325 x 1
# Index: date
                  date
                <dttm>
 1 2015-09-07 01:55:00
 2 2015-09-07 01:56:00
 3 2015-09-07 01:57:00
 4 2015-09-07 01:58:00
 5 2015-09-07 01:59:00
 6 2015-09-07 02:00:00
 7 2015-09-07 02:01:00
 8 2015-09-07 02:02:00
 9 2015-09-07 02:03:00
10 2015-09-07 02:04:00
# ... with 4,315 more rows

In a future version of tibbletime, I will likely include a more robust version of the create_time_formula() helper for cases like this.


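Update: tibbletime 0.1.0 has been released, and a more robust implementation allows variables to be used directly in the formula. In addition, each side of the formula must be either a character or an object of the same class as the index (i.e. 2013 ~ 2014 should be "2013" ~ "2014").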

library(tibbletime)

Data_Start<- as.POSIXct("2015-09-07 01:55:00")
Data_End  <- as.POSIXct("2015-09-10 01:59:00")

create_series(Data_Start ~ Data_End, "1 min")
#> # A time tibble: 4,325 x 1
#> # Index: date
#>    date               
#>    <dttm>             
#>  1 2015-09-07 01:55:00
#>  2 2015-09-07 01:56:00
#>  3 2015-09-07 01:57:00
#>  4 2015-09-07 01:58:00
#>  5 2015-09-07 01:59:00
#>  6 2015-09-07 02:00:00
#>  7 2015-09-07 02:01:00
#>  8 2015-09-07 02:02:00
#>  9 2015-09-07 02:03:00
#> 10 2015-09-07 02:04:00
#> # ... with 4,315 more rows
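
As a quick cross-check of the 4,325 rows reported above, the same one-minute grid can be counted with base R alone. This is only a sanity-check sketch, not a replacement for create_series(); the variable names are hypothetical and the time zone is assumed to be UTC.

```r
# Sanity check with base R only; names here are illustrative.
start_time <- as.POSIXct("2015-09-07 01:55:00", tz = "UTC")
end_time   <- as.POSIXct("2015-09-10 01:59:00", tz = "UTC")

# One element per minute, both endpoints included
minute_grid <- seq(start_time, end_time, by = "min")

# 3 days and 4 minutes apart = 4324 one-minute steps, plus the starting point
n_points <- length(minute_grid)
print(n_points)
#> [1] 4325
```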