R: unexpected behaviour of bizdays::adjust.previous when checking if a date is NA
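I am trying to use the bizdays package to convert the dates in a data frame to business days. The data frame may contain missing values (NA), so I added an ifelse statement to skip those empty cells, but it seems to break the code and I don't know why.
Here is a small example of the problem: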
library(bizdays)
library(dplyr)
holidays <- c("2022-03-01",
"2022-03-07",
"2022-03-08",
"2022-03-25")
start_date = as.Date("01/01/2010", format = "%d/%m/%Y")
end_date = as.Date("01/01/2060", format = "%d/%m/%Y")
calendar <- create.calendar("my_cal",
holidays = holidays,
weekdays =c("saturday", "sunday"),
start.date = start_date,
end.date = end_date)
bizdays.options$set(default.calendar="my_cal")
date_1 <- "2022-03-13" # sunday
print(adjust.previous(date_1)) # friday "2022-03-11"
days <- c()
for (i in c(1:31)) {
days <- c(days, paste("2022-03-", formatC(i, width = 2, flag = '0'), sep = ""))
}
df <- data.frame(days = days)
df_1 <- df %>% mutate(days_1 = adjust.previous(days))
head(df_1) # correct
# days days_1
#1 2022-03-01 2022-02-28
#2 2022-03-02 2022-03-02
#3 2022-03-03 2022-03-03
#4 2022-03-04 2022-03-04
#5 2022-03-05 2022-03-04
#6 2022-03-06 2022-03-04
df_2 <- df %>% mutate(days_2 = ifelse(is.na(days),
days,
adjust.previous(days)))
head(df_2) # date is converted to a number
# days days_2
#1 2022-03-01 19051
#2 2022-03-02 19053
#3 2022-03-03 19054
#4 2022-03-04 19055
#5 2022-03-05 19055
#6 2022-03-06 19055
This has nothing to do with the bizdays package. It comes from how ifelse() handles objects of class Date: it returns them as plain numbers. See this example:
class(Sys.Date()) # Date
ifelse(TRUE, Sys.Date(), Sys.Date()) # 19066
class(ifelse(TRUE, Sys.Date(), Sys.Date())) # numeric
In contrast, a plain if keeps the class:
if(TRUE) class(Sys.Date()) # Date
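If you do want to keep the ifelse() call, one possible workaround is to convert the numeric result back to a Date afterwards. The sketch below uses only base R; `d - 1` is just a stand-in for adjust.previous(), so that the example runs without a calendar being set up.

```r
# A Date vector with one missing value
d <- as.Date(c("2022-03-01", NA, "2022-03-05"))

# ifelse() drops the Date class and returns the underlying day counts
# (`d - 1` is a stand-in for adjust.previous(d), used here for illustration only)
stripped <- ifelse(is.na(d), d, d - 1)
class(stripped)  # "numeric"

# Convert the day counts back to Date, counting from the Unix epoch
restored <- as.Date(stripped, origin = "1970-01-01")
class(restored)  # "Date"
print(restored)
```

Since you are already using dplyr, dplyr::if_else() may be another option, as it is designed to preserve the type of its inputs. It requires both branches to have the same type, so the character column would probably have to be wrapped in as.Date() first.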
In your case, ifelse() looks unnecessary to me, because adjust.previous handles NA values:
df$days[1] = NA
df_2 <- df %>% mutate(
days_2 = adjust.previous(days)
)
# Seems to work
head(df_2)
# days days_2
# 1 <NA> <NA>
# 2 2022-03-02 2022-03-02
# 3 2022-03-03 2022-03-03
# 4 2022-03-04 2022-03-04
# 5 2022-03-05 2022-03-04
# 6 2022-03-06 2022-03-04
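However, if this does not work on your real data, I would leave the dplyr world, which is nice but slightly weaker when it comes to subsetting columns, and do it in base R: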
df_3 <- df
df_3$days_3 <- as.Date(0, origin = "1970-01-01") # Create date column
df_3$days_3[is.na(df_3$days)] <- NA # Fill NA
df_3$days_3[!is.na(df_3$days)] <- adjust.previous(df_3$days[!is.na(df_3$days)]) # Fill values
# Output as above
head(df_3)
# days days_3
# 1 <NA> <NA>
# 2 2022-03-02 2022-03-02
# 3 2022-03-03 2022-03-03
# 4 2022-03-04 2022-03-04
# 5 2022-03-05 2022-03-04
# 6 2022-03-06 2022-03-04
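The base R pattern above could also be wrapped in a small helper if you need it for several columns. This is only a sketch: `apply_to_non_na` and `roll_back_weekend` are hypothetical names, and `roll_back_weekend` is a simplified stand-in for adjust.previous() that ignores holidays, so that the example runs without bizdays.

```r
# Hypothetical helper: apply `f` only to the non-missing elements of a Date
# vector and keep NA everywhere else. The result stays of class Date.
apply_to_non_na <- function(x, f) {
  out <- structure(rep(NA_real_, length(x)), class = "Date")  # all-NA Date vector
  keep <- !is.na(x)
  out[keep] <- f(x[keep])
  out
}

# Simplified stand-in for adjust.previous(): move Saturday and Sunday back to
# Friday. Unlike bizdays, it knows nothing about holidays.
roll_back_weekend <- function(d) {
  wd <- as.integer(format(d, "%u"))  # ISO weekday: 1 = Monday, ..., 7 = Sunday
  d - pmax(wd - 5L, 0L)
}

x <- as.Date(c(NA, "2022-03-02", "2022-03-05", "2022-03-06"))
result <- apply_to_non_na(x, roll_back_weekend)
print(result)
```

With a bizdays calendar set up as in the question, adjust.previous could be passed in place of roll_back_weekend.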