Round date to next weekday in R
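I am currently struggling with some date transformations in R. I have a large financial dataset with a date column. Since securities are not traded on weekends, I need only weekdays in my dataset. How can I round the dates in this column to the previous weekday? Every Saturday and Sunday should be converted to the preceding Friday. In the extract below, the first date is a Saturday and the second is a Sunday. I would like to convert both to 2007-03-02 and leave the other rows as they are.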
# A tibble: 6 x 5
Ticker Date mean_PX_ASK mean_PX_BID Agency
<chr> <date> <dbl> <dbl> <chr>
1 ABNANV 2007-03-03 102. 102. Moody's
2 ABNANV 2007-03-04 102. 102. Moody's
3 ABNANV 2007-03-12 102. 102. Moody's
4 ABNANV 2007-03-12 102. 102. Moody's
5 ABNANV 2008-09-17 88.9 88.4 Fitch
6 ABNANV 2008-09-17 88.9 88.4 Fitch
Happy for any help!
A simple solution is to use case_when from dplyr to check whether the weekday of each date is "Saturday" or "Sunday" and subtract the corresponding number of days.
library(dplyr)
df %>%
  mutate(Day = weekdays(Date),
         Date = case_when(Day == "Saturday" ~ Date - 1,
                          Day == "Sunday" ~ Date - 2,
                          TRUE ~ Date)) %>%
  select(-Day)
# Ticker Date mean_PX_ASK mean_PX_BID Agency
#1 ABNANV 2007-03-02 102.0 102.0 Moody's
#2 ABNANV 2007-03-02 102.0 102.0 Moody's
#3 ABNANV 2007-03-12 102.0 102.0 Moody's
#4 ABNANV 2007-03-12 102.0 102.0 Moody's
#5 ABNANV 2008-09-17 88.9 88.4 Fitch
#6 ABNANV 2008-09-17 88.9 88.4 Fitch
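One caveat with the weekdays() approach: the day names it returns depend on the session locale, so comparing against "Saturday" and "Sunday" may silently match nothing on a non-English system. A minimal locale-independent sketch in base R could use the numeric wday field of as.POSIXlt instead. The helper name previous_weekday is hypothetical and not part of the answer above.

```r
# Hypothetical helper (name chosen for illustration only).
# as.POSIXlt()$wday is numeric and locale-independent: 0 = Sunday, ..., 6 = Saturday.
previous_weekday <- function(x) {
  wd <- as.POSIXlt(x)$wday
  # Days to subtract, indexed by wd + 1: Sunday -> 2, Monday..Friday -> 0, Saturday -> 1
  shift <- c(2, 0, 0, 0, 0, 0, 1)[wd + 1]
  x - shift
}

dates <- as.Date(c("2007-03-02", "2007-03-03", "2007-03-04", "2007-03-12"))
previous_weekday(dates)
# [1] "2007-03-02" "2007-03-02" "2007-03-02" "2007-03-12"
```

Inside a dplyr pipeline this could be applied as mutate(Date = previous_weekday(Date)), again assuming the helper defined here.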
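With bizdays, we first need to create a calendar using create.calendar, passing the weekend days through its weekdays argument (the days treated as non-working days). We can then use adjust.previous to get the previous working day.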
library(bizdays)
cal <- create.calendar("Actual", weekdays=c("saturday", "sunday"))
adjust.previous(df$Date, cal)
#[1] "2007-03-02" "2007-03-02" "2007-03-12" "2007-03-12" "2008-09-17" "2008-09-17"
In base R, you can use format.Date with the format string %u, which returns the weekday as a number from 1 (Monday) to 7 (Sunday).
dates <- as.Date(c('2007-03-02', '2007-03-03', '2007-03-04'))
wd <- as.integer(format(dates, '%u'))
as.Date(ifelse(wd >= 6, dates + 5 - wd, dates), origin = '1970-01-01')
#[1] "2007-03-02" "2007-03-02" "2007-03-02"
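To spell out the arithmetic: with %u, Saturday is 6 and Sunday is 7, so dates + 5 - wd subtracts 1 and 2 days respectively. The as.Date(..., origin = '1970-01-01') wrapper is needed because ifelse() drops the Date class. As a sketch of an alternative (not the answerer's code), the offset can be computed with pmax(), which avoids ifelse() and keeps the class intact:

```r
dates <- as.Date(c("2007-03-02", "2007-03-03", "2007-03-04"))
wd <- as.integer(format(dates, "%u"))   # 5 (Fri), 6 (Sat), 7 (Sun)

# ifelse() returns a plain numeric vector, so the Date class is lost
class(ifelse(wd >= 6, dates + 5 - wd, dates))
# [1] "numeric"

# Alternative without ifelse(): number of days past Friday, floored at zero
offset <- pmax(wd - 5L, 0L)             # 0 1 2
rounded <- dates - offset
rounded
# [1] "2007-03-02" "2007-03-02" "2007-03-02"
```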
Using wday from lubridate:
library(lubridate)
# Generate some data
dfdate <- seq.Date(from = as.Date("2019-04-26"), to = as.Date("2019-04-28"), by = "day")
dfdate
# [1] "2019-04-26" "2019-04-27" "2019-04-28"
# By default, wday() counts from Sunday, so Sunday is wday = 1 and Saturday is wday = 7
# Change all values to a Friday
dfdate[wday(dfdate) == 7] <- dfdate[wday(dfdate) == 7] - 1 # Saturdays to Fri
dfdate[wday(dfdate) == 1] <- dfdate[wday(dfdate) == 1] - 2 # Sundays to Fri
dfdate
# [1] "2019-04-26" "2019-04-26" "2019-04-26"
It can be done in one line, without any packages or ifelse, if we use a named vector:
df$Date <- with(df, Date - setNames(rep(0:2, c(5, 1, 1)), 1:7)[format(Date, "%u")])
df
# Ticker Date mean_PX_ASK mean_PX_BID Agency
#1 ABNANV 2007-03-02 102.0 102.0 Moody's
#2 ABNANV 2007-03-02 102.0 102.0 Moody's
#3 ABNANV 2007-03-12 102.0 102.0 Moody's
#4 ABNANV 2007-03-12 102.0 102.0 Moody's
#5 ABNANV 2008-09-17 88.9 88.4 Fitch
#6 ABNANV 2008-09-17 88.9 88.4 Fitch
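The one-liner is compact, so it may help to unpack it. The sketch below builds the same named lookup vector step by step on a few sample dates. The variable names offsets and key are chosen here for illustration, and unname() is an addition so that the result does not carry the lookup names along.

```r
# Lookup table: names are the %u weekday codes ("1" = Monday ... "7" = Sunday),
# values are the number of days to subtract
offsets <- setNames(rep(0:2, c(5, 1, 1)), 1:7)
offsets
# 1 2 3 4 5 6 7 
# 0 0 0 0 0 1 2 

dates <- as.Date(c("2007-03-02", "2007-03-03", "2007-03-04"))
key <- format(dates, "%u")        # character codes: "5" "6" "7"

# Indexing by character selects elements by name, not by position
result <- dates - unname(offsets[key])
result
# [1] "2007-03-02" "2007-03-02" "2007-03-02"
```

Because format() returns character strings, the subsetting is by name. Here names and positions coincide, but the named vector makes the intent explicit.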
Benchmarks
Using a bigger dataset
df1 <- df[rep(seq_len(nrow(df)), 1e7), ]
system.time({
  df1 %>%
    mutate(Day = weekdays(Date),
           Date = case_when(Day == "Saturday" ~ Date - 1,
                            Day == "Sunday" ~ Date - 2,
                            TRUE ~ Date)) %>%
    select(-Day)
})
# user system elapsed
# 41.468 6.881 49.588
system.time({
with(df1, Date - setNames(rep(0:2, c(5, 1, 1)), 1:7)[format(Date, "%u")])
})
# user system elapsed
# 27.456 2.785 30.490
With microbenchmark:
library(microbenchmark)
microbenchmark(
  rs = df1 %>%
    mutate(Day = weekdays(Date),
           Date = case_when(Day == "Saturday" ~ Date - 1,
                            Day == "Sunday" ~ Date - 2,
                            TRUE ~ Date)) %>%
    select(-Day),
  ak = with(df1, Date - setNames(rep(0:2, c(5, 1, 1)), 1:7)[format(Date, "%u")]),
  times = 10L, unit = "relative")
#Unit: relative
# expr min lq mean median uq max neval cld
# rs 1.401658 1.437164 1.446403 1.421731 1.512451 1.467511 10 b
# ak 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 10 a
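Before reading much into timings, it can be worth confirming that the approaches being compared return the same dates. The following is a small base R sketch of such a check, comparing the ifelse() variant with the named-vector variant over two years of consecutive days. The date range and the variable names are arbitrary choices made here.

```r
# Every calendar day across two years, so all weekdays are covered many times
dates <- seq(as.Date("2007-01-01"), as.Date("2008-12-31"), by = "day")
wd <- as.integer(format(dates, "%u"))

# ifelse() variant (needs as.Date() to restore the class)
via_ifelse <- as.Date(ifelse(wd >= 6, dates + 5 - wd, dates), origin = "1970-01-01")

# Named-vector variant
lookup <- setNames(rep(0:2, c(5, 1, 1)), 1:7)
via_lookup <- dates - unname(lookup[format(dates, "%u")])

# Both variants agree, and no result falls on a Saturday or Sunday
stopifnot(all(via_ifelse == via_lookup))
stopifnot(all(as.integer(format(via_lookup, "%u")) <= 5))
```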
data
df <- structure(list(Ticker = c("ABNANV", "ABNANV", "ABNANV", "ABNANV",
"ABNANV", "ABNANV"), Date = structure(c(13575, 13576, 13584,
13584, 14139, 14139), class = "Date"), mean_PX_ASK = c(102, 102,
102, 102, 88.9, 88.9), mean_PX_BID = c(102, 102, 102, 102, 88.4,
88.4), Agency = c("Moody's", "Moody's", "Moody's", "Moody's",
"Fitch", "Fitch")), row.names = c("1", "2", "3", "4", "5", "6"
), class = "data.frame")