在 R 中使用 cut.Date 并在周六开始周
Using cut.Date and Starting Weeks on Saturday in R
我目前正在以 4 周的时间间隔汇总数据,但我需要从星期六开始的星期。我在下面的可重现示例中使用下面的 cut.Date
行来创建一个指示每 4 周 window 的列,但我正在努力寻找资源(在 dplyr
或 lubridate
) 从星期六开始以 4 周的间隔进行汇总。
这是我的资料:
library(tidyverse)
# Example
example <- structure(list(
yr = c(2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016
),
date = structure(
c(17054, 17055, 17056, 17057, 17058, 17059,
17060, 17061, 17062, 17063, 17064, 17065, 17066, 17067, 17068,
17069, 17070, 17071, 17072, 17073, 17074, 17075, 17076, 17077,
17078, 17079, 17080, 17081, 17082, 17083, 17084, 17085, 17086,
17087, 17088, 17089, 17090, 17091, 17092, 17093), class = "Date"),
day_of_week = structure(
c(7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L,
3L, 4L),
.Label = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri",
"Sat"), class = c("ordered", "factor"))),
row.names = c(NA, -40L),
class = c("tbl_df", "tbl", "data.frame"))
# Using this in a mutate to generate the week
example %>%
mutate(wk = cut.Date(date, breaks = "4 week", labels = FALSE)) %>%
print()
我希望它看起来像下面这样:
# Ideal output
output <- structure(list(
yr = c(2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016),
date = structure(
c(17054, 17055, 17056, 17057, 17058, 17059,
17060, 17061, 17062, 17063, 17064, 17065, 17066, 17067, 17068,
17069, 17070, 17071, 17072, 17073, 17074, 17075, 17076, 17077,
17078, 17079, 17080, 17081, 17082, 17083, 17084, 17085, 17086,
17087, 17088, 17089, 17090, 17091, 17092, 17093), class = "Date"),
day_of_week = structure(
c(7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L,
3L, 4L),
.Label = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"),
class = c("ordered", "factor")),
wk = c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L)),
row.names = c(NA, -40L),
class = c("tbl_df", "tbl", "data.frame"))
同样,当使用星期六时,前 4 周从 2016-09-10 到 2016-10-08,这是我需要的。如有任何问题,请联系我,提前致谢!
我们将使用整数除法 (%/%
) 来确定“今天的日期”和数据中的第一个 "Fri"
之间发生了多少个 28 天的时间段。
首先,您的 output
在星期六开始和结束,这似乎并不一致。我假设您想在星期六开始每个 4 周的周期,所以它会在星期五结束。
基础 R
example$wk <- with(example, as.numeric(date - date[which.max(day_of_week == "Sat")]) %/% 28 + 1L)
print(example, n = 40)
# # A tibble: 40 x 4
# yr date day_of_week wk
# <dbl> <date> <ord> <dbl>
# 1 2016 2016-09-10 Sat 1
# 2 2016 2016-09-11 Sun 1
# 3 2016 2016-09-12 Mon 1
# 4 2016 2016-09-13 Tue 1
# 5 2016 2016-09-14 Wed 1
# 6 2016 2016-09-15 Thu 1
# 7 2016 2016-09-16 Fri 1
# 8 2016 2016-09-17 Sat 1
# 9 2016 2016-09-18 Sun 1
# 10 2016 2016-09-19 Mon 1
# 11 2016 2016-09-20 Tue 1
# 12 2016 2016-09-21 Wed 1
# 13 2016 2016-09-22 Thu 1
# 14 2016 2016-09-23 Fri 1
# 15 2016 2016-09-24 Sat 1
# 16 2016 2016-09-25 Sun 1
# 17 2016 2016-09-26 Mon 1
# 18 2016 2016-09-27 Tue 1
# 19 2016 2016-09-28 Wed 1
# 20 2016 2016-09-29 Thu 1
# 21 2016 2016-09-30 Fri 1
# 22 2016 2016-10-01 Sat 1
# 23 2016 2016-10-02 Sun 1
# 24 2016 2016-10-03 Mon 1
# 25 2016 2016-10-04 Tue 1
# 26 2016 2016-10-05 Wed 1
# 27 2016 2016-10-06 Thu 1
# 28 2016 2016-10-07 Fri 1
# 29 2016 2016-10-08 Sat 2
# 30 2016 2016-10-09 Sun 2
# 31 2016 2016-10-10 Mon 2
# 32 2016 2016-10-11 Tue 2
# 33 2016 2016-10-12 Wed 2
# 34 2016 2016-10-13 Thu 2
# 35 2016 2016-10-14 Fri 2
# 36 2016 2016-10-15 Sat 2
# 37 2016 2016-10-16 Sun 2
# 38 2016 2016-10-17 Mon 2
# 39 2016 2016-10-18 Tue 2
# 40 2016 2016-10-19 Wed 2
dplyr
library(dplyr)
example %>%
mutate(
wk = as.numeric(date - date[which.max(day_of_week == "Sat")]) %/% 28 + 1L
) %>%
print(n=99)
# # A tibble: 40 x 4
# yr date day_of_week wk
# <dbl> <date> <ord> <dbl>
# 1 2016 2016-09-10 Sat 1
# 2 2016 2016-09-11 Sun 1
# 3 2016 2016-09-12 Mon 1
# 4 2016 2016-09-13 Tue 1
# 5 2016 2016-09-14 Wed 1
# 6 2016 2016-09-15 Thu 1
# 7 2016 2016-09-16 Fri 1
# 8 2016 2016-09-17 Sat 1
# 9 2016 2016-09-18 Sun 1
# 10 2016 2016-09-19 Mon 1
# 11 2016 2016-09-20 Tue 1
# 12 2016 2016-09-21 Wed 1
# 13 2016 2016-09-22 Thu 1
# 14 2016 2016-09-23 Fri 1
# 15 2016 2016-09-24 Sat 1
# 16 2016 2016-09-25 Sun 1
# 17 2016 2016-09-26 Mon 1
# 18 2016 2016-09-27 Tue 1
# 19 2016 2016-09-28 Wed 1
# 20 2016 2016-09-29 Thu 1
# 21 2016 2016-09-30 Fri 1
# 22 2016 2016-10-01 Sat 1
# 23 2016 2016-10-02 Sun 1
# 24 2016 2016-10-03 Mon 1
# 25 2016 2016-10-04 Tue 1
# 26 2016 2016-10-05 Wed 1
# 27 2016 2016-10-06 Thu 1
# 28 2016 2016-10-07 Fri 1
# 29 2016 2016-10-08 Sat 2
# 30 2016 2016-10-09 Sun 2
# 31 2016 2016-10-10 Mon 2
# 32 2016 2016-10-11 Tue 2
# 33 2016 2016-10-12 Wed 2
# 34 2016 2016-10-13 Thu 2
# 35 2016 2016-10-14 Fri 2
# 36 2016 2016-10-15 Sat 2
# 37 2016 2016-10-16 Sun 2
# 38 2016 2016-10-17 Mon 2
# 39 2016 2016-10-18 Tue 2
# 40 2016 2016-10-19 Wed 2
我目前正在以 4 周的时间间隔汇总数据,但我需要从星期六开始的星期。我在下面的可重现示例中使用下面的 cut.Date
行来创建一个指示每 4 周 window 的列,但我正在努力寻找资源(在 dplyr
或 lubridate
) 从星期六开始以 4 周的间隔进行汇总。
这是我的资料:
library(tidyverse)
# Example
example <- structure(list(
yr = c(2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016
),
date = structure(
c(17054, 17055, 17056, 17057, 17058, 17059,
17060, 17061, 17062, 17063, 17064, 17065, 17066, 17067, 17068,
17069, 17070, 17071, 17072, 17073, 17074, 17075, 17076, 17077,
17078, 17079, 17080, 17081, 17082, 17083, 17084, 17085, 17086,
17087, 17088, 17089, 17090, 17091, 17092, 17093), class = "Date"),
day_of_week = structure(
c(7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L,
3L, 4L),
.Label = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri",
"Sat"), class = c("ordered", "factor"))),
row.names = c(NA, -40L),
class = c("tbl_df", "tbl", "data.frame"))
# Using this in a mutate to generate the week
example %>%
mutate(wk = cut.Date(date, breaks = "4 week", labels = FALSE)) %>%
print()
我希望它看起来像下面这样:
# Ideal output
output <- structure(list(
yr = c(2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016),
date = structure(
c(17054, 17055, 17056, 17057, 17058, 17059,
17060, 17061, 17062, 17063, 17064, 17065, 17066, 17067, 17068,
17069, 17070, 17071, 17072, 17073, 17074, 17075, 17076, 17077,
17078, 17079, 17080, 17081, 17082, 17083, 17084, 17085, 17086,
17087, 17088, 17089, 17090, 17091, 17092, 17093), class = "Date"),
day_of_week = structure(
c(7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L,
3L, 4L),
.Label = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"),
class = c("ordered", "factor")),
wk = c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L)),
row.names = c(NA, -40L),
class = c("tbl_df", "tbl", "data.frame"))
同样,当使用星期六时,前 4 周从 2016-09-10 到 2016-10-08,这是我需要的。如有任何问题,请联系我,提前致谢!
我们将使用整数除法 (%/%
) 来确定“今天的日期”和数据中的第一个 "Fri"
之间发生了多少个 28 天的时间段。
首先,您的 output
在星期六开始和结束,这似乎并不一致。我假设您想在星期六开始每个 4 周的周期,所以它会在星期五结束。
基础 R
example$wk <- with(example, as.numeric(date - date[which.max(day_of_week == "Sat")]) %/% 28 + 1L)
print(example, n = 40)
# # A tibble: 40 x 4
# yr date day_of_week wk
# <dbl> <date> <ord> <dbl>
# 1 2016 2016-09-10 Sat 1
# 2 2016 2016-09-11 Sun 1
# 3 2016 2016-09-12 Mon 1
# 4 2016 2016-09-13 Tue 1
# 5 2016 2016-09-14 Wed 1
# 6 2016 2016-09-15 Thu 1
# 7 2016 2016-09-16 Fri 1
# 8 2016 2016-09-17 Sat 1
# 9 2016 2016-09-18 Sun 1
# 10 2016 2016-09-19 Mon 1
# 11 2016 2016-09-20 Tue 1
# 12 2016 2016-09-21 Wed 1
# 13 2016 2016-09-22 Thu 1
# 14 2016 2016-09-23 Fri 1
# 15 2016 2016-09-24 Sat 1
# 16 2016 2016-09-25 Sun 1
# 17 2016 2016-09-26 Mon 1
# 18 2016 2016-09-27 Tue 1
# 19 2016 2016-09-28 Wed 1
# 20 2016 2016-09-29 Thu 1
# 21 2016 2016-09-30 Fri 1
# 22 2016 2016-10-01 Sat 1
# 23 2016 2016-10-02 Sun 1
# 24 2016 2016-10-03 Mon 1
# 25 2016 2016-10-04 Tue 1
# 26 2016 2016-10-05 Wed 1
# 27 2016 2016-10-06 Thu 1
# 28 2016 2016-10-07 Fri 1
# 29 2016 2016-10-08 Sat 2
# 30 2016 2016-10-09 Sun 2
# 31 2016 2016-10-10 Mon 2
# 32 2016 2016-10-11 Tue 2
# 33 2016 2016-10-12 Wed 2
# 34 2016 2016-10-13 Thu 2
# 35 2016 2016-10-14 Fri 2
# 36 2016 2016-10-15 Sat 2
# 37 2016 2016-10-16 Sun 2
# 38 2016 2016-10-17 Mon 2
# 39 2016 2016-10-18 Tue 2
# 40 2016 2016-10-19 Wed 2
dplyr
library(dplyr)
example %>%
mutate(
wk = as.numeric(date - date[which.max(day_of_week == "Sat")]) %/% 28 + 1L
) %>%
print(n=99)
# # A tibble: 40 x 4
# yr date day_of_week wk
# <dbl> <date> <ord> <dbl>
# 1 2016 2016-09-10 Sat 1
# 2 2016 2016-09-11 Sun 1
# 3 2016 2016-09-12 Mon 1
# 4 2016 2016-09-13 Tue 1
# 5 2016 2016-09-14 Wed 1
# 6 2016 2016-09-15 Thu 1
# 7 2016 2016-09-16 Fri 1
# 8 2016 2016-09-17 Sat 1
# 9 2016 2016-09-18 Sun 1
# 10 2016 2016-09-19 Mon 1
# 11 2016 2016-09-20 Tue 1
# 12 2016 2016-09-21 Wed 1
# 13 2016 2016-09-22 Thu 1
# 14 2016 2016-09-23 Fri 1
# 15 2016 2016-09-24 Sat 1
# 16 2016 2016-09-25 Sun 1
# 17 2016 2016-09-26 Mon 1
# 18 2016 2016-09-27 Tue 1
# 19 2016 2016-09-28 Wed 1
# 20 2016 2016-09-29 Thu 1
# 21 2016 2016-09-30 Fri 1
# 22 2016 2016-10-01 Sat 1
# 23 2016 2016-10-02 Sun 1
# 24 2016 2016-10-03 Mon 1
# 25 2016 2016-10-04 Tue 1
# 26 2016 2016-10-05 Wed 1
# 27 2016 2016-10-06 Thu 1
# 28 2016 2016-10-07 Fri 1
# 29 2016 2016-10-08 Sat 2
# 30 2016 2016-10-09 Sun 2
# 31 2016 2016-10-10 Mon 2
# 32 2016 2016-10-11 Tue 2
# 33 2016 2016-10-12 Wed 2
# 34 2016 2016-10-13 Thu 2
# 35 2016 2016-10-14 Fri 2
# 36 2016 2016-10-15 Sat 2
# 37 2016 2016-10-16 Sun 2
# 38 2016 2016-10-17 Mon 2
# 39 2016 2016-10-18 Tue 2
# 40 2016 2016-10-19 Wed 2