如果观察结果落在日期windows
Tally if observations fall in date windows
我有一个数据框,它表示具有开始日期和结束日期的政策。我正在尝试统计每个月有效的保单数量。
library(tidyverse)
ayear <- 2021
amonth <- 10
months <- 12
df <- tibble(
pol = c(1, 2, 3, 4)
, bdate = c('2021-02-23', '2019-12-03', '2020-08-11', '2020-12-14')
, edate = c('2022-02-23', '2020-12-03', '2021-08-11', '2021-06-14')
)
这四个政策有开始日期 (bdate) 和结束日期 (edate)。从 10 月(一个月)到 2021 年(一年),再往回推 12 个月(个月)我'我正在尝试统计 4 项政策中有多少在本月的某个时间处于活动状态,以生成看起来像这样的数据框。
我尝试生成的数据框将包含三列:月份、年份和 active_pol_count 12 行。像这样。
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
df <- tibble(
pol = c(1, 2, 3, 4),
bdate = c("2021-02-23", "2019-12-03", "2020-08-11", "2020-12-14"),
edate = c("2022-02-23", "2020-12-03", "2021-08-11", "2021-06-14")
)
# transform star and end date to interval
df <- mutate(df, interval = interval(bdate, edate))
# for every first date of each month between 2020-10 to 2021-10
seq(as.Date("2020-10-01"), as.Date("2021-09-01"), by = "months") %>%
tibble(date = .) %>%
mutate(
year = year(date),
month = month(date),
active_pol_count = date %>% map_dbl(~ .x %within% df$interval %>% sum()),
)
#> # A tibble: 12 x 4
#> date year month active_pol_count
#> <date> <dbl> <dbl> <dbl>
#> 1 2020-10-01 2020 10 2
#> 2 2020-11-01 2020 11 2
#> 3 2020-12-01 2020 12 2
#> 4 2021-01-01 2021 1 2
#> 5 2021-02-01 2021 2 2
#> 6 2021-03-01 2021 3 3
#> 7 2021-04-01 2021 4 3
#> 8 2021-05-01 2021 5 3
#> 9 2021-06-01 2021 6 3
#> 10 2021-07-01 2021 7 2
#> 11 2021-08-01 2021 8 2
#> 12 2021-09-01 2021 9 1
由 reprex package (v2.0.1)
于 2021-12-13 创建
我有一个数据框,它表示具有开始日期和结束日期的政策。我正在尝试统计每个月有效的保单数量。
library(tidyverse)
ayear <- 2021
amonth <- 10
months <- 12
df <- tibble(
pol = c(1, 2, 3, 4)
, bdate = c('2021-02-23', '2019-12-03', '2020-08-11', '2020-12-14')
, edate = c('2022-02-23', '2020-12-03', '2021-08-11', '2021-06-14')
)
这四个政策有开始日期 (bdate) 和结束日期 (edate)。从 10 月(一个月)到 2021 年(一年),再往回推 12 个月(个月)我'我正在尝试统计 4 项政策中有多少在本月的某个时间处于活动状态,以生成看起来像这样的数据框。
我尝试生成的数据框将包含三列:月份、年份和 active_pol_count 12 行。像这样。
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
df <- tibble(
pol = c(1, 2, 3, 4),
bdate = c("2021-02-23", "2019-12-03", "2020-08-11", "2020-12-14"),
edate = c("2022-02-23", "2020-12-03", "2021-08-11", "2021-06-14")
)
# transform star and end date to interval
df <- mutate(df, interval = interval(bdate, edate))
# for every first date of each month between 2020-10 to 2021-10
seq(as.Date("2020-10-01"), as.Date("2021-09-01"), by = "months") %>%
tibble(date = .) %>%
mutate(
year = year(date),
month = month(date),
active_pol_count = date %>% map_dbl(~ .x %within% df$interval %>% sum()),
)
#> # A tibble: 12 x 4
#> date year month active_pol_count
#> <date> <dbl> <dbl> <dbl>
#> 1 2020-10-01 2020 10 2
#> 2 2020-11-01 2020 11 2
#> 3 2020-12-01 2020 12 2
#> 4 2021-01-01 2021 1 2
#> 5 2021-02-01 2021 2 2
#> 6 2021-03-01 2021 3 3
#> 7 2021-04-01 2021 4 3
#> 8 2021-05-01 2021 5 3
#> 9 2021-06-01 2021 6 3
#> 10 2021-07-01 2021 7 2
#> 11 2021-08-01 2021 8 2
#> 12 2021-09-01 2021 9 1
由 reprex package (v2.0.1)
于 2021-12-13 创建