Tally if observations fall in date windows

I have a data frame representing policies with start and end dates. I'm trying to tally the number of policies in effect each month.

library(tidyverse)

ayear <- 2021
amonth <- 10
months <- 12

df <- tibble(
  pol = c(1, 2, 3, 4)
  , bdate = c('2021-02-23', '2019-12-03', '2020-08-11', '2020-12-14')
  , edate = c('2022-02-23', '2020-12-03', '2021-08-11', '2021-06-14')
  )

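These four policies have start dates (bdate) and end dates (edate). Starting from October (amonth) 2021 (ayear) and going back 12 months (months), I'm trying to tally how many of the 4 policies were active at some point during each month, to generate a data frame that looks like this.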

The data frame I'm trying to produce would have three columns (month, year, and active_pol_count) and 12 rows.

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

df <- tibble(
  pol = c(1, 2, 3, 4),
  bdate = c("2021-02-23", "2019-12-03", "2020-08-11", "2020-12-14"),
  edate = c("2022-02-23", "2020-12-03", "2021-08-11", "2021-06-14")
)

# transform start and end dates to an interval
df <- mutate(df, interval = interval(bdate, edate))

# for the first day of each month from 2020-10 to 2021-09
seq(as.Date("2020-10-01"), as.Date("2021-09-01"), by = "months") %>%
  tibble(date = .) %>%
  mutate(
    year = year(date),
    month = month(date),
    active_pol_count = date %>% map_dbl(~ .x %within% df$interval %>% sum()),
  )
#> # A tibble: 12 x 4
#>    date        year month active_pol_count
#>    <date>     <dbl> <dbl>            <dbl>
#>  1 2020-10-01  2020    10                2
#>  2 2020-11-01  2020    11                2
#>  3 2020-12-01  2020    12                2
#>  4 2021-01-01  2021     1                2
#>  5 2021-02-01  2021     2                2
#>  6 2021-03-01  2021     3                3
#>  7 2021-04-01  2021     4                3
#>  8 2021-05-01  2021     5                3
#>  9 2021-06-01  2021     6                3
#> 10 2021-07-01  2021     7                2
#> 11 2021-08-01  2021     8                2
#> 12 2021-09-01  2021     9                1

Created on 2021-12-13 by the reprex package (v2.0.1)
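
One caveat with the code above: it tests only the first day of each month, so a policy that starts later in a month is not counted for that month (policy 1 begins on 2021-02-23 and is missed for February 2021). Below is a minimal sketch, not the answer author's method, that counts a policy whenever its date range overlaps the month at all. It assumes the window is the 12 months before amonth/ayear (2020-10 through 2021-09, the same months as the sequence above) and that both bdate and edate count as active days. The names `n_months`, `first_month` and `result` are introduced here for illustration; `n_months` stands in for the question's `months`.

```r
library(tidyverse)
library(lubridate)

ayear <- 2021
amonth <- 10
n_months <- 12  # hypothetical name; the question calls this `months`

df <- tibble(
  pol = c(1, 2, 3, 4),
  bdate = ymd(c("2021-02-23", "2019-12-03", "2020-08-11", "2020-12-14")),
  edate = ymd(c("2022-02-23", "2020-12-03", "2021-08-11", "2021-06-14"))
)

# Assumption: the window covers the n_months months before amonth/ayear
first_month <- make_date(ayear, amonth, 1) - months(n_months)

result <- tibble(
  month_start = seq(first_month, by = "month", length.out = n_months)
) %>%
  mutate(
    # last day of the same month
    month_end = month_start + months(1) - days(1),
    year = year(month_start),
    month = month(month_start),
    # a policy overlaps the month if it starts on or before the month's
    # last day and ends on or after the month's first day
    active_pol_count = map2_int(
      month_start, month_end,
      ~ sum(df$bdate <= .y & df$edate >= .x)
    )
  ) %>%
  select(month, year, active_pol_count)

print(result)
```

With the sample data this differs from the first-of-month count in two rows: December 2020 and February 2021 both come out as 3 instead of 2, because policies 4 and 1 start partway through those months.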