如何根据开始日期和结束日期计算季度内的观察次数？

Question

   patient_id dt_diag_init enroll    
      <int64> <date>       <date>    
 1      10401 2018-01-04   2020-09-30
 2      60701 2019-05-31   2019-09-30
 3     343702 2018-12-05   2020-09-30
 4     472202 2019-12-30   2020-09-30
 5     489502 2019-09-17   2019-11-30
 6     557401 2019-10-15   2020-09-30
 7     857901 2018-01-02   2020-09-30
 8     874201 2018-01-01   2020-09-30
 9    1309102 2019-03-11   2020-09-30
10    1317601 2018-08-14   2020-09-30

我正在尝试根据日期范围（dt_diag_init 和注册）统计每季度的患者人数。 patient_id 10401 从 2018-01-04 开始到 2020-09-30，因此由于日期范围重叠，患者将被计入 Q1'18、Q2'18 .... 到 Q3'20。我正在尝试获得仅包含季度和每季度患者人数的输出 table。

样本输出table

Qtr    year total   
   <chr> <dbl> <dbl>
 1 Q1'18  2018 485 
 2 Q2'18  2018 516 
 3 Q3'18  2018 560

这就是我在进行计数之前尝试创建带有四分之一列的 table 的内容，但出现错误：

df_1 <- df %>%
  mutate(quarter = map2(
    as.numeric(dt_diag_init),
    as.numeric(enroll),
    ~ format(seq(.x, .y, by="quarter"), "Q%q'%y")
  ))

采样日期：

df <- structure(list(patient_id = structure(c(5.13877678239481e-320, 
2.99902787682095e-319, 1.69811350606928e-318, 2.33298786097528e-318, 
2.41846121770582e-318, 2.75392685057557e-318, 4.23859411632851e-318, 
4.31912681660064e-318, 6.46782325102068e-318, 6.50981389026072e-318
), class = "integer64"), dt_diag_init = structure(c(17535, 18047, 
17870, 18260, 18156, 18184, 17533, 17532, 17966, 17757), class = "Date"), 
    enroll = structure(c(18535, 18169, 18535, 18535, 18230, 18535, 
    18535, 18535, 18535, 18535), class = "Date")), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

Answer 1

也许这有帮助

library(dplyr)
library(zoo)
library(tidyr)
library(purrr)
library(lubridate)
df %>%  
   mutate(quarter = map2(dt_diag_init, enroll, 
    ~ as.yearqtr(seq(.x, .y, by = 'quarter')))) %>%
   unnest(quarter) %>%
   mutate(year = year(quarter)) %>% 
   count(quarter, year, name = 'total')

-输出

# A tibble: 11 × 3
   quarter    year total
   <yearqtr> <dbl> <int>
 1 2018 Q1    2018     3
 2 2018 Q2    2018     3
 3 2018 Q3    2018     4
 4 2018 Q4    2018     5
 5 2019 Q1    2019     6
 6 2019 Q2    2019     7
 7 2019 Q3    2019     8
 8 2019 Q4    2019     8
 9 2020 Q1    2020     8
10 2020 Q2    2020     8
11 2020 Q3    2020     8

如何根据开始日期和结束日期计算季度内的观察次数？

How to count the number of observation that fall in quarter based on start and end date?

r

count

date-range

dplyr