How can I find the dates on which events happen concurrently in R, by group?

In R, I need to find which treatments are happening concurrently and work out the total dosage for each day. I need to do this per patient, so presumably with a group_by statement in dplyr.

user_id  treatment  dosage  treatment_start  treatment_end
1        1          3       01/28/2019       07/30/2019
1        1          2       05/26/2019       11/25/2019
1        2          1       08/13/2019       02/12/2020
1        1          2       12/06/2019       04/07/2020
1        2          1       12/09/2019       06/10/2020

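Ideally, the final form would contain the user ID, the treatments they are on, the sum of the dosage across all of those treatments, and the dates during which they are on that combination of treatments. I have produced an example result table with a few rows below.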

user_id  treatments  total_dosage  treatment_start  treatment_end
1        1           3             01/28/2019       05/25/2019
1        1           5             05/26/2019       07/30/2019
1        1           2             07/31/2019       08/12/2019
1        1,2         3             08/13/2019       11/25/2019

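I have worked out how to find whether an event overlaps with others, but it does not give the resulting dates and it does not sum the dosage, so I don't know whether it is usable. Here, course is a combination of the treatment and dosage columns.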

DF %>%
  group_by(user_id) %>%
  mutate(overlap = purrr::map2_chr(
    treatment_start, treatment_end,
    ~ toString(course[(.x >= treatment_start & .x < treatment_end) |
                      (.y > treatment_start & .y < treatment_end)])
  )) %>%
  ungroup()
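
A side note on the overlap test itself: a condition that checks whether an interval's start or end falls inside another interval misses the case where one interval fully contains the other. A minimal sketch of the usual closed-interval test in base R follows. The helper name intervals_overlap and the example dates are made up for illustration and are not part of the question.

```r
# Hypothetical helper: TRUE where the closed intervals [start_a, end_a] and
# [start_b, end_b] share at least one day (vectorised over its arguments)
intervals_overlap <- function(start_a, end_a, start_b, end_b) {
  start_a <= end_b & start_b <= end_a
}

# Illustrative dates only
a_start <- as.Date("2019-01-01"); a_end <- as.Date("2019-12-31")
b_start <- as.Date("2019-03-01"); b_end <- as.Date("2019-04-01")  # lies inside a
c_start <- as.Date("2020-02-01"); c_end <- as.Date("2020-03-01")  # starts after a ends

contained <- intervals_overlap(a_start, a_end, b_start, b_end)  # containment counts as overlap
disjoint  <- intervals_overlap(a_start, a_end, c_start, c_end)  # disjoint intervals do not
```

Because both comparisons are inclusive, two treatments that merely share a boundary day also count as overlapping, which fits the day-level dosage totals asked for here.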

This is an interesting question. One approach is to expand the data frame to one row per day and then summarise the data by date:

library(tidyverse)
library(lubridate)

dat %>% 
  # Convert dates to date format
  mutate(across(treatment_start:treatment_end, ~ mdy(.x))) %>% 
  # Expand the dataframe
  group_by(user_id, treatment_start, treatment_end) %>% 
  mutate(date = list(seq(treatment_start, treatment_end, by = "day"))) %>% 
  unnest(date) %>% 
  # Summarise by day
  group_by(user_id, date) %>% 
  summarise(dosage = sum(dosage),
            treatment = toString(unique(treatment))) %>% 
  # Summarise by different dosage (and create periods)
  group_by(user_id, treatment, dosage) %>% 
  summarise(treatment_start = min(date),
            treatment_ends = max(date)) %>% 
  arrange(treatment_start)

Output:

  user_id treatment dosage treatment_start treatment_ends
    <int> <chr>      <int> <date>          <date>        
1       1 1              3 2019-01-28      2019-05-25    
2       1 1              5 2019-05-26      2019-07-30    
3       1 1              2 2019-07-31      2019-08-12    
4       1 1, 2           3 2019-08-13      2020-04-07    
5       1 2              1 2019-11-26      2020-06-10    
6       1 2, 1           3 2019-12-06      2019-12-08    
7       1 2, 1           4 2019-12-09      2020-02-12
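
Note that rows 4 and 5 of this output span periods that are not contiguous. Grouping by treatment and dosage alone merges every day with the same label and total, even when other combinations occur in between: row 5, for example, runs from 2019-11-26 to 2020-06-10 and so covers rows 6 and 7. The labels "1, 2" and "2, 1" are also treated as different groups. One possible variation is sketched below, assuming the sample data from the question is stored in dat. It sorts the treatment labels and starts a new period whenever the label, the dosage, or the run of consecutive days changes. The columns new_period and period are helper names introduced here, not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Sample data from the question
dat <- tibble(
  user_id         = 1L,
  treatment       = c(1L, 1L, 2L, 1L, 2L),
  dosage          = c(3L, 2L, 1L, 2L, 1L),
  treatment_start = c("01/28/2019", "05/26/2019", "08/13/2019", "12/06/2019", "12/09/2019"),
  treatment_end   = c("07/30/2019", "11/25/2019", "02/12/2020", "04/07/2020", "06/10/2020")
)

periods <- dat %>%
  # Convert dates to date format
  mutate(across(c(treatment_start, treatment_end), mdy)) %>%
  # Expand to one row per treatment per day
  rowwise() %>%
  mutate(date = list(seq(treatment_start, treatment_end, by = "day"))) %>%
  ungroup() %>%
  unnest(date) %>%
  # Daily totals; sort() makes "1, 2" and "2, 1" the same label
  group_by(user_id, date) %>%
  summarise(dosage = sum(dosage),
            treatment = toString(sort(unique(treatment))),
            .groups = "drop") %>%
  # Flag the first day of each run of identical, consecutive days
  arrange(user_id, date) %>%
  group_by(user_id) %>%
  mutate(new_period = row_number() == 1 |
           treatment != lag(treatment) |
           dosage != lag(dosage) |
           date != lag(date) + 1,
         period = cumsum(new_period)) %>%
  # Collapse each run into one row with its first and last day
  group_by(user_id, period, treatment, dosage) %>%
  summarise(treatment_start = min(date),
            treatment_end = max(date),
            .groups = "drop") %>%
  select(-period)

periods
```

On the sample data this should give nine periods in chronological order, the first four matching the example rows in the question. Expanding to one row per day keeps the logic simple, but it can become memory-hungry for long treatments or many users.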