Add rows for unexposed periods (gaps) within a time interval

I am trying to format data for a self-controlled study. My study period runs between two dates:

cohort_start = as.Date("01/01/2015", "%d/%m/%Y")
cohort_end  = as.Date("01/01/2020", "%d/%m/%Y")

I have ids that were exposed to a drug during different periods:

library(data.table)

id = c(rep(1,2),2)
start = as.Date(c("10/09/2015","22/04/2018", "08/06/2017"), "%d/%m/%Y" )
end = as.Date(c("31/01/2016","17/02/2019", "03/11/2018"), "%d/%m/%Y" )
exp = rep(1,3)

DT = data.table(id, start, end, exp)

DT
   id      start        end exp
1:  1 2015-09-10 2016-01-31   1
2:  1 2018-04-22 2019-02-17   1
3:  2 2017-06-08 2018-11-03   1

I would like to add rows for the gaps, i.e. the periods in which an id was not exposed to the drug, as in this desired output:

   id      start        end exp
1:  1 2015-01-01 2015-09-09   0
2:  1 2015-09-10 2016-01-31   1
3:  1 2016-02-01 2018-04-21   0
4:  1 2018-04-22 2019-02-17   1
5:  1 2019-02-18 2020-01-01   0
6:  2 2015-01-01 2017-06-07   0
7:  2 2017-06-08 2018-11-03   1
8:  2 2018-11-04 2020-01-01   0

I have no idea where to start.

Any help would be greatly appreciated.

Thanks in advance!

# Create data.table with all dates in period 2015-01-01 >> 2020-01-01
# for each id
DT.all <- CJ(id   = unique(DT$id), 
             date = seq( as.Date("2015-01-01"), as.Date("2020-01-01"), by = 1))
# Non-equi update join: set exp on every date that falls inside an exposure period
DT.all[DT, exp := i.exp, on = .(id, date >= start, date <= end)]
# Create groups
DT.all[, group := rleid(id,exp)]
# Summarise by id and just created groups
ans <- DT.all[, .(start = min(date), end = max(date), exp = unique(exp)), by = .(id,group)]
# Replace NA in exp (dates with no exposure) with 0
ans[is.na(exp), exp := 0][]
#    id group      start        end exp
# 1:  1     1 2015-01-01 2015-09-09   0
# 2:  1     2 2015-09-10 2016-01-31   1
# 3:  1     3 2016-02-01 2018-04-21   0
# 4:  1     4 2018-04-22 2019-02-17   1
# 5:  1     5 2019-02-18 2020-01-01   0
# 6:  2     6 2015-01-01 2017-06-07   0
# 7:  2     7 2017-06-08 2018-11-03   1
# 8:  2     8 2018-11-04 2020-01-01   0
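
The approach above builds one row per id per day, which may become heavy for a large cohort or a long study period. As a possible alternative, here is a minimal sketch that computes the gaps directly from the interval boundaries. It assumes that the exposure periods of an id never overlap; the names `gaps` and `ans2` are only illustrative and do not come from the code above.

```r
library(data.table)

cohort_start <- as.Date("2015-01-01")
cohort_end   <- as.Date("2020-01-01")

DT <- data.table(
  id    = c(1, 1, 2),
  start = as.Date(c("2015-09-10", "2018-04-22", "2017-06-08")),
  end   = as.Date(c("2016-01-31", "2019-02-17", "2018-11-03")),
  exp   = 1
)

# Assumption: within each id, exposure periods do not overlap
setorder(DT, id, start)

# For each id, a gap starts at cohort_start or the day after an exposure ends,
# and ends the day before the next exposure starts or at cohort_end
gaps <- DT[, .(
  start = c(cohort_start, end + 1),
  end   = c(start - 1, cohort_end),
  exp   = 0
), by = id]

# Drop empty gaps (an exposure touching the cohort boundary,
# or two exposures on consecutive days)
gaps <- gaps[start <= end]

# Combine exposed and unexposed periods and sort them chronologically per id
ans2 <- rbind(DT, gaps)
setorder(ans2, id, start)
ans2[]
```

On the sample data this should give the same eight periods as the date-expansion approach, without the `group` column. If exposure periods can overlap, they would need to be merged first, which this sketch does not do.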