Define periods/episodes of exposure with overlapping and concatenated intervals of time

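I'm trying to identify periods/episodes of exposure to a drug from its prescriptions. If two prescriptions are separated by more than 30 days, the later one starts a new period/episode of exposure. Prescriptions can overlap for some time or be consecutive. If the gaps between consecutive prescriptions only add up to more than 30 days, with no single gap exceeding 30 days, it is not considered a new episode.

I have data like this: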

library(data.table)

id = c(rep(1,3), rep(2,6), rep(3,5))
start = as.Date(c("2017-05-10", "2017-07-28", "2017-11-23", "2017-01-27", "2017-10-02", "2018-05-14", "2018-05-25", "2018-11-26", "2018-12-28", "2016-01-01", "2016-03-02", "2016-03-20", "2016-04-25", "2016-06-29"))
end = as.Date(c("2017-07-27", "2018-01-28", "2018-03-03", "2017-04-27", "2018-05-13", "2018-11-14", "2018-11-25", "2018-12-27", "2019-06-28", "2016-02-15", "2016-03-05", "2016-03-24", "2016-04-29", "2016-11-01"))

DT = data.table(id, start, end)

DT
    id      start        end
 1:  1 2017-05-10 2017-07-27
 2:  1 2017-07-28 2018-01-28
 3:  1 2017-11-23 2018-03-03
 4:  2 2017-01-27 2017-04-27
 5:  2 2017-10-02 2018-05-13
 6:  2 2018-05-14 2018-11-14
 7:  2 2018-05-25 2018-11-25
 8:  2 2018-11-26 2018-12-27
 9:  2 2018-12-28 2019-06-28
10:  3 2016-01-01 2016-02-15
11:  3 2016-03-02 2016-03-05
12:  3 2016-03-20 2016-03-24
13:  3 2016-04-25 2016-04-29
14:  3 2016-06-29 2016-11-01

I computed the difference between each start and the end of the previous observation (last_diffdays):

DT[, last_diffdays := start-shift(end, n=1L), by = .(id)][is.na(last_diffdays), last_diffdays := 0][]

    id      start        end last_diffdays
 1:  1 2017-05-10 2017-07-27        0 days
 2:  1 2017-07-28 2018-01-28        1 days
 3:  1 2017-11-23 2018-03-03      -66 days
 4:  2 2017-01-27 2017-04-27        0 days
 5:  2 2017-10-02 2018-05-13      158 days
 6:  2 2018-05-14 2018-11-14        1 days
 7:  2 2018-05-25 2018-11-25     -173 days
 8:  2 2018-11-26 2018-12-27        1 days
 9:  2 2018-12-28 2019-06-28        1 days
10:  3 2016-01-01 2016-02-15        0 days
11:  3 2016-03-02 2016-03-05       16 days
12:  3 2016-03-20 2016-03-24       15 days
13:  3 2016-04-25 2016-04-29       32 days
14:  3 2016-06-29 2016-11-01       61 days 

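This shows when an overlap occurs (negative values) and when it does not (positive values). I think an ifelse/fcase statement would be a bad idea here, and I'm not comfortable using one.

I think a good output for this task would look like this: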

    id      start        end last_diffdays noexp_days period
 1:  1 2017-05-10 2017-07-27        0 days          0      1
 2:  1 2017-07-28 2018-01-28        1 days          1      1
 3:  1 2017-11-23 2018-03-03      -66 days          0      1
 4:  2 2017-01-27 2017-04-27        0 days          0      1
 5:  2 2017-10-02 2018-05-13      158 days        158      2
 6:  2 2018-05-14 2018-11-14        1 days          1      2
 7:  2 2018-05-25 2018-11-25     -173 days          0      2
 8:  2 2018-11-26 2018-12-27        1 days          1      2
 9:  2 2018-12-28 2019-06-28        1 days          1      2
10:  3 2016-01-01 2016-02-15        0 days          0      1
11:  3 2016-03-02 2016-03-05       16 days         16      1
12:  3 2016-03-20 2016-03-24       15 days         15      1
13:  3 2016-04-25 2016-04-29       32 days         32      2
14:  3 2016-06-29 2016-11-01       61 days         61      3

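I calculated by hand the number of days without exposure since the previous prescription (noexp_days).

I don't know whether I'm on the right path, but I think I need to calculate the noexp_days variable and then generate period as cumsum((noexp_days)>30)+1.

If there is a better solution that I'm not seeing, or any other possibility I haven't considered, I'd appreciate reading about it.

Thanks in advance for any help! :)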
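As a quick sanity check of that idea, here is a minimal sketch in base R that applies the cumsum step to the gaps for id 3, copied by hand from the desired output above. It only illustrates that each gap is tested against 30 on its own, so the 16-day and 15-day gaps do not add up to a new period.

```r
# Gaps (noexp_days) for id 3, copied from the desired output above
noexp_days <- c(0, 16, 15, 32, 61)

# Each gap longer than 30 days bumps the counter by one;
# shorter gaps (16, 15) are not accumulated
period <- cumsum(noexp_days > 30) + 1
print(period)
# [1] 1 1 1 2 3
```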

Try:

library(data.table)

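# Overlaps give negative differences; clamp them to 0 so only real gaps count
DT[, noexp_days := pmax(as.integer(last_diffdays), 0)]
# Within each id, every gap longer than 30 days starts a new period
DT[, period := cumsum(noexp_days > 30) + 1, by = id]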
DT
#    id      start        end last_diffdays noexp_days period
# 1:  1 2017-05-10 2017-07-27        0 days          0      1
# 2:  1 2017-07-28 2018-01-28        1 days          1      1
# 3:  1 2017-11-23 2018-03-03      -66 days          0      1
# 4:  2 2017-01-27 2017-04-27        0 days          0      1
# 5:  2 2017-10-02 2018-05-13      158 days        158      2
# 6:  2 2018-05-14 2018-11-14        1 days          1      2
# 7:  2 2018-05-25 2018-11-25     -173 days          0      2
# 8:  2 2018-11-26 2018-12-27        1 days          1      2
# 9:  2 2018-12-28 2019-06-28        1 days          1      2
#10:  3 2016-01-01 2016-02-15        0 days          0      1
#11:  3 2016-03-02 2016-03-05       16 days         16      1
#12:  3 2016-03-20 2016-03-24       15 days         15      1
#13:  3 2016-04-25 2016-04-29       32 days         32      2
#14:  3 2016-06-29 2016-11-01       61 days         61      3
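
One possible caveat: last_diffdays compares each start only with the end of the row immediately before it. That matches the example data, but if a short prescription could be fully nested inside a longer earlier one, the next gap might be overstated. The following is only a sketch under that assumption, not part of the original question; the toy data and the column name prev_max_end are made up for illustration. It measures the gap against the latest end date seen so far instead.

```r
library(data.table)

# Hypothetical toy data: the second prescription is nested inside the first
toy <- data.table(
  id    = 1L,
  start = as.Date(c("2020-01-01", "2020-01-10", "2020-03-01")),
  end   = as.Date(c("2020-06-30", "2020-01-20", "2020-07-15"))
)
setorder(toy, id, start)

# Row-to-row comparison, as in the approach above: the third row looks like
# a 41-day gap because it is compared with the nested row's end date
toy[, naive_gap := pmax(as.numeric(start) - shift(as.numeric(end)), 0, na.rm = TRUE), by = id]
toy[, naive_period := cumsum(naive_gap > 30) + 1L, by = id]

# Running maximum of the end dates over earlier rows (as days since epoch)
toy[, prev_max_end := shift(cummax(as.numeric(end))), by = id]
# Gap relative to the latest end seen so far; overlaps and the first row give 0
toy[, noexp_days := pmax(as.numeric(start) - prev_max_end, 0, na.rm = TRUE)]
toy[, period := cumsum(noexp_days > 30) + 1L, by = id]

toy[, .(id, start, end, naive_period, period)]
```

On this toy input the row-to-row version opens a second period at the third row, while the running-maximum version keeps all three rows in period 1, since the first prescription still covers 2020-03-01. On the data from the question both versions should agree, because no prescription there ends before the one preceding it.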