Count the number of occurrences of an event in the past 1 year

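I am trying to count how many times an event occurred in the 1 year leading up to each row's date. I have already calculated the number of days since the last event, but I can't figure out how to proceed, because I need to look back a full 365 days, not just to the date of the most recent dummy.

I only need the count when level is not NA, but that is not a big deal.

Edit:

I have added 14 more rows to show another example where level is actually not NA.

With help from jay.sf, the current result is:

Row 33 = 1, but I want row 33 = 0, because there was no occurrence in the previous 1 year.

The same applies to row 37.

Row 39 = 2, but there was only 1 earlier occurrence, not counting today's.

So I think I need to change the code so that a row's dummyflag is only counted from the next row onward.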

library(data.table)

dtIhave2 = data.table(
  id = c(rep(1,17),rep(2,13),rep(3,14)),
  date = c(as.Date("2014-12-05"),
           as.Date("2015-01-23"),
           as.Date("2015-03-06"),
           as.Date("2015-05-15"),
           as.Date("2015-08-06"),
           as.Date("2015-10-29"),
           as.Date("2016-01-21"),
           as.Date("2016-04-06"),
           as.Date("2016-07-11"),
           as.Date("2016-10-03"),
           as.Date("2016-11-11"),
           as.Date("2016-12-07"),
           as.Date("2017-10-25"),
           as.Date("2018-01-09"),
           as.Date("2018-02-12"),
           as.Date("2018-07-04"),
           as.Date("2018-11-30"),
           as.Date("2014-05-14"),
           as.Date("2014-09-03"),
           as.Date("2014-09-04"),
           as.Date("2014-10-15"),
           as.Date("2014-11-08"),
           as.Date("2014-12-05"),
           as.Date("2014-12-18"),
           as.Date("2014-12-20"),
           as.Date("2014-12-23"),
           as.Date("2015-05-15"),
           as.Date("2015-08-19"),
           as.Date("2016-06-23"),
           as.Date("2017-04-21"),
           
           as.Date("2015-01-03"),
           as.Date("2015-02-13"),
           as.Date("2015-06-01"),#
           as.Date("2015-09-05"),
           as.Date("2015-12-01"),
           as.Date("2016-06-10"),
           as.Date("2016-10-16"),#
           as.Date("2016-12-15"),
           as.Date("2017-04-30"),#
           as.Date("2017-06-23"),
           as.Date("2017-10-01"),
           as.Date("2017-12-01"),
           as.Date("2018-03-10"),
           as.Date("2018-06-02")
           
           ),
  
  level = c(rnorm(10,7,1),
            NA,
            rnorm(9,7,1),
            NA,NA,
            7,
            NA,NA,NA,
            rnorm(4,7,1),
            rnorm(14,7,1)), 
  
  dummyflag = c(rep(0 ,10), 
                1,
                rep(0,9),
                1,
                1,
                0,
                1,
                1,
                1,
                rep(0,4),
                rep(0,2),
                1,
                rep(0,3),
                1,
                rep(0,1),
                1,
                rep(0,5)),
  
  dayssincedummy = c(rep(NA,11),
                     26,348,424,458,600,749,
                     rep(NA,4),
                     24,27,40,2,3,143,239,548,850,
                     rep(NA,3),
                     96,
                     183,
                     375,
                     503,
                     60,
                     196,
                     54,
                     154,
                     215,
                     314,
                     398
                     
                     )
)

dtIhave2$within1yr = sapply(seq_len(nrow(dtIhave2)), function(i)
  dtIhave2[date %between% rev(seq.Date(date[i], length.out=2, by='-1 year')) &
             id == id[i] & !is.na(level[i]),
           sum(dummyflag %in% 1)])

> dtIhave2
    id       date    level dummyflag dayssincedummy within1yr
 1:  1 2014-12-05 7.977480         0             NA         0
 2:  1 2015-01-23 7.589833         0             NA         0
 3:  1 2015-03-06 7.301062         0             NA         0
 4:  1 2015-05-15 6.739734         0             NA         0
 5:  1 2015-08-06 5.682534         0             NA         0
 6:  1 2015-10-29 6.659627         0             NA         0
 7:  1 2016-01-21 7.159197         0             NA         0
 8:  1 2016-04-06 9.957324         0             NA         0
 9:  1 2016-07-11 6.607859         0             NA         0
10:  1 2016-10-03 7.093568         0             NA         0
11:  1 2016-11-11       NA         1             NA         0
12:  1 2016-12-07 5.527618         0             26         1
13:  1 2017-10-25 6.055255         0            348         1
14:  1 2018-01-09 6.031328         0            424         0
15:  1 2018-02-12 5.875067         0            458         0
16:  1 2018-07-04 6.875352         0            600         0
17:  1 2018-11-30 8.439167         0            749         0
18:  2 2014-05-14 7.381595         0             NA         0
19:  2 2014-09-03 7.325306         0             NA         0
20:  2 2014-09-04 8.101320         0             NA         0
21:  2 2014-10-15       NA         1             NA         0
22:  2 2014-11-08       NA         1             24         0
23:  2 2014-12-05 7.000000         0             27         2
24:  2 2014-12-18       NA         1             40         0
25:  2 2014-12-20       NA         1              2         0
26:  2 2014-12-23       NA         1              3         0
27:  2 2015-05-15 7.211657         0            143         5
28:  2 2015-08-19 7.274550         0            239         5
29:  2 2016-06-23 7.216593         0            548         0
30:  2 2017-04-21 6.516086         0            850         0
31:  3 2015-01-03 7.945201         0             NA         0
32:  3 2015-02-13 8.417933         0             NA         0
33:  3 2015-06-01 9.290180         1             NA         1
34:  3 2015-09-05 8.400137         0             96         1
35:  3 2015-12-01 8.115692         0            183         1
36:  3 2016-06-10 7.322929         0            375         0
37:  3 2016-10-16 4.946102         1            503         1
38:  3 2016-12-15 9.435223         0             60         1
39:  3 2017-04-30 6.671779         1            196         2
40:  3 2017-06-23 6.994869         0             54         2
41:  3 2017-10-01 7.540090         0            154         2
42:  3 2017-12-01 7.332589         0            215         1
43:  3 2018-03-10 7.779732         0            314         1
44:  3 2018-06-02 6.068338         0            398         0
    id       date    level dummyflag dayssincedummy within1yr
> dtIwant2
    id       date    level dummyflag dayssincedummy within1yr
 1:  1 2014-12-05 7.977480         0             NA         0
 2:  1 2015-01-23 7.589833         0             NA         0
 3:  1 2015-03-06 7.301062         0             NA         0
 4:  1 2015-05-15 6.739734         0             NA         0
 5:  1 2015-08-06 5.682534         0             NA         0
 6:  1 2015-10-29 6.659627         0             NA         0
 7:  1 2016-01-21 7.159197         0             NA         0
 8:  1 2016-04-06 9.957324         0             NA         0
 9:  1 2016-07-11 6.607859         0             NA         0
10:  1 2016-10-03 7.093568         0             NA         0
11:  1 2016-11-11       NA         1             NA         0
12:  1 2016-12-07 5.527618         0             26         1
13:  1 2017-10-25 6.055255         0            348         1
14:  1 2018-01-09 6.031328         0            424         0
15:  1 2018-02-12 5.875067         0            458         0
16:  1 2018-07-04 6.875352         0            600         0
17:  1 2018-11-30 8.439167         0            749         0
18:  2 2014-05-14 7.381595         0             NA         0
19:  2 2014-09-03 7.325306         0             NA         0
20:  2 2014-09-04 8.101320         0             NA         0
21:  2 2014-10-15       NA         1             NA         0
22:  2 2014-11-08       NA         1             24         0
23:  2 2014-12-05 7.000000         0             27         2
24:  2 2014-12-18       NA         1             40         0
25:  2 2014-12-20       NA         1              2         0
26:  2 2014-12-23       NA         1              3         0
27:  2 2015-05-15 7.211657         0            143         5
28:  2 2015-08-19 7.274550         0            239         5
29:  2 2016-06-23 7.216593         0            548         0
30:  2 2017-04-21 6.516086         0            850         0
31:  3 2015-01-03 7.945201         0             NA         0
32:  3 2015-02-13 8.417933         0             NA         0
33:  3 2015-06-01 9.290180         1             NA         0
34:  3 2015-09-05 8.400137         0             96         1
35:  3 2015-12-01 8.115692         0            183         1
36:  3 2016-06-10 7.322929         0            375         0
37:  3 2016-10-16 4.946102         1            503         0
38:  3 2016-12-15 9.435223         0             60         1
39:  3 2017-04-30 6.671779         1            196         1
40:  3 2017-06-23 6.994869         0             54         2
41:  3 2017-10-01 7.540090         0            154         2
42:  3 2017-12-01 7.332589         0            215         1
43:  3 2018-03-10 7.779732         0            314         1
44:  3 2018-06-02 6.068338         0            398         0
    id       date    level dummyflag dayssincedummy within1yr



Try seq.Date with by='-1 year' to build the look-back window, and sapply to iterate over the rows.
library(data.table)
sapply(seq_len(nrow(dtIhave)), \(i) 
       dtIhave[date %between% rev(seq.Date(date[i], length.out=2, by='-1 year')) & 
                 id == id[i] & !is.na(level[i]), sum(dummyflag %in% 1)])
# [1] 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 2 0 0 0 5 5 0 0

You could also use '-365 days', but '-1 year' additionally accounts for leap years.
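To make the leap-year remark concrete, here is a small base R sketch comparing the two window starts for one of the dates in the question. The variable names (end_date, start_year, start_days) are only illustrative.

```r
# Window start computed two ways from the same end date.
end_date <- as.Date("2016-12-07")

# Calendar-year step: same month and day, one year earlier.
start_year <- seq.Date(end_date, length.out = 2, by = "-1 year")[2]

# Fixed 365-day step: one day later here, because 2016-02-29 lies in between.
start_days <- end_date - 365

print(start_year)                         # "2015-12-07"
print(start_days)                         # "2015-12-08"
print(as.integer(end_date - start_year))  # 366
```

An event dated exactly 2015-12-07 would therefore fall inside the '-1 year' window but outside the '-365 days' one.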

Edit

For the updated case, replace !is.na(level[i]) with dummyflag[i] == 0

sapply(seq_len(nrow(dtIhave2)), \(i) 
       dtIhave2[date %between% rev(seq.Date(date[i], length.out=2, by='-1 year')) & 
                 id == id[i] & dummyflag[i] == 0, sum(dummyflag %in% 1)])
# [1] 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 2 0 0 0 5 5 0 0 0 0 0 1 1 0 0 1 0 2 2 1 1 0
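The question also expects a flagged row to count earlier flags but not its own (row 39 should be 1). A possible variant, offered only as a sketch under that reading, makes the upper bound of the window strict instead of testing dummyflag[i]. The helper name count_prior_flags and the constant level are assumptions for illustration; the dates and flags are the id 3 rows from the question.

```r
library(data.table)

# id 3 rows from the question; level is a constant here because only
# its NA status matters for the count.
dt <- data.table(
  id = 3,
  date = as.Date(c("2015-01-03", "2015-02-13", "2015-06-01", "2015-09-05",
                   "2015-12-01", "2016-06-10", "2016-10-16", "2016-12-15",
                   "2017-04-30", "2017-06-23", "2017-10-01", "2017-12-01",
                   "2018-03-10", "2018-06-02")),
  level = 7,
  dummyflag = c(0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0)
)

# Hypothetical helper: for each row, count flagged rows of the same id whose
# date lies in [date - 1 year, date), so the current row is never counted.
count_prior_flags <- function(d) {
  sapply(seq_len(nrow(d)), function(k) {
    if (is.na(d$level[k])) return(0L)  # rows with NA level get 0
    cur_id    <- d$id[k]
    cur_date  <- d$date[k]
    win_start <- seq.Date(cur_date, length.out = 2, by = "-1 year")[2]
    d[id == cur_id & date >= win_start & date < cur_date,
      sum(dummyflag %in% 1)]
  })
}

prior <- count_prior_flags(dt)
print(prior)
# [1] 0 0 0 1 1 0 0 1 1 2 2 1 1 0
```

For these rows the result agrees with the within1yr column of dtIwant2 (rows 31 to 44), including row 39.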

Solution:

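# For each (id, date) group: 0 if level is NA, otherwise the number of
# flagged rows for this id within the 365 days up to and including this date.
# 0L keeps both fifelse() branches the same type as .N (integer).
dtIhave[, res:=fifelse(
  is.na(level), 0L,
  dtIhave[id==.BY$id & between(date,(.BY$date-365), .BY$date) & dummyflag==1, .N]),
  by=.(id,date)
]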

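Explanation:

This uses a simple if-else statement (data.table's fifelse()), applied once per id/date group.

  • If level is NA, the result column is 0.
  • If level is not NA, we filter dtIhave down to the rows with this id (id==.BY$id) whose date lies between this date minus 365 (.BY$date-365) and this date (.BY$date), and then count those rows using .N.

The special symbol .BY can be used in j; it holds the values of the by columns as a list. That is, .BY$id holds the value of id, and .BY$date holds the value of date for the current group.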
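As a minimal illustration of how .BY feeds the inner filter, the sketch below runs the same pattern on a made-up four-row table. The names toy and n_flags and the data are assumptions for illustration, and the window test is written with explicit comparisons rather than between().

```r
library(data.table)

# Made-up data: one flagged event for id 1, none for id 2.
toy <- data.table(
  id = c(1, 1, 1, 2),
  date = as.Date(c("2020-01-01", "2020-06-01", "2021-06-01", "2020-03-01")),
  dummyflag = c(1, 0, 0, 0)
)

# For every (id, date) group, look the table up again and count flagged rows
# of the same id in the inclusive window [date - 365, date].
res <- toy[, .(n_flags = toy[id == .BY$id &
                               date >= .BY$date - 365 &
                               date <= .BY$date &
                               dummyflag == 1, .N]),
           by = .(id, date)]

print(res$n_flags)
# [1] 1 1 0 0
```

Because the window is inclusive at both ends, the flagged row on 2020-01-01 counts itself, which is the behaviour the question's edit wanted to avoid for flagged rows.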

Output:

    id       date     level dummyflag dayssincedummy res
 1:  1 2014-12-05  6.831267        NA             NA   0
 2:  1 2015-01-23  7.167449        NA             NA   0
 3:  1 2015-03-06  6.500918        NA             NA   0
 4:  1 2015-05-15  7.267101        NA             NA   0
 5:  1 2015-08-06  6.463343        NA             NA   0
 6:  1 2015-10-29  7.685856        NA             NA   0
 7:  1 2016-01-21  6.465524        NA             NA   0
 8:  1 2016-04-06  7.602419        NA             NA   0
 9:  1 2016-07-11  7.339648        NA             NA   0
10:  1 2016-10-03  6.049635        NA             NA   0
11:  1 2016-11-11        NA         1             NA   0
12:  1 2016-12-07  6.639634        NA             26   1
13:  1 2017-10-25  7.951767        NA            348   1
14:  1 2018-01-09  5.444352        NA            424   0
15:  1 2018-02-12  8.972908        NA            458   0
16:  1 2018-07-04  7.084616        NA            600   0
17:  1 2018-11-30  5.602063        NA            749   0
18:  2 2014-05-14  7.120637        NA             NA   0
19:  2 2014-09-03  7.260747        NA             NA   0
20:  2 2014-09-04  7.676648        NA             NA   0
21:  2 2014-10-15        NA         1             NA   0
22:  2 2014-11-08        NA         1             24   0
23:  2 2014-12-05  7.000000        NA             27   2
24:  2 2014-12-18        NA         1             40   0
25:  2 2014-12-20        NA         1              2   0
26:  2 2014-12-23        NA         1              3   0
27:  2 2015-05-15  6.137783        NA            143   5
28:  2 2015-08-19  7.088102        NA            239   5
29:  2 2016-06-23  7.620440        NA            548   0
30:  2 2017-04-21 10.325672        NA            850   0