Getting a unique count of users in the last 7 days

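I have a dataset in which I want to find the people who were active in the previous 7 days (that is, the 7 days before each row's date). For example,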

     date<- c('2009-01-03', '2009-01-03', '2009-01-03', '2009-01-04', '2009-01-05', '2009-02-01')
     person<- c('Abe', 'John', 'Abe', 'Kate', 'Jessica', 'Anu')
     df<- data.frame(date, person)

I want to create a column called last_seven_days_active that holds the unique count of all users who were active in the previous 7 days.

     last_seven_days_active
           0
           0
           0
           2
           3
           0

I tried the following. Any suggestions?

   library(zoo)
   df$last_seven_days_active <- rollsumr(df$person, k = 8, fill = NA)

A base R solution:

df$date <- as.Date(as.character(df$date))

df$last_seven_days_active <- with(df, sapply(date, function(x) length(unique(person[date >= x - 7 & date < x]))))

Output:

        date  person last_seven_days_active
1 2009-01-03     Abe                      0
2 2009-01-03    John                      0
3 2009-01-03     Abe                      0
4 2009-01-04    Kate                      2
5 2009-01-05 Jessica                      3
6 2009-02-01     Anu                      0
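
If the window length needs to vary, the same logic can be wrapped in a small function. This is only a sketch: the name count_active_before and its window argument are made up for illustration and are not part of the answers here. It assumes the window covers the `window` days strictly before each row's date, so the row's own date is excluded, which matches the expected output in the question.

```r
# Hypothetical helper (not from the original answers): for each row, count the
# distinct persons seen in the `window` days strictly before that row's date.
count_active_before <- function(dates, persons, window = 7) {
  # as.character() first so that factor input is parsed correctly
  dates <- as.Date(as.character(dates))
  vapply(seq_along(dates), function(i) {
    # TRUE for rows dated in [dates[i] - window, dates[i])
    in_window <- dates >= dates[i] - window & dates < dates[i]
    length(unique(persons[in_window]))
  }, integer(1))
}

# Sample data from the question
date <- c('2009-01-03', '2009-01-03', '2009-01-03', '2009-01-04', '2009-01-05', '2009-02-01')
person <- c('Abe', 'John', 'Abe', 'Kate', 'Jessica', 'Anu')
df <- data.frame(date, person)

df$last_seven_days_active <- count_active_before(df$date, df$person)
```

Like the sapply version, this compares every row against every other row, so it should be fine for small data but may get slow on large tables.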

An option with between and map:

library(dplyr)
library(purrr)
df %>%
    mutate(date = as.Date(date),  # convert first so the comparisons below use Date values
           last_seven_days_active = map_dbl(date, 
         ~ n_distinct(person[between(date, .x - 7, .x) & date != .x] )))
#       date  person last_seven_days_active
#1 2009-01-03     Abe                      0
#2 2009-01-03    John                      0
#3 2009-01-03     Abe                      0
#4 2009-01-04    Kate                      2
#5 2009-01-05 Jessica                      3
#6 2009-02-01     Anu                      0

An option using data.table:

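library(data.table)
# Convert to a data.table in place and parse the dates as IDate
setDT(df)[, date := as.IDate(date, format="%Y-%m-%d")]
# Lower bound of each row's 7-day look-back window
df[, days7ago := date - 7L]
# Non-equi self-join: for each row, match rows with days7ago <= date < that row's date,
# then count the distinct persons, dropping the NA that appears when nothing matches
df[, last_seven_days_active := 
    df[df, on=.(date>=days7ago, date<date), by=.EACHI, 
        length(unique(person[!is.na(person)]))]$V1
]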

Output:

         date  person   days7ago last_seven_days_active
1: 2009-01-03     Abe 2008-12-27                      0
2: 2009-01-03    John 2008-12-27                      0
3: 2009-01-03     Abe 2008-12-27                      0
4: 2009-01-04    Kate 2008-12-28                      2
5: 2009-01-05 Jessica 2008-12-29                      3
6: 2009-02-01     Anu 2009-01-25                      0