Compute New Occurrences in R?

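The data I'm working with gives a running total count for each of several groups on each recorded date. I'd like to add a new column that gives the new count for each group since the previous recorded date.

For example, this is the structure of the data I have: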

library(tidyverse)
library(lubridate)

df <- data.frame(date = as.Date(rep(c("2021-12-24","2021-12-27","2021-12-28","2021-12-31"),each=3),"%Y-%m-%d"), 
           group = rep(c("A","B","C"), 4),
           count = c(5,8,10,7,15,11,10,15,13,10,25,15))

This gives the following:

date        group   count
2021-12-24  A       5       
2021-12-24  B       8       
2021-12-24  C       10      
2021-12-27  A       7       
2021-12-27  B       15      
2021-12-27  C       11      
2021-12-28  A       10      
2021-12-28  B       15      
2021-12-28  C       13      
2021-12-31  A       10
2021-12-31  B       25      
2021-12-31  C       15  

I'd like to be able to produce something with the following values:

df2 <- data.frame(date = as.Date(rep(c("2021-12-24","2021-12-27","2021-12-28","2021-12-31"),each=3),"%Y-%m-%d"), 
           group = rep(c("A","B","C"), 4),
           count = c(5,8,10,7,15,11,10,15,13,10,25,15),
           new = c(0,0,0,2,7,1,3,0,2,0,10,2))

Or,

date        group   count   new
2021-12-24  A       5       0
2021-12-24  B       8       0 
2021-12-24  C       10      0
2021-12-27  A       7       2
2021-12-27  B       15      7
2021-12-27  C       11      1
2021-12-28  A       10      3
2021-12-28  B       15      0
2021-12-28  C       13      2
2021-12-31  A       10      0
2021-12-31  B       25      10
2021-12-31  C       15      2

I'm completely at a loss for an effective strategy. Many thanks.

tidyverse

df2 <- data.frame(date = as.Date(rep(c("2021-12-24","2021-12-27","2021-12-28","2021-12-31"),each=3),"%Y-%m-%d"), 
                  group = rep(c("A","B","C"), 4),
                  count = c(5,8,10,7,15,11,10,15,13,10,25,15),
                  new = c(0,0,0,2,7,1,3,0,2,0,10,2))

library(tidyverse)
df2 %>% 
  arrange(date) %>% 
  group_by(group) %>% 
  mutate(new = c(0, diff(count))) %>% 
  ungroup()
#> # A tibble: 12 x 4
#>    date       group count   new
#>    <date>     <chr> <dbl> <dbl>
#>  1 2021-12-24 A         5     0
#>  2 2021-12-24 B         8     0
#>  3 2021-12-24 C        10     0
#>  4 2021-12-27 A         7     2
#>  5 2021-12-27 B        15     7
#>  6 2021-12-27 C        11     1
#>  7 2021-12-28 A        10     3
#>  8 2021-12-28 B        15     0
#>  9 2021-12-28 C        13     2
#> 10 2021-12-31 A        10     0
#> 11 2021-12-31 B        25    10
#> 12 2021-12-31 C        15     2

Created on 2022-01-17 by the reprex package (v2.0.1)
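
As a possible variation on the tidyverse approach above, the same per-group difference can be written with `lag()` instead of `diff()`. This is only a sketch: it rebuilds the question's data under the name `df`, and the result name `res` is chosen here for illustration. Setting `default = first(count)` makes the first row of each group compare against itself, so its difference is 0.

```r
library(dplyr)

# Same data as in the question, without the precomputed `new` column
df <- data.frame(
  date  = as.Date(rep(c("2021-12-24", "2021-12-27", "2021-12-28", "2021-12-31"), each = 3)),
  group = rep(c("A", "B", "C"), 4),
  count = c(5, 8, 10, 7, 15, 11, 10, 15, 13, 10, 25, 15)
)

res <- df %>%
  arrange(date) %>%                # differences only make sense in date order
  group_by(group) %>%
  # lag() gives the previous count within the group; the first row falls
  # back to its own count, so its difference is 0
  mutate(new = count - lag(count, default = first(count))) %>%
  ungroup()

res
```

Compared with `c(0, diff(count))`, this form makes the "current minus previous" logic explicit, which some readers may find easier to adapt, for example to a different starting value.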

data.table

library(data.table)
setDT(df2)[order(date), new := c(0, diff(count)), by = group][]
#>           date group count new
#>  1: 2021-12-24     A     5   0
#>  2: 2021-12-24     B     8   0
#>  3: 2021-12-24     C    10   0
#>  4: 2021-12-27     A     7   2
#>  5: 2021-12-27     B    15   7
#>  6: 2021-12-27     C    11   1
#>  7: 2021-12-28     A    10   3
#>  8: 2021-12-28     B    15   0
#>  9: 2021-12-28     C    13   2
#> 10: 2021-12-31     A    10   0
#> 11: 2021-12-31     B    25  10
#> 12: 2021-12-31     C    15   2

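
A similar variation is possible in data.table using `shift()`, which returns the previous value within each group. The sketch below assumes the question's data rebuilt as a data.table named `dt` (an illustrative name); `fill = count[1L]` plays the same role as the leading 0 in `c(0, diff(count))`.

```r
library(data.table)

# Same data as in the question, without the precomputed `new` column
dt <- data.table(
  date  = as.Date(rep(c("2021-12-24", "2021-12-27", "2021-12-28", "2021-12-31"), each = 3)),
  group = rep(c("A", "B", "C"), 4),
  count = c(5, 8, 10, 7, 15, 11, 10, 15, 13, 10, 25, 15)
)

# order(date) in i makes each group's counts arrive in date order;
# shift() lags them by one row, and the first row of each group is
# filled with its own count so its difference is 0
dt[order(date), new := count - shift(count, fill = count[1L]), by = group]

dt[]
```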

base R

df2 <- df2[order(df2$date), ]

df2$new <- with(df2, ave(x = count, list(group), FUN = function(x) c(0, diff(x))))
df2
#>          date group count new
#> 1  2021-12-24     A     5   0
#> 2  2021-12-24     B     8   0
#> 3  2021-12-24     C    10   0
#> 4  2021-12-27     A     7   2
#> 5  2021-12-27     B    15   7
#> 6  2021-12-27     C    11   1
#> 7  2021-12-28     A    10   3
#> 8  2021-12-28     B    15   0
#> 9  2021-12-28     C    13   2
#> 10 2021-12-31     A    10   0
#> 11 2021-12-31     B    25  10
#> 12 2021-12-31     C    15   2

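
All three approaches depend on the rows being in date order within each group before the differences are taken. If reordering the data frame itself is undesirable, one option is to compute the differences through an ordering index and write them back to the original row positions. The sketch below is an assumption-laden illustration: `shuffled`, `ord`, and `new` are names introduced here, and the input is the question's data deliberately reversed to show that row order does not matter.

```r
# Question's data in reverse row order, to simulate unsorted input
df <- data.frame(
  date  = as.Date(rep(c("2021-12-24", "2021-12-27", "2021-12-28", "2021-12-31"), each = 3)),
  group = rep(c("A", "B", "C"), 4),
  count = c(5, 8, 10, 7, 15, 11, 10, 15, 13, 10, 25, 15)
)
shuffled <- df[nrow(df):1, ]

# Index that sorts by group, then by date within each group
ord <- order(shuffled$group, shuffled$date)

# Compute per-group differences on the sorted view, then place each
# value back at the row it came from
new <- numeric(nrow(shuffled))
new[ord] <- ave(shuffled$count[ord], shuffled$group[ord],
                FUN = function(x) c(0, diff(x)))
shuffled$new <- new

shuffled
```

The rows of `shuffled` stay where they were; only the `new` column is added.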