How to assign a single count to groups of adjacent identical values in R

Question

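I have a column of binary values indicating whether an event is present (1) or absent (0). Based on this column, I want to create a new column with a running count that assigns a single number to each group of adjacent events.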

event <- c(0,0,0,1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,1,1,0,0)

count <- c(0,0,0,1,0,0,0,2,2,2,2,2,0,0,0,0,0,0,3,3,0,0)

df <- data.frame(event, count)

The desired count should look like this:

event   count
0   0
0   0
0   0
1   1
0   0
0   0
0   0
1   2
1   2
1   2
1   2
1   2
0   0
0   0
0   0
0   0
0   0
0   0
1   3
1   3
0   0
0   0

Any suggestions on how to get there would be much appreciated. Thanks!

Answer 1

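With dplyr, the following flags every position where a 1 follows a 0 (the start of a run of events) and takes the cumulative sum of those flags. The result is then multiplied by event so that the zeros stay zero.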

library(dplyr)

df %>% 
  mutate(count_2 = event * cumsum(event == 1 & lag(event, default = 0) == 0))

This gives:

   event count count_2
1      0     0       0
2      0     0       0
3      0     0       0
4      1     1       1
5      0     0       0
6      0     0       0
7      0     0       0
8      1     2       2
9      1     2       2
10     1     2       2
11     1     2       2
12     1     2       2
13     0     0       0
14     0     0       0
15     0     0       0
16     0     0       0
17     0     0       0
18     0     0       0
19     1     3       3
20     1     3       3
21     0     0       0
22     0     0       0
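
To see why this works, the expression can be taken apart step by step. The following is a sketch in base R rather than dplyr: the shift is written by hand instead of calling lag(), and the names prev, starts and run_id are illustrative, not part of the answer.

```r
event <- c(0,0,0,1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,1,1,0,0)

# Previous value of each element, assuming a 0 before the first one
# (a hand-written stand-in for lag(event, default = 0))
prev <- c(0, head(event, -1))

# TRUE exactly where a run of 1s begins: a 1 preceded by a 0
starts <- event == 1 & prev == 0

# Number of runs that have started up to each position
run_id <- cumsum(starts)

# Multiplying by event sets the positions between runs back to 0
count_2 <- event * run_id
```

Here which(starts) is 4, 8 and 19, the first rows of the three groups in the table above.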

A base R variant:

# Pad with 0 before differencing so that a run starting at the first row is counted too
df$count_2 <- df$event * cumsum(diff(c(0, df$event)) == 1)
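
Where the padding zero goes in a diff-based one-liner matters when the column starts with an event. A small check, using a hypothetical vector x that is not from the question:

```r
x <- c(1, 1, 0, 1)  # hypothetical input that begins with an event

# Padding after differencing: the leading run has no 0 -> 1 step to detect
pad_after <- x * cumsum(c(0, diff(x) == 1))

# Padding before differencing: the implied leading 0 makes the first 1 a start
pad_before <- x * cumsum(diff(c(0, x)) == 1)

pad_after   # 0 0 0 1
pad_before  # 1 1 0 2
```

For the data in the question, which starts with 0, both forms return the same result.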

Answer 2

Using rle in base R:

df$count1 <- with(df, event * with(rle(event == 1),rep(cumsum(values), lengths)))
df

#   event count count1
#1      0     0      0
#2      0     0      0
#3      0     0      0
#4      1     1      1
#5      0     0      0
#6      0     0      0
#7      0     0      0
#8      1     2      2
#9      1     2      2
#10     1     2      2
#11     1     2      2
#12     1     2      2
#13     0     0      0
#14     0     0      0
#15     0     0      0
#16     0     0      0
#17     0     0      0
#18     0     0      0
#19     1     3      3
#20     1     3      3
#21     0     0      0
#22     0     0      0
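
The intermediate values of the rle approach can be inspected one at a time. This is only a sketch of the same computation split into steps; the names r, ids and count1 are illustrative.

```r
event <- c(0,0,0,1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,1,1,0,0)

# Run-length encoding of "is this an event?"
r <- rle(event == 1)
r$values   # FALSE TRUE FALSE TRUE FALSE TRUE FALSE
r$lengths  # 3 1 3 5 6 2 2

# Each TRUE run bumps the counter; the FALSE run after it inherits the same id
ids <- cumsum(r$values)  # 0 1 1 2 2 3 3

# Expand the ids back to full length, then zero out the non-event rows
count1 <- event * rep(ids, r$lengths)
```

The final multiplication by event is needed because the runs of zeros that follow a group would otherwise carry that group's id.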
