How to create a group based on a pattern from another column?

I have the following data frame:

dt <- data.frame(id = c("a","b","c","d","e","f","g","h","i","j"),
                 value = c(1,2,1,2,1,1,1,2,1,2))

> dt
   id value
1   a     1
2   b     2
3   c     1
4   d     2
5   e     1
6   f     1
7   g     1
8   h     2
9   i     1
10  j     2

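I would like to create a new column based on the value column, so that each 2 in value closes the current group and the next row starts a new group number. The expected output looks like this: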

dtgroup <- data.frame(id = c("a","b","c","d","e","f","g","h","i","j"),
                      value = c(1,2,1,2,1,1,1,2,1,2),
                      group = c(1,1,2,2,3,3,3,3,4,4))

> dtgroup
   id value group
1   a     1     1
2   b     2     1
3   c     1     2
4   d     2     2
5   e     1     3
6   f     1     3
7   g     1     3
8   h     2     3
9   i     1     4
10  j     2     4

Any ideas? Thanks!

Use cumsum, provided value contains no NA:

dt$group <- head(c(0,cumsum(dt$value==2))+1,-1)

dt

   id value group
1   a     1     1
2   b     2     1
3   c     1     2
4   d     2     2
5   e     1     3
6   f     1     3
7   g     1     3
8   h     2     3
9   i     1     4
10  j     2     4
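
The cumsum one-liner is easier to follow when split into steps. The following is a minimal sketch of the same computation. The intermediate names is_end and ends_so_far are introduced here for illustration only, and treating NA as "not a 2" (via %in%) is an assumption about the desired behaviour, not part of the answer above.

```r
value <- c(1, 2, 1, 2, 1, 1, 1, 2, 1, 2)

# TRUE where a group ends; %in% yields FALSE for NA rather than NA
is_end <- value %in% 2

# Running count of group ends seen so far, including the current row
ends_so_far <- cumsum(is_end)

# Shift the count down by one row so that the row holding the 2
# stays in the group it closes, then start numbering at 1
group <- head(c(0, ends_so_far), -1) + 1

print(group)
```

Without the shift, each 2 would already belong to the next group, which is not what the expected output shows.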

We can use findInterval like this:

> transform(dt, group = 1 + findInterval(seq_along(value), which(value == 2), left.open = TRUE))
   id value group
1   a     1     1
2   b     2     1
3   c     1     2
4   d     2     2
5   e     1     3
6   f     1     3
7   g     1     3
8   h     2     3
9   i     1     4
10  j     2     4
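
The left.open = TRUE argument is what keeps each 2 inside the group it closes. As a small sketch, the two settings can be compared side by side. The names ends, idx, closed_left and open_left are illustrative and do not appear in the answer above.

```r
value <- c(1, 2, 1, 2, 1, 1, 1, 2, 1, 2)

ends <- which(value == 2)   # row numbers where a group ends
idx  <- seq_along(value)    # row numbers 1..n

# Default: intervals are [end_i, end_{i+1}), so a row holding a 2
# is counted as already past that boundary
closed_left <- 1 + findInterval(idx, ends)

# left.open = TRUE: intervals are (end_i, end_{i+1}], so a row holding
# a 2 still belongs to the interval that it closes
open_left <- 1 + findInterval(idx, ends, left.open = TRUE)

print(rbind(closed_left, open_left))
```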

The same idea works with cut:

> transform(dt, group = as.integer(cut(seq_along(value), c(-Inf, which(value == 2)))))
   id value group
1   a     1     1
2   b     2     1
3   c     1     2
4   d     2     2
5   e     1     3
6   f     1     3
7   g     1     3
8   h     2     3
9   i     1     4
10  j     2     4
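
One thing to watch with cut: rows that come after the last 2 fall outside every interval and become NA. In the example data the last row is a 2, so this does not show. A hedged sketch of one way to cover that case is to append Inf to the breaks. The shorter vector value2 below is made up for illustration.

```r
# Hypothetical data whose last row is not a 2
value2 <- c(1, 2, 1, 2, 1)
idx <- seq_along(value2)
ends <- which(value2 == 2)

# Breaks stop at the last 2, so the trailing row is not covered
without_inf <- as.integer(cut(idx, c(-Inf, ends)))

# Appending Inf adds a final interval for any trailing rows
with_inf <- as.integer(cut(idx, c(-Inf, ends, Inf)))

print(without_inf)
print(with_inf)
```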

Another possibility: increment the counter whenever the value is 1 and the previous value (via dplyr::lag) is not 1.

dt$group <- with(dt, cumsum(value == 1 & dplyr::lag(value != 1, default = TRUE)))

   id value group
1   a     1     1
2   b     2     1
3   c     1     2
4   d     2     2
5   e     1     3
6   f     1     3
7   g     1     3
8   h     2     3
9   i     1     4
10  j     2     4
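
If you would rather not depend on dplyr, the lag can be written in base R. The sketch below wraps both rules in small helper functions, group_by_lag and group_by_cumsum, which are hypothetical names used only here. It also shows that the two rules agree on the example data but differ when two 2s appear in a row, because the lag rule only starts a group at a 1.

```r
# Lag rule in base R: start a new group at a 1 whose predecessor is not 1
group_by_lag <- function(value) {
  prev_not_one <- c(TRUE, head(value != 1, -1))  # first row counts as a start
  cumsum(value == 1 & prev_not_one)
}

# cumsum rule: every 2 closes the current group
group_by_cumsum <- function(value) {
  head(c(0, cumsum(value == 2)), -1) + 1
}

value <- c(1, 2, 1, 2, 1, 1, 1, 2, 1, 2)
print(group_by_lag(value))
print(group_by_cumsum(value))

# Made-up input with consecutive 2s, where the two rules diverge
value3 <- c(1, 2, 2, 1)
print(group_by_lag(value3))
print(group_by_cumsum(value3))
```

Which rule is appropriate depends on whether a 2 that directly follows another 2 should form a group of its own.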