R - Is there a way to generate row numbers that start back from 1 each time a new run of repeated observations begins?

The title probably doesn't make my question clear. Here is a short sample of my data:

|ID | group | 
|---|-------|
| 1 | Banana| 
| 2 | Apple | 
| 3 | Apple | 
| 4 | Apple | 
| 5 | Banana| 
| 6 | Banana| 
| 7 | Apple | 
| 8 | Apple | 

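Now I want to create a variable that numbers the rows within each group, but the numbering should start again from 1 every time the group value changes. At the moment my result basically looks like this: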

|ID | group | row_number |
|---|-------|------------|
| 1 | Banana| 1          |
| 2 | Apple | 1          |
| 3 | Apple | 2          |
| 4 | Apple | 3          | 
| 5 | Banana| 2          |
| 6 | Banana| 3          | 
| 7 | Apple | 4          |
| 8 | Apple | 5          |

When it should look like this:

|ID | group | row_number |
|---|-------|------------|
| 1 | Banana| 1          |
| 2 | Apple | 1          |
| 3 | Apple | 2          |
| 4 | Apple | 3          | 
| 5 | Banana| 1          |
| 6 | Banana| 2          | 
| 7 | Apple | 1          |
| 8 | Apple | 2          |

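I should mention that I have many observations, and not just the two groups Apple and Banana. So unfortunately, code in which I have to name groups such as "Apple" and "Banana" does not help. I tried to solve the problem like this: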

df1 <- df1 %>%
  group_by(group) %>%
  mutate(numbering = row_number())

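But the mistake here is obvious: the numbering continues across the whole group instead of restarting. I also tried to work around it, but found it very difficult. I would be grateful if anyone has a solution!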

Here are 3 ways -

Base R -

df <- transform(df, row_number = ave(ID, with(rle(group), 
                 rep(seq_along(values), lengths)), FUN = seq_along))
df

#  ID  group row_number
#1  1 Banana          1
#2  2  Apple          1
#3  3  Apple          2
#4  4  Apple          3
#5  5 Banana          1
#6  6 Banana          2
#7  7  Apple          1
#8  8  Apple          2
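
To see what the grouping vector passed to `ave()` looks like, the `rle()` step can be run on its own. This is only a minimal sketch on the same sample data as this answer (recreated here so the snippet runs by itself); `r` and `run_id` are illustrative names, not part of the original code.

```r
# Sample data, recreated so the snippet is self-contained
df <- data.frame(ID = 1:8,
                 group = c("Banana", "Apple", "Apple", "Apple",
                           "Banana", "Banana", "Apple", "Apple"),
                 stringsAsFactors = FALSE)

# rle() collapses consecutive repeats into (values, lengths) pairs
r <- rle(df$group)
r$values    # "Banana" "Apple" "Banana" "Apple"
r$lengths   # 1 3 2 2

# Repeat each run's index by its length: one id per consecutive run
run_id <- rep(seq_along(r$values), r$lengths)
run_id      # 1 2 2 2 3 3 4 4

# Number the rows within each run
row_number <- ave(df$ID, run_id, FUN = seq_along)
row_number  # 1 1 2 3 1 2 1 2
```

Because the id changes whenever the value changes, the second run of "Apple" gets a different id from the first one, which is what makes the count restart.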

dplyr -

library(dplyr)

df %>%
  group_by(grp = cumsum(group != lag(group, default = first(group)))) %>%
  mutate(row_number = row_number()) %>%
  ungroup() %>%
  select(-grp)
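
The grouping key in the dplyr version is a running count of how often `group` differs from the previous row. Below is a rough base R trace of that idea on the same sample values; `changed` and `grp` are illustrative names, and the first row is treated as "no change", which mirrors `default = first(group)`.

```r
group <- c("Banana", "Apple", "Apple", "Apple",
           "Banana", "Banana", "Apple", "Apple")

# TRUE wherever the value differs from the previous row (first row: FALSE)
changed <- c(FALSE, group[-1] != group[-length(group)])
changed  # FALSE TRUE FALSE FALSE TRUE FALSE TRUE FALSE

# Running total of changes: constant within a run, +1 at each switch
grp <- cumsum(changed)
grp      # 0 1 1 1 2 2 3 3
```

Grouping by `grp` instead of `group` is what keeps the two separate "Apple" runs apart.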

data.table -

library(data.table)

setDT(df)[, row_number := seq_len(.N), rleid(group)]
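
If data.table is not installed, a similar one-liner seems possible in base R, since `sequence()` concatenates `1:n` for each run length returned by `rle()`. This is a sketch under that assumption rather than part of the three methods above, again on the same sample data.

```r
df <- data.frame(ID = 1:8,
                 group = c("Banana", "Apple", "Apple", "Apple",
                           "Banana", "Banana", "Apple", "Apple"),
                 stringsAsFactors = FALSE)

# Run lengths are 1, 3, 2, 2, so sequence() yields 1, 1:3, 1:2, 1:2
df$row_number <- sequence(rle(df$group)$lengths)
df$row_number  # 1 1 2 3 1 2 1 2
```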

Data

df <- structure(list(ID = 1:8, group = c("Banana", "Apple", "Apple", 
"Apple", "Banana", "Banana", "Apple", "Apple")), row.names = c(NA, 
-8L), class = "data.frame")

Another way:

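library(dplyr)

df %>% 
  mutate(Temp = data.table::rleid(group)) %>%  # run id: changes whenever group changes
  group_by(Temp) %>% 
  mutate(row_number = row_number()) %>%
  ungroup() %>%  # without this, select() keeps the grouping column Temp
  select(-Temp)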