R: add new variable based on value change using dplyr

I have the following dataset:

id <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2)
v1 <- c("A","A","A","A","B","B","B","B","B","A","A","A","A","A","A","A","A","B","B","B","B","B","A","A","A","A")  
v2 <- c(1,1,1,2,2,2,2,2,2,2,3,3,3,1,1,1,2,2,2,2,2,2,2,3,3,3)
mydata<- data.frame(id,v1,v2)

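v1 indicates whether the id is moving (A = moving, B = not moving), and v2 is the sequence of locations each id passes through, 1->2->3. I want to add a column that says whether the id is moving from 1 to 2, staying at 2, or moving from 2 to 3.

Specifically, I am trying to add a column to the data frame based on these conditions: if v1 = A (id moving) and v2 = 1, or v2 = 2 when v2 was previously 1, then phase = "Going"; if v1 = B (not moving), then phase = "Staying"; and if v1 = A (moving) and v2 = 3, or v2 = 2 after the "Staying" phase, then phase = "Returning".

My expected output is: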

phase <- c("Going","Going","Going","Going","Staying","Staying","Staying","Staying","Staying","Returning","Returning","Returning","Returning","Going","Going","Going","Going","Staying","Staying","Staying","Staying","Staying","Returning","Returning","Returning","Returning")

mydata2 <- cbind(mydata, phase)

I tried the following:

library(dplyr)
mydata <- mydata %>%
  group_by(id) %>%
  mutate(phase = case_when(
    v1 == "A" & (v2 == 1 | v2 == 2) ~ "Going",
    v1 == "A" & (v2 == 2 | v2 == 3) ~ "Returning",
    v1 == "B" ~ "Staying"))

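But (v2 == 1 | v2 == 2) does not take the order of the values into account, so a moving row with v2 = 2 is always labelled "Going", even after the "Staying" phase. Any ideas? Apologies if this sounds complicated; I am happy to clarify further.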

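After grouping by 'id', create a grouping variable on 'v1' based on whether adjacent elements are the same (rleid), then use case_when: if the 'v1' value is "B", return 'Staying'; if 'v1' is 'A' and 'grp' is 1, return 'Going'; and if 'v1' is 'A' and 'grp' is greater than 1, return 'Returning'.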

library(data.table)  # for rleid()
library(dplyr)

mydata %>%
  group_by(id) %>%
  # number each run of identical adjacent v1 values within an id
  mutate(grp = rleid(v1),
         phase = case_when(
           v1 == 'B' ~ 'Staying',
           v1 == 'A' & grp == 1 ~ 'Going',
           v1 == 'A' & grp > 1 ~ 'Returning')) %>%
  ungroup() %>%
  select(-grp)
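
To see why the run id separates the two "A" stretches, it may help to look at what rleid returns for a single id and then compare the result with the expected output from the question. The following is a minimal sketch: the data are rebuilt with rep() (assumed to be equivalent to the vectors in the question), and the names run_id, expected, and result are only illustrative.

```r
library(data.table)  # provides rleid()
library(dplyr)

# Same data as in the question, written compactly with rep()
id <- rep(c(1, 2), each = 13)
v1 <- rep(rep(c("A", "B", "A"), times = c(4, 5, 4)), times = 2)
v2 <- rep(c(1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3), times = 2)
mydata <- data.frame(id, v1, v2)

# rleid() gives every run of identical adjacent values its own number,
# so the first "A" run is 1, the "B" run is 2, and the second "A" run is 3
run_id <- rleid(mydata$v1[mydata$id == 1])
print(run_id)
# 1 1 1 1 2 2 2 2 2 3 3 3 3

# Expected phases from the question, rebuilt the same way
expected <- rep(rep(c("Going", "Staying", "Returning"), times = c(4, 5, 4)),
                times = 2)

result <- mydata %>%
  group_by(id) %>%
  mutate(grp = rleid(v1),
         phase = case_when(
           v1 == "B" ~ "Staying",
           v1 == "A" & grp == 1 ~ "Going",
           v1 == "A" & grp > 1 ~ "Returning")) %>%
  ungroup() %>%
  select(-grp)

# Stops with an error if the computed phases differ from the expected ones
stopifnot(identical(result$phase, expected))
```

Note that this labelling relies on each id starting with an "A" run. If an id started with "B", its first "A" run would get grp = 2 and be labelled 'Returning'.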