Using lead with dplyr to compute the difference between two time stamps

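I would like to find the difference between two time stamps: first find the time stamp in a column for a row that meets the condition "Start", then find the time stamp, in the same column, of the first following row that meets another condition, "Stop". Basically, we use a program to "start" a behavior and "stop" a behavior so that we can calculate the duration of the behavior.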

I have tried adapting the code found in this post:

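But I can't figure out how to get lead to pick up a later row in the same column that meets the condition. It is complicated by the fact that there can be "event" behaviors that have a "start" but no "stop". Example data frame: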

Data
Behavior             Modifier_1           Time_relative_s              
BodyLength           Start                122.11      
Growl                Start                129.70
Body Length          Stop                 132.26      
Body Length          Start                157.79      
Body Length          Stop                 258.85      
Body Length          Start                270.12    
Bark                 Start                272.26
Growl                Start                275.68
Body Length          Stop                 295.37

I would like this:

Behavior             Modifier_1           Time_relative_s       diff             
BodyLength           Start                122.11                10.15
Growl                Start                129.70                 
Body Length          Stop                 132.26                
Body Length          Start                157.79                101.06  
Body Length          Stop                 258.85      
Body Length          Start                270.12                25.25    
Bark                 Start                272.26
Growl                Start                275.68
Body Length          Stop                 295.37

I have tried using a dplyr pipe:

test<-u%>%
    filter(Modifier_1 %in% c("Start","Stop")) %>%
    arrange(Time_Relative_s) %>%
    mutate(diff = lead(Time_Relative_s, default = first(Time_Relative_s=="Stop")-Time-Relative_s)

But I must not be using lead correctly, because this just returns Time_Relative_s in the diff column for me. Any suggestions? Thanks for your help!

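We may need to create a grouping variable based on the occurrence of 'Stop', and then get the difference of the 'Time_relative_s' values corresponding to the positions of the first 'Start' and the first 'Stop' in 'Modifier_1' within each group.
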
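library(dplyr)
df1 %>% 
   # start a new group on the row after each "Stop"
   group_by(grp = cumsum(lag(Modifier_1 == "Stop", default = FALSE))) %>% 
   # within each group: time of the first "Stop" minus time of the first "Start"
   mutate(diff = Time_relative_s[match("Stop", Modifier_1)] - 
                  Time_relative_s[match("Start", Modifier_1)], 
          # keep the value only on the first row of each group
          diff = replace(diff, row_number() > 1, NA_real_)) %>%
   ungroup() %>%
   select(-grp)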
# A tibble: 9 x 4
#  Behavior    Modifier_1 Time_relative_s  diff
#  <chr>       <chr>                <dbl> <dbl>
#1 BodyLength  Start                 122.  10.1
#2 Growl       Start                 130.  NA  
#3 Body Length Stop                  132.  NA  
#4 Body Length Start                 158. 101. 
#5 Body Length Stop                  259.  NA  
#6 Body Length Start                 270.  25.2
#7 Bark        Start                 272.  NA  
#8 Growl       Start                 276.  NA  
#9 Body Length Stop                  295.  NA  
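
To see why the grouping works, it may help to look at the intermediate 'grp' column on its own. The following is only a small sketch: the names `toy`, `toy_grp` and `is_stop` are made up for illustration, and the data is the example data reduced to the two relevant columns.

```r
library(dplyr)

# Illustrative copy of the example data (only the columns needed here)
toy <- data.frame(
  Modifier_1 = c("Start", "Start", "Stop", "Start", "Stop",
                 "Start", "Start", "Start", "Stop"),
  Time_relative_s = c(122.11, 129.7, 132.26, 157.79, 258.85,
                      270.12, 272.26, 275.68, 295.37)
)

toy_grp <- toy %>%
  mutate(is_stop = Modifier_1 == "Stop",
         # lag() shifts the flag down one row, so the counter increases
         # on the row *after* each "Stop", not on the "Stop" row itself
         grp = cumsum(lag(is_stop, default = FALSE)))

toy_grp$grp
# [1] 0 0 0 1 1 2 2 2 2
```

Each "Stop" row therefore closes its own group, and any extra "Start" rows without a matching "Stop" (Growl, Bark) simply fall into the group of the next "Stop" and get NA after the replace step.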

Data

df1 <- structure(list(Behavior = c("BodyLength", "Growl", "Body Length", 
"Body Length", "Body Length", "Body Length", "Bark", "Growl", 
"Body Length"), Modifier_1 = c("Start", "Start", "Stop", "Start", 
"Stop", "Start", "Start", "Start", "Stop"), Time_relative_s = c(122.11, 
129.7, 132.26, 157.79, 258.85, 270.12, 272.26, 275.68, 295.37
)), row.names = c(NA, -9L), class = "data.frame")