R Difference in time between rows

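I pieced together the code below from other SO answers, but I am getting an error message. I searched SO for similar errors and solutions but could not figure it out, so any help is appreciated.

For each group ("id"), I want to get the difference between the start times of consecutive rows.

Reproducible data: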

require(dplyr)
df <-data.frame(id=as.numeric(c("1","1","1","2","2","2")), 
            start= c("1/31/17 10:00","1/31/17 10:02","1/31/17 10:45", 
                             "2/10/17 12:00", "2/10/17 12:20","2/11/17 09:40"))
time <- strptime(df$start, format = "%m/%d/%y %H:%M")
df %>%
group_by(id)%>%
mutate(diff = time - lag(time),
     diff_mins = as.numeric(diff, units = 'mins'))

This gives me the error:

Error in mutate_impl(.data, dots) :
  Column diff must be length 3 (the group size) or one, not 6
In addition: Warning message:
In unclass(time1) - unclass(time2) :
  longer object length is not a multiple of shorter object length

Do you mean something like this?

There is no need for lag here; a simple diff on the grouped start times is enough.

df %>%
    mutate(start = as.POSIXct(start, format = "%m/%d/%y %H:%M")) %>%
    group_by(id) %>%
    mutate(diff = c(0, diff(start)))
## A tibble: 6 x 3
## Groups:   id [2]
#     id start                diff
#  <dbl> <dttm>              <dbl>
#1    1. 2017-01-31 10:00:00    0.
#2    1. 2017-01-31 10:02:00    2.
#3    1. 2017-01-31 10:45:00   43.
#4    2. 2017-02-10 12:00:00    0.
#5    2. 2017-02-10 12:20:00   20.
#6    2. 2017-02-11 09:40:00 1280.
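
One caveat on the output above: diff() on date-times picks its units from the smallest gap in the data, and c(0, ...) then drops the units attribute. In this data both groups happen to come out in minutes, but that is not guaranteed. The following is a minimal base R sketch of the issue; the vectors t1 and t2 and the tz = "UTC" setting are assumptions made up for illustration, not part of the question's data.

```r
# tz = "UTC" is an assumption, used only to keep the example independent of local settings
t1 <- as.POSIXct(c("2017-01-31 10:00", "2017-01-31 10:02", "2017-01-31 10:45"), tz = "UTC")
# hypothetical second group whose smallest gap is two hours
t2 <- as.POSIXct(c("2017-02-10 12:00", "2017-02-10 14:00", "2017-02-11 09:40"), tz = "UTC")

units(diff(t1))  # "mins": the smallest gap is 2 minutes
units(diff(t2))  # "hours": the smallest gap is 2 hours

# Asking for minutes explicitly keeps every group on the same scale
mins1 <- c(0, as.numeric(diff(t1), units = "mins"))
mins2 <- c(0, as.numeric(diff(t2), units = "mins"))
```

Inside the grouped mutate, the same idea would be c(0, as.numeric(diff(start), units = "mins")).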

You can use lag and difftime (per Hadley):

df %>%
  mutate(time = as.POSIXct(start, format = "%m/%d/%y %H:%M")) %>%
  group_by(id) %>%
  mutate(diff = difftime(time, lag(time)))

# A tibble: 6 x 4
# Groups:   id [2]
     id start         time                diff  
  <dbl> <fct>         <dttm>              <time>
1    1. 1/31/17 10:00 2017-01-31 10:00:00 <NA>  
2    1. 1/31/17 10:02 2017-01-31 10:02:00 2     
3    1. 1/31/17 10:45 2017-01-31 10:45:00 43    
4    2. 2/10/17 12:00 2017-02-10 12:00:00 <NA>  
5    2. 2/10/17 12:20 2017-02-10 12:20:00 20    
6    2. 2/11/17 09:40 2017-02-11 09:40:00 1280
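
As for why the original attempt failed: time was created outside df as a separate vector of length 6, so inside each group of 3 rows the expression time - lag(time) still had length 6. strptime() also returns POSIXlt, which is why the conversion here is done with as.POSIXct inside mutate. Below is a sketch of the full pipeline that also produces the numeric diff_mins column the question asked for. The tz = "UTC" argument and the name res are assumptions added for illustration.

```r
library(dplyr)

df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  start = c("1/31/17 10:00", "1/31/17 10:02", "1/31/17 10:45",
            "2/10/17 12:00", "2/10/17 12:20", "2/11/17 09:40")
)

res <- df %>%
  # tz = "UTC" is an assumption; drop it to use the local time zone
  mutate(time = as.POSIXct(start, format = "%m/%d/%y %H:%M", tz = "UTC")) %>%
  group_by(id) %>%
  # explicit units so the result does not depend on how large the gaps are
  mutate(diff = difftime(time, lag(time), units = "mins"),
         diff_mins = as.numeric(diff, units = "mins")) %>%
  ungroup()

res$diff_mins
```

The first row of each group is NA because there is no previous row to compare against; replace it with 0 if that suits the analysis better.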