使用 ISO 8601 数据格式计算 R 中的时差

Calculating time difference in R with ISO 8601 data format

我有一个包含两列的数据框,一列用于开始时间,另一列用于结束时间。我想以分钟(64 行)为单位计算每一行的持续时间。我无法让它与 difftime 一起使用,我相信这是因为每个条目的格式都是 ISO 8601 时间格式。

这是我尝试添加差异列的方法:

recordingdata %>%
  mutate(difference = difftime(recordingdata$end, recordingdata$start))

这给我一个错误说明如下:

> recordingdata %>%
+   mutate(difference = difftime(recordingdata$end, recordingdata$start))
Error: Problem with `mutate()` column `difference`.
i `difference = difftime(recordingdata$end, recordingdata$start)`.
x do not know how to convert 'time1' to class “POSIXct”

有谁知道如何使用 ISO 8601 格式计算 R 中的时差?

作为参考,我已经包含了 dput()

> dput(recordingdata)
structure(list(start = list("2018-10-04T14:00:12.9573672-04:00", 
    "2018-10-25T11:05:29.8308453-04:00", "2018-10-11T11:04:04.9321231-04:00", 
    "2019-01-24T14:02:47.2197654-05:00", "2019-01-16T15:31:36.3436349-05:00", 
    "2018-11-27T11:04:25.7652995-05:00", "2018-09-27T13:59:20.7167538-04:00", 
    "2018-10-23T11:03:57.2951439-04:00", "2019-01-25T15:30:05.5208913-05:00", 
    "2018-10-23T14:00:00.8628457-04:00", "2018-11-01T14:02:13.1633914-04:00", 
    "2019-02-08T15:30:28.7534989-05:00", "2019-02-05T14:03:00.3375584-05:00", 
    "2018-09-25T13:58:40.1592663-04:00", "2019-02-19T14:01:14.3345036-05:00", 
    "2018-09-18T13:58:45.8510094-04:00", "2019-02-21T14:02:45.150166-05:00", 
    "2018-10-18T14:01:02.6088515-04:00", "2018-09-25T11:05:31.8207547-04:00", 
    "2018-10-25T14:00:45.1210745-04:00", "2019-02-06T15:31:15.3336905-05:00", 
    "2019-02-26T14:03:22.9071858-05:00", "2019-02-01T15:30:39.2000082-05:00", 
    "2019-01-28T15:34:57.1529907-05:00", "2018-11-01T11:02:41.135301-04:00", 
    "2018-11-29T11:03:45.6709703-05:00", "2018-09-13T11:02:01.3779116-04:00", 
    "2018-11-13T14:04:02.8245491-05:00", "2018-10-09T13:59:04.277118-04:00", 
    "2019-01-18T15:30:07.1329401-05:00", "2018-10-16T14:00:07.6894384-04:00", 
    "2019-01-09T15:30:39.088267-05:00", "2018-10-02T11:04:09.4680471-04:00", 
    "2019-01-29T14:02:29.8846692-05:00", "2018-10-11T14:01:07.3154994-04:00", 
    "2018-10-18T11:03:26.459923-04:00", "2019-01-31T14:02:41.3065397-05:00", 
    "2018-10-09T11:15:11.9923903-04:00", "2018-10-02T13:59:29.7862154-04:00", 
    "2019-02-14T14:02:41.9698094-05:00", "2018-11-29T14:02:36.501325-05:00", 
    "2019-01-07T15:30:33.7323249-05:00", "2018-09-20T14:00:11.5101096-04:00", 
    "2018-09-20T11:01:41.3236546-04:00", "2019-02-28T14:02:12.8318035-05:00", 
    "2019-01-30T15:30:27.186557-05:00", "2019-01-11T15:30:30.9134512-05:00", 
    "2019-02-12T14:01:59.4161106-05:00", "2019-01-22T14:00:18.5297617-05:00", 
    "2018-09-27T11:02:11.2636149-04:00", "2018-10-16T11:03:18.0557298-04:00", 
    "2018-09-18T11:01:56.1963404-04:00", "2018-10-04T11:02:53.3520421-04:00", 
    "2019-01-14T15:30:19.5026063-05:00", "2018-09-13T13:59:31.9208709-04:00", 
    "2018-11-15T14:02:09.9181527-05:00", "2018-10-30T11:04:19.95311-04:00", 
    "2019-02-11T15:30:06.3869495-05:00", "2018-11-27T14:00:33.7035384-05:00", 
    "2018-10-30T13:59:52.338132-04:00", "2018-11-15T11:02:34.4268946-05:00", 
    "2018-11-13T11:06:32.3980301-05:00", "2019-01-23T15:30:25.7803475-05:00", 
    "2019-02-04T15:32:11.1189642-05:00"), end = list("2018-10-04T15:18:07.8094323-04:00", 
    "2018-10-25T12:16:14.0266186-04:00", "2018-10-11T12:13:38.1115286-04:00", 
    "2019-01-24T15:16:50.0124679-05:00", "2019-01-16T16:21:15.8140362-05:00", 
    "2018-11-27T12:16:36.4015273-05:00", "2018-09-27T15:16:23.496883-04:00", 
    "2018-10-23T12:18:01.4381006-04:00", "2019-01-25T16:24:26.6804509-05:00", 
    "2018-10-23T15:16:52.1525905-04:00", "2018-11-01T15:17:49.7126424-04:00", 
    "2019-02-08T16:22:27.4863968-05:00", "2019-02-05T15:14:28.9686029-05:00", 
    "2018-09-25T15:17:44.9385262-04:00", "2019-02-19T15:18:59.8835244-05:00", 
    "2018-09-18T15:17:51.1574075-04:00", "2019-02-21T15:13:51.2819396-05:00", 
    "2018-10-18T15:16:12.3872796-04:00", "2018-09-25T12:15:31.357178-04:00", 
    "2018-10-25T15:13:41.8855028-04:00", "2019-02-06T16:22:43.7091322-05:00", 
    "2019-02-26T15:15:36.6594403-05:00", "2019-02-01T16:21:50.39962-05:00", 
    "2019-01-28T16:25:23.3835141-05:00", "2018-11-01T12:18:40.9399271-04:00", 
    "2018-11-29T12:13:23.9690959-05:00", "2018-09-13T12:15:17.3662996-04:00", 
    "2018-11-13T15:18:22.8184422-05:00", "2018-10-09T15:12:53.6660084-04:00", 
    "2019-01-18T16:20:54.8697082-05:00", "2018-10-16T15:17:08.8380856-04:00", 
    "2019-01-09T16:26:00.0458205-05:00", "2018-10-02T12:10:00.1000249-04:00", 
    "2019-01-29T15:14:28.8572584-05:00", "2018-10-11T15:14:45.9939861-04:00", 
    "2018-10-18T12:21:44.0505845-04:00", "2019-01-31T15:17:36.8727957-05:00", 
    "2018-10-09T12:14:49.6801177-04:00", "2018-10-02T15:16:27.0662622-04:00", 
    "2019-02-14T15:16:15.4757278-05:00", "2018-11-29T15:15:11.5634547-05:00", 
    "2019-01-07T16:21:30.1605-05:00", "2018-09-20T15:14:41.7556995-04:00", 
    "2018-09-20T12:15:10.554925-04:00", "2019-02-28T15:18:06.3375871-05:00", 
    "2019-01-30T16:19:36.2267236-05:00", "2019-01-11T16:22:14.3376133-05:00", 
    "2019-02-12T15:16:28.885949-05:00", "2019-01-22T15:14:42.0156433-05:00", 
    "2018-09-27T12:16:07.2590492-04:00", "2018-10-16T12:16:03.0998874-04:00", 
    "2018-09-18T12:16:10.87387-04:00", "2018-10-04T12:15:51.2675903-04:00", 
    "2019-01-14T16:23:10.624326-05:00", "2018-09-13T15:16:00.7292027-04:00", 
    "2018-11-15T15:16:21.0881453-05:00", "2018-10-30T12:15:01.1762795-04:00", 
    "2019-02-11T16:24:32.616885-05:00", "2018-11-27T15:18:54.639967-05:00", 
    "2018-10-30T15:15:57.3071154-04:00", "2018-11-15T12:16:33.4834433-05:00", 
    "2018-11-13T12:17:39.4884988-05:00", "2019-01-23T16:21:06.0884362-05:00", 
    "2019-02-04T16:22:04.0857079-05:00")), class = "data.frame", row.names = c(NA, 
-64L))

由于错误消息建议使用 difftime,您需要 class POSIXct 的值。在数据中,值是存储在列表中的字符类型。另一件需要注意的事情是数据包含来自不同时区(-04:00-05:00)的时间戳。值得庆幸的是,来自 lubridateymd_hms 可以自动为我们处理它。

library(dplyr)
library(lubridate)

recordingdata %>%
  mutate(across(c(start, end), unlist), 
         across(c(start, end), ymd_hms), 
         difference = difftime(end, start, units = 'mins'))

#                 start                 end    difference
#1  2018-10-04 18:00:12 2018-10-04 19:18:07 77.91420 mins
#2  2018-10-25 15:05:29 2018-10-25 16:16:14 70.73660 mins
#3  2018-10-11 15:04:04 2018-10-11 16:13:38 69.55299 mins
#4  2019-01-24 19:02:47 2019-01-24 20:16:50 74.04655 mins
#5  2019-01-16 20:31:36 2019-01-16 21:21:15 49.65784 mins
#6  2018-11-27 16:04:25 2018-11-27 17:16:36 72.17727 mins
#7  2018-09-27 17:59:20 2018-09-27 19:16:23 77.04634 mins
#8  2018-10-23 15:03:57 2018-10-23 16:18:01 74.06905 mins
#...
#...