Calculating time in minutes between two dates in R (POSIXct format)
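I would like to calculate the time in minutes between two events in R, for a specific ID only (for example 892530): from the 'eventBegins' event to the first human2 event (put another way, the minimum datetime among the human2 events). Note that the 'datetime' variable is in POSIXct format. I can't seem to do this with a mix of dplyr and base R (min()), and wanted to pick your brain for a solution. Ultimately, I am trying to get the average time difference for each id in the dataset.
Here is the output generated by dput(head(df, 30)):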
structure(list(visitor_id = c(175464, 175464, 175464, 892530,
892530, 892530, 892530, 892530, 892530, 1006916, 1006916, 1006916,
1336852, 1336852, 1336852, 2370624, 2370624, 2370624, 3347200,
3347200, 3347200, 4539320, 4539320, 4539320, 4539320, 4539320,
4666936, 4666936, 4666936, 4697670), event_type = c("human1", "human1",
"human2", "human1", "eventBegins", "human1", "human2", "human2", "eventEnds", "human1",
"human1", "human1", "human1", "human1", "human1", "human1", "human1", "human1", "human1",
"human1", "human1", "human1", "eventBegins", "human2", "human2", "human1", "human1",
"human1", "human1", "human1"), datetime = structure(c(1618678444,
1618678444, 1618678444, 1617980667, 1617980668, 1617980668, 1617980668,
1617980679, 1617980679, 1617530138, 1617530138, 1617530138, 1617299837,
1617299837, 1617299837, 1617621792, 1617621792, 1617621792, 1618145874,
1618145874, 1618145874, 1619013964, 1619013964, 1619013964, 1619014004,
1619014005, 1617282418, 1617282418, 1617282418, 1619543098), class = c("POSIXct",
"POSIXt"), tzone = "UTC")), row.names = c(NA, 30L), class = "data.frame")
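Please let me know if you have any ideas on how to solve this. I have run out of ideas. TIA.
EDIT: Using the sample data from the OP.
Comment: Do you think you could use the following? If not, please try to write (by hand) a data frame with the expected result.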
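library(tidyverse)
df %>%
  group_by(visitor_id) %>%
  mutate(
    # earliest eventBegins per visitor (Inf, with a warning, when the visitor has none)
    event_begins_per_id = min(datetime[event_type == "eventBegins"]),
    # offset of every event from that start, as a difftime
    time_diff = datetime - event_begins_per_id,
    # mean offset per visitor, excluding the eventBegins/eventEnds rows
    avg_time_diff_per_id = mean(time_diff[!event_type %in% c("eventBegins", "eventEnds")]))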
Result
# A tibble: 30 x 6
# Groups: visitor_id [9]
visitor_id event_type datetime event_begins_per_id time_diff avg_time_diff_per_id
<dbl> <chr> <dttm> <dttm> <drtn> <drtn>
1 175464 human1 2021-04-17 16:54:04 NA NA -Inf secs -Inf secs
2 175464 human1 2021-04-17 16:54:04 NA NA -Inf secs -Inf secs
3 175464 human2 2021-04-17 16:54:04 NA NA -Inf secs -Inf secs
4 892530 human1 2021-04-09 15:04:27 2021-04-09 15:04:28 -1 secs 2.50 secs
5 892530 eventBegins 2021-04-09 15:04:28 2021-04-09 15:04:28 0 secs 2.50 secs
6 892530 human1 2021-04-09 15:04:28 2021-04-09 15:04:28 0 secs 2.50 secs
7 892530 human2 2021-04-09 15:04:28 2021-04-09 15:04:28 0 secs 2.50 secs
8 892530 human2 2021-04-09 15:04:39 2021-04-09 15:04:28 11 secs 2.50 secs
9 892530 eventEnds 2021-04-09 15:04:39 2021-04-09 15:04:28 11 secs 2.50 secs
10 1006916 human1 2021-04-04 09:55:38 NA NA -Inf secs -Inf secs
11 1006916 human1 2021-04-04 09:55:38 NA NA -Inf secs -Inf secs
12 1006916 human1 2021-04-04 09:55:38 NA NA -Inf secs -Inf secs
13 1336852 human1 2021-04-01 17:57:17 NA NA -Inf secs -Inf secs
14 1336852 human1 2021-04-01 17:57:17 NA NA -Inf secs -Inf secs
15 1336852 human1 2021-04-01 17:57:17 NA NA -Inf secs -Inf secs
16 2370624 human1 2021-04-05 11:23:12 NA NA -Inf secs -Inf secs
17 2370624 human1 2021-04-05 11:23:12 NA NA -Inf secs -Inf secs
18 2370624 human1 2021-04-05 11:23:12 NA NA -Inf secs -Inf secs
19 3347200 human1 2021-04-11 12:57:54 NA NA -Inf secs -Inf secs
20 3347200 human1 2021-04-11 12:57:54 NA NA -Inf secs -Inf secs
21 3347200 human1 2021-04-11 12:57:54 NA NA -Inf secs -Inf secs
22 4539320 human1 2021-04-21 14:06:04 2021-04-21 14:06:04 0 secs 20.25 secs
23 4539320 eventBegins 2021-04-21 14:06:04 2021-04-21 14:06:04 0 secs 20.25 secs
24 4539320 human2 2021-04-21 14:06:04 2021-04-21 14:06:04 0 secs 20.25 secs
25 4539320 human2 2021-04-21 14:06:44 2021-04-21 14:06:04 40 secs 20.25 secs
26 4539320 human1 2021-04-21 14:06:45 2021-04-21 14:06:04 41 secs 20.25 secs
27 4666936 human1 2021-04-01 13:06:58 NA NA -Inf secs -Inf secs
28 4666936 human1 2021-04-01 13:06:58 NA NA -Inf secs -Inf secs
29 4666936 human1 2021-04-01 13:06:58 NA NA -Inf secs -Inf secs
30 4697670 human1 2021-04-27 17:04:58 NA NA -Inf secs -Inf secs
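The result above is in seconds and shows -Inf for visitors that have no 'eventBegins' row. A minimal sketch of one way to get what the question asks for (minutes from 'eventBegins' to the first human2 event, per visitor) might look like the following. It rebuilds a three-visitor subset of the sample data by hand, and it assumes that "first human2 event" means the earliest human2 at or after 'eventBegins'. The names per_id, begin, first_human2 and mins_to_human2 are made up for illustration.

```r
library(dplyr)

# Hand-built subset of the sample data: one visitor without eventBegins, two with it.
df <- data.frame(
  visitor_id = c(175464, 892530, 892530, 892530, 892530, 892530, 892530,
                 4539320, 4539320, 4539320, 4539320, 4539320),
  event_type = c("human2", "human1", "eventBegins", "human1", "human2", "human2",
                 "eventEnds", "human1", "eventBegins", "human2", "human2", "human1"),
  datetime = as.POSIXct(
    c(1618678444, 1617980667, 1617980668, 1617980668, 1617980668, 1617980679,
      1617980679, 1619013964, 1619013964, 1619013964, 1619014004, 1619014005),
    origin = "1970-01-01", tz = "UTC")
)

per_id <- df %>%
  group_by(visitor_id) %>%
  # keep only visitors that have an eventBegins row, so min() never sees an empty vector
  filter(any(event_type == "eventBegins")) %>%
  mutate(begin = min(datetime[event_type == "eventBegins"])) %>%
  # assumption: only human2 events at or after eventBegins count
  filter(event_type == "human2", datetime >= begin) %>%
  summarise(begin = first(begin), first_human2 = min(datetime), .groups = "drop") %>%
  # difftime() with units = "mins" gives minutes regardless of the size of the gap
  mutate(mins_to_human2 = as.numeric(difftime(first_human2, begin, units = "mins")))

# One visitor only, e.g. 892530
per_id %>% filter(visitor_id == 892530)

# Average across all visitors that have both events
mean(per_id$mins_to_human2)
```

In this sample the first human2 event shares its timestamp with 'eventBegins' for both 892530 and 4539320, so both differences come out as 0 minutes. Visitors without an 'eventBegins' row simply drop out instead of producing -Inf.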