Using purrr and map to extract information from grouped data

I have a dataset and need to collect per-group information, such as the minimum time, the maximum time, and so on.

> data
# A tibble: 9 x 3
  DateTime            Location Temperature
  <dttm>              <chr>          <dbl>
1 2022-01-30 18:00:00 A               122 
2 2022-01-30 18:00:00 B               123 
3 2022-01-30 18:00:20 C               112 
4 2022-01-30 18:01:00 A               123 
5 2022-01-30 18:01:00 B               124 
6 2022-01-30 18:01:20 C               114 
7 2022-01-30 18:02:00 A               122.
8 2022-01-30 18:02:00 B               123 
9 2022-01-30 18:02:20 C               115 

I would like a summary like this:

Location   Min                       Max
A          2022-01-30 18:00:00       2022-01-30 18:02:00
B          2022-01-30 18:00:00       2022-01-30 18:02:00
C          2022-01-30 18:00:20       2022-01-30 18:02:20

I was able to split it into a list of per-group tibbles using the following:

> data_grouped <- data %>%
+   split(.$Location)
> data_grouped
$A
# A tibble: 3 x 3
  DateTime            Location Temperature
  <dttm>              <chr>          <dbl>
1 2022-01-30 18:00:00 A               122 
2 2022-01-30 18:01:00 A               123 
3 2022-01-30 18:02:00 A               122.

$B
# A tibble: 3 x 3
  DateTime            Location Temperature
  <dttm>              <chr>          <dbl>
1 2022-01-30 18:00:00 B                123
2 2022-01-30 18:01:00 B                124
3 2022-01-30 18:02:00 B                123

$C
# A tibble: 3 x 3
  DateTime            Location Temperature
  <dttm>              <chr>          <dbl>
1 2022-01-30 18:00:20 C                112
2 2022-01-30 18:01:20 C                114
3 2022-01-30 18:02:20 C                115

But I can't get any further than that. Can someone give me some advice? A working copy of the data is below.

library(tidyverse)
library(lubridate)
library(purrr)


data <- tibble(
  DateTime = ymd_hms("2022-01-30 18:00:00",
                     "2022-01-30 18:00:00",
                     "2022-01-30 18:00:20",
                     "2022-01-30 18:01:00",
                     "2022-01-30 18:01:00",
                     "2022-01-30 18:01:20",
                     "2022-01-30 18:02:00",
                     "2022-01-30 18:02:00",
                     "2022-01-30 18:02:20"),
  Location = rep(c("A","B","C"),3),
  Temperature = c(122,123,112,123,124,114,122.5,123,115)
)

Thanks!

肖恩·韦

This can be done with min/max and group_by/summarise:

library(dplyr)
data %>%
   group_by(Location) %>%
   summarise(Min = min(DateTime), Max = max(DateTime))
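
Since the question asks for the minimum, the maximum "and so on", the same grouped summary can be extended to several columns at once. The following is a minimal sketch, assuming dplyr 1.0.0 or later for across(); the object name summary_wide and the generated column names (DateTime_Min, Temperature_Max, ...) are choices made for this illustration, not part of the original answer.

```r
library(dplyr)

# Same sample data as in the question, rebuilt with base R date parsing
data <- tibble(
  DateTime = as.POSIXct(c("2022-01-30 18:00:00", "2022-01-30 18:00:00",
                          "2022-01-30 18:00:20", "2022-01-30 18:01:00",
                          "2022-01-30 18:01:00", "2022-01-30 18:01:20",
                          "2022-01-30 18:02:00", "2022-01-30 18:02:00",
                          "2022-01-30 18:02:20"), tz = "UTC"),
  Location = rep(c("A", "B", "C"), 3),
  Temperature = c(122, 123, 112, 123, 124, 114, 122.5, 123, 115)
)

# Apply min and max to both columns per Location;
# across() names the results <column>_<function name>
summary_wide <- data %>%
  group_by(Location) %>%
  summarise(across(c(DateTime, Temperature), list(Min = min, Max = max)))

summary_wide
```

This yields one row per Location with a Min and a Max column for each summarised variable, so no splitting or looping is involved.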

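Splitting into a list and then looping isn't really needed. If it is just to understand how map is used: loop over the split list with map, apply summarise to each element (returning Location as a column), and row-bind the resulting list elements with the _dfr suffix:
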
library(purrr)
map_dfr(data_grouped, ~ .x %>% 
  summarise(Location = first(Location), 
    Min = min(DateTime), Max = max(DateTime)))
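
In purrr 1.0.0 and later, map_dfr() is marked as superseded in favour of map() followed by list_rbind(). A possible variant along those lines is sketched below, assuming purrr >= 1.0.0 is installed. Here the Location column is rebuilt from the names of the split list via names_to, rather than with first(Location); the object name summary_tbl is only for illustration.

```r
library(dplyr)
library(purrr)

# Same sample data as in the question, rebuilt with base R date parsing
data <- tibble(
  DateTime = as.POSIXct(c("2022-01-30 18:00:00", "2022-01-30 18:00:00",
                          "2022-01-30 18:00:20", "2022-01-30 18:01:00",
                          "2022-01-30 18:01:00", "2022-01-30 18:01:20",
                          "2022-01-30 18:02:00", "2022-01-30 18:02:00",
                          "2022-01-30 18:02:20"), tz = "UTC"),
  Location = rep(c("A", "B", "C"), 3),
  Temperature = c(122, 123, 112, 123, 124, 114, 122.5, 123, 115)
)

# Named list with one tibble per Location (names are "A", "B", "C")
data_grouped <- split(data, data$Location)

# Summarise each list element, then row-bind the results;
# names_to turns the list names into a Location column
summary_tbl <- data_grouped %>%
  map(function(df) summarise(df, Min = min(DateTime), Max = max(DateTime))) %>%
  list_rbind(names_to = "Location")

summary_tbl
```

On older purrr versions, passing .id = "Location" to map_dfr() should give the same Location column without calling first(Location).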