How to condense a dataset with dates in R

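I'm trying to take some data and clean it up for end users to view, but I'm new to R and can't quite figure out how to do it. Also, this is my first post, so if there are any formatting or structural problems with how I've written this question, please let me know.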

What the data currently looks like:

name date reason
john 1/1/2022 late
john 1/2/2022 late
john 1/4/2022 absent
betty 1/3/2022 absent
betty 1/5/2022 no call
betty 1/7/2022 no call
kyle 1/3/2022 absent
kyle 1/5/2022 no call
kyle 1/7/2022 no call

I'd like to know whether there is a way to condense it so that, for each name, all the dates and reasons are on the same row. Like this:

name date1 reason1 date2 reason2 date3 reason3
john 1/1/2022 late 1/2/2022 late 1/4/2022 absent
betty 1/3/2022 absent 1/5/2022 no call 1/7/2022 no call
kyle 1/3/2022 absent 1/5/2022 no call 1/7/2022 no call
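
One possible way to get the date1/reason1/date2/reason2 layout above is to number each person's rows and then pivot on that number. This is only a sketch: it assumes the dplyr and tidyr packages (tidyr 1.2.0 or later for `names_vary`), it keeps the dates as plain character strings, and the helper column `n` is a name made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, with dates kept as character strings
db <- data.frame(
  name   = c("john", "john", "john", "betty", "betty", "betty"),
  date   = c("1/1/2022", "1/2/2022", "1/4/2022",
             "1/3/2022", "1/5/2022", "1/7/2022"),
  reason = c("late", "late", "absent", "absent", "no call", "no call")
)

wide <- db %>%
  group_by(name) %>%
  mutate(n = row_number()) %>%   # 1, 2, 3, ... within each name (hypothetical helper column)
  ungroup() %>%
  pivot_wider(
    names_from  = n,
    values_from = c(date, reason),
    names_sep   = "",            # gives date1, reason1 rather than date_1, reason_1
    names_vary  = "slowest"      # interleave: date1, reason1, date2, reason2, ...
  )

print(as.data.frame(wide))
```

People with fewer events than the maximum would get `NA` in the unused date/reason columns.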

Alternatively, I tried using dcast, but my code produces numbers instead of dates.

new_db <- dcast(db, name ~ reason, fun.aggregate = list, value.var = "date")

What I want:

name late absent no call
john 1/1/2022,1/2/2022 1/4/2022
betty 1/3/2022 1/5/2022,1/7/2022
kyle 1/3/2022 1/5/2022,1/7/2022

What I get:

name late absent no call
john c(1620708300,1627236300) 1639328820 numeric(0)
betty numeric(0) 1612973940 c(1611937080, 1612455480)
kyle numeric(0) 1639329540 c(1635526800, 1639760400)
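
The numbers above look like date-times stored as seconds since 1970, which suggests the date class is being dropped when the values are collected into list cells. A possible workaround, sketched here under the assumption that `dcast` comes from data.table and that `date` is a POSIXct column, is to format the dates as text first and aggregate with `paste()` instead of `list`. The column name `date_chr` is made up for this example.

```r
library(data.table)

# Sample data shaped like the question's, with real date-time values
db <- data.table(
  name   = c("john", "john", "john", "betty", "betty", "betty"),
  date   = as.POSIXct(c("2022-01-01", "2022-01-02", "2022-01-04",
                        "2022-01-03", "2022-01-05", "2022-01-07"), tz = "UTC"),
  reason = c("late", "late", "absent", "absent", "no call", "no call")
)

# Convert to text up front so the values cannot fall back to raw numbers
db[, date_chr := format(date, "%m/%d/%Y")]

# One row per name, one column per reason, dates joined with commas
new_db <- dcast(
  db, name ~ reason,
  value.var     = "date_chr",
  fun.aggregate = function(x) paste(x, collapse = ","),
  fill          = ""             # empty string where a name has no such reason
)

print(new_db)
```

Each cell is then an ordinary character string, so nothing is printed as `c(...)` or `numeric(0)`.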

Edit:

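Thanks to @Andre Wildberg, I was able to get within inches of where I need to be using as.data.frame(pivot_wider(df, names_from=reason, values_from=date, values_fn=list, values_fill=list(""))). The last step I need is to remove the c() from the cells so that those fields show clean dates.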

db<-structure(list(name = c("Debby", "Debby", "Debby", 
"Debby", "Robert", "Robert", "Robert", 
"Ryan", "Ryan", "Ryan", "Ryan", 
"Ryan", "Ryan", "Brandon", "Brandon"
), reason = c("Absent", "Leave Early", "Late", "Leave Early", 
"Leave Early", "Leave Early", "Absent", "Absent", "Absent", "Absent", 
"Absent", "Leave Early", "Late", "Leave Early", "Leave Early"
), date = c("2021-05-11 04:45:00", "2021-05-15 04:02:00", 
"2021-07-25 18:05:00", 
"2021-09-19 20:01:00", "2021-11-25 01:02:00", "2021-12-08 20:56:00", 
"2021-12-16 17:30:00", "2021-10-09 17:00:00", "2021-11-07 17:00:00", 
"2021-11-12 17:00:00", "2021-11-28 17:00:00", "2021-12-11 01:31:00", 
"2021-12-12 17:07:00", "2021-05-03 23:58:00", "2021-05-15 23:31:00"
)), row.names = c(NA, -15L), class = c("tbl_df", "tbl", "data.frame"
))

If you want to combine the observations, try this:

library(tidyr)

as.data.frame(pivot_wider(df, names_from=reason, values_from=date, 
  values_fn=list, values_fill=list("")))
     name
1   Debby
2  Robert
3    Ryan
4 Brandon
                                                                              Absent
1                                                                2021-05-11 04:45:00
2                                                                2021-12-16 17:30:00
3 2021-10-09 17:00:00, 2021-11-07 17:00:00, 2021-11-12 17:00:00, 2021-11-28 17:00:00
4                                                                                   
                               Leave Early                Late
1 2021-05-15 04:02:00, 2021-09-19 20:01:00 2021-07-25 18:05:00
2 2021-11-25 01:02:00, 2021-12-08 20:56:00                    
3                      2021-12-11 01:31:00 2021-12-12 17:07:00
4 2021-05-03 23:58:00, 2021-05-15 23:31:00 

Data

df <- structure(list(name = c("Debby", "Debby", "Debby", "Debby", "Robert", 
"Robert", "Robert", "Ryan", "Ryan", "Ryan", "Ryan", "Ryan", "Ryan", 
"Brandon", "Brandon"), reason = c("Absent", "Leave Early", "Late", 
"Leave Early", "Leave Early", "Leave Early", "Absent", "Absent", 
"Absent", "Absent", "Absent", "Leave Early", "Late", "Leave Early", 
"Leave Early"), date = c("2021-05-11 04:45:00", "2021-05-15 04:02:00", 
"2021-07-25 18:05:00", "2021-09-19 20:01:00", "2021-11-25 01:02:00", 
"2021-12-08 20:56:00", "2021-12-16 17:30:00", "2021-10-09 17:00:00", 
"2021-11-07 17:00:00", "2021-11-12 17:00:00", "2021-11-28 17:00:00", 
"2021-12-11 01:31:00", "2021-12-12 17:07:00", "2021-05-03 23:58:00", 
"2021-05-15 23:31:00")), row.names = c(NA, -15L), class = c("tbl_df", 
"tbl", "data.frame"))
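
If the goal is plain text in each cell rather than list cells that may display as c(...), one option is to collapse the values with `toString` instead of `list`. This is a sketch rather than a definitive fix: it assumes `date` is a character column, as in the data above, and uses a cut-down copy of that data. A POSIXct column would need `format()` applied first.

```r
library(tidyr)

# Cut-down copy of the data above
df <- data.frame(
  name   = c("Debby", "Debby", "Debby", "Robert"),
  reason = c("Absent", "Leave Early", "Leave Early", "Absent"),
  date   = c("2021-05-11 04:45:00", "2021-05-15 04:02:00",
             "2021-09-19 20:01:00", "2021-12-16 17:30:00")
)

wide <- as.data.frame(
  pivot_wider(
    df,
    names_from  = reason,
    values_from = date,
    values_fn   = toString,   # joins multiple dates as "a, b" in one string
    values_fill = ""          # empty string where a name has no such reason
  )
)

print(wide)
```

Because every cell is a single character string, the result can be written out with `write.csv()` or shown to end users without further unpacking.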