How to condense a dataset with dates in R

Question

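I'm trying to take some data and clean it up for end users to view, but I'm new to R and can't quite figure out how to do it. Also, this is my first post, so please let me know if there are any formatting or structural problems with how I've written this question.

The data currently looks like this: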

name	date	reason
john	1/1/2022	late
john	1/2/2022	late
john	1/4/2022	absent
betty	1/3/2022	absent
betty	1/5/2022	no call
betty	1/7/2022	no call
kyle	1/3/2022	absent
kyle	1/5/2022	no call
kyle	1/7/2022	no call

I'd like to know whether there is a way to condense it so that, for each name, the dates and reasons are all on the same row. Like this:

name	date1	reason1	date2	reason2	date3	reason3
john	1/1/2022	late	1/2/2022	late	1/4/2022	absent
betty	1/3/2022	absent	1/5/2022	no call	1/7/2022	no call
kyle	1/3/2022	absent	1/5/2022	no call	1/7/2022	no call

Alternatively, I tried using dcast, but my code produces numbers instead of dates.

new_db <- dcast(db, name ~ reason, fun.aggregate = list, value.var = "date")

What I want:

name	late	absent	no call
john	1/1/2022,1/2/2022	1/4/2022
betty		1/3/2022	1/5/2022,1/7/2022
kyle		1/3/2022	1/5/2022,1/7/2022

What I get:

name	late	absent	no call
john	c(1620708300,1627236300)	1639328820	numeric(0)
betty	numeric(0)	1612973940	c(1611937080, 1612455480)
kyle	numeric(0)	1639329540	c(1635526800, 1639760400)

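Edit:

Thanks to @Andre Wildberg, I was able to use as.data.frame(pivot_wider(df, names_from=reason, values_from=date, values_fn=list, values_fill=list(""))) to get within inches of where I need to be. The last step I need is to remove the c() from the cells so that those fields show clean dates.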

db<-structure(list(name = c("Debby", "Debby", "Debby", 
"Debby", "Robert", "Robert", "Robert", 
"Ryan", "Ryan", "Ryan", "Ryan", 
"Ryan", "Ryan", "Brandon", "Brandon"
), reason = c("Absent", "Leave Early", "Late", "Leave Early", 
"Leave Early", "Leave Early", "Absent", "Absent", "Absent", "Absent", 
"Absent", "Leave Early", "Late", "Leave Early", "Leave Early"
), date = c("2021-05-11 04:45:00", "2021-05-15 04:02:00", "2021-07-25 18:05:00", 
"2021-09-19 20:01:00", "2021-11-25 01:02:00", "2021-12-08 20:56:00", 
"2021-12-16 17:30:00", "2021-10-09 17:00:00", "2021-11-07 17:00:00", 
"2021-11-12 17:00:00", "2021-11-28 17:00:00", "2021-12-11 01:31:00", 
"2021-12-12 17:07:00", "2021-05-03 23:58:00", "2021-05-15 23:31:00"
)), row.names = c(NA, -15L), class = c("tbl_df", "tbl", "data.frame"
))

Answer 1

If you want to combine the observations, try this:

library(tidyr)

as.data.frame(pivot_wider(df, names_from=reason, values_from=date, 
  values_fn=list, values_fill=list("")))
     name
1   Debby
2  Robert
3    Ryan
4 Brandon
                                                                              Absent
1                                                                2021-05-11 04:45:00
2                                                                2021-12-16 17:30:00
3 2021-10-09 17:00:00, 2021-11-07 17:00:00, 2021-11-12 17:00:00, 2021-11-28 17:00:00
4                                                                                   
                               Leave Early                Late
1 2021-05-15 04:02:00, 2021-09-19 20:01:00 2021-07-25 18:05:00
2 2021-11-25 01:02:00, 2021-12-08 20:56:00                    
3                      2021-12-11 01:31:00 2021-12-12 17:07:00
4 2021-05-03 23:58:00, 2021-05-15 23:31:00
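
If the list columns above still show up as c(...) when the table is exported or viewed elsewhere, one possible variation is to collapse each cell into a single string while pivoting. The sketch below is only an illustration, not part of the original answer: it assumes a tidyr version that accepts a single function for values_fn (the call above already relies on that), and it uses a small hypothetical data frame named toy built from the first table in the question.

```r
library(tidyr)

# Hypothetical toy data, copied from the first table in the question
toy <- data.frame(
  name   = c("john", "john", "john", "betty", "betty", "betty"),
  date   = c("1/1/2022", "1/2/2022", "1/4/2022",
             "1/3/2022", "1/5/2022", "1/7/2022"),
  reason = c("late", "late", "absent", "absent", "no call", "no call"),
  stringsAsFactors = FALSE
)

# toString() pastes all dates of one name/reason pair into one
# comma-separated string, so each cell is plain character, not a list
wide <- pivot_wider(
  toy,
  names_from  = reason,
  values_from = date,
  values_fn   = toString,
  values_fill = ""
)

print(as.data.frame(wide))
```

With this toy input, john's late cell becomes the single string "1/1/2022, 1/2/2022" and his no call cell is an empty string. If the date column is stored as POSIXct rather than character, wrapping it in format() inside the function, for example function(x) toString(format(x, "%m/%d/%Y")), should give control over how the dates are displayed.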

Data

df <- structure(list(name = c("Debby", "Debby", "Debby", "Debby", "Robert", 
"Robert", "Robert", "Ryan", "Ryan", "Ryan", "Ryan", "Ryan", "Ryan", 
"Brandon", "Brandon"), reason = c("Absent", "Leave Early", "Late", 
"Leave Early", "Leave Early", "Leave Early", "Absent", "Absent", 
"Absent", "Absent", "Absent", "Leave Early", "Late", "Leave Early", 
"Leave Early"), date = c("2021-05-11 04:45:00", "2021-05-15 04:02:00", 
"2021-07-25 18:05:00", "2021-09-19 20:01:00", "2021-11-25 01:02:00", 
"2021-12-08 20:56:00", "2021-12-16 17:30:00", "2021-10-09 17:00:00", 
"2021-11-07 17:00:00", "2021-11-12 17:00:00", "2021-11-28 17:00:00", 
"2021-12-11 01:31:00", "2021-12-12 17:07:00", "2021-05-03 23:58:00", 
"2021-05-15 23:31:00")), row.names = c(NA, -15L), class = c("tbl_df", 
"tbl", "data.frame"))
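
For the first layout asked about in the question (name, date1, reason1, date2, reason2, ...), a rough sketch is to number the rows within each name and pivot on that counter. This is an assumption-laden illustration rather than the answer's method: the helper column idx and the data frame toy are made up for the example, and the names_vary argument needs tidyr 1.2.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, copied from the first table in the question
toy <- data.frame(
  name   = c("john", "john", "john", "betty", "betty", "betty"),
  date   = c("1/1/2022", "1/2/2022", "1/4/2022",
             "1/3/2022", "1/5/2022", "1/7/2022"),
  reason = c("late", "late", "absent", "absent", "no call", "no call"),
  stringsAsFactors = FALSE
)

wide <- toy %>%
  group_by(name) %>%
  mutate(idx = row_number()) %>%   # 1, 2, 3, ... within each name
  ungroup() %>%
  pivot_wider(
    names_from  = idx,
    values_from = c(date, reason),
    names_sep   = "",              # date1 instead of date_1
    names_vary  = "slowest"        # interleave: date1, reason1, date2, ...
  )

print(as.data.frame(wide))
```

Names with fewer incidents than the maximum get NA in the unused date/reason columns. Sorting by date before numbering the rows (for instance with arrange(name, date) on a proper date type) would be needed if the input is not already in chronological order.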
