Dates from factor to another format so I can find start and end dates per group

Question

I am new to R and cannot figure out my problem. I have already gone through countless forum posts.

My dataset looks like this (image: Glimpse of dataset).

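I want to find the first and last date for each event and put them in a nice table. There are 26 events. However, the dates are stored as a factor, which prevents me from finding the start and end dates. When I try to convert the column to numeric, I get NA for every value, and when I try to convert it to a date format, it stays a factor.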

Can someone help me?

As suggested, I tried to find a way to share my dataset using dput. I think the following should give a 2x8 sample of my dataset (2 columns, 8 rows).

df <- structure(list(
  `Release Date` = structure(
    c(1L, 3L, 4L, 5L, 6L, 7L, 8L, 9L),
    .Names = c("", "", "", "", "", "", "", ""),
    .Label = c("3/17/2020", "Release Date", "6/16/2020", "9/15/2020",
               "12/16/2020", "12/17/2015", "6/17/2013", "9/17/2012", "6/14/2012",
               "3/15/2012", "6/20/2011", "3/16/2011", "12/16/2010", "9/14/2010",
               "6/16/2010", "3/17/2010", "12/15/2009", "9/15/2009", "6/16/2009",
               "3/13/2009", "12/12/2008", "9/15/2008", "6/13/2008", "3/14/2008",
               "12/13/2007", "9/12/2007", "6/14/2007", "3/15/2007", "12/14/2006",
               "9/14/2006", "6/16/2006", "3/17/2006", "12/15/2005", "10/18/2005",
               "9/21/2005", "7/15/2005", "6/21/2005", "4/15/2005", "3/15/2005",
               "1/18/2005", "12/15/2004", "10/27/2004", "9/15/2004", "7/28/2004"),
    class = "factor"),
  Event = structure(
    c(2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
    .Names = c("", "", "", "", "", "", "", ""),
    .Label = c("Event", "Labour Costs YoY", "Unemployment Change (000's)",
               "Unemployment Rate", "Jobseekers Net Change"),
    class = "factor")),
  row.names = c("X.1", "X.11", "X.12", "X.13", "X.14", "X.15", "X.16", "X.17"),
  class = "data.frame")

Answer 1

Once the dates have been converted to Date objects, you can use min and max to get the first and last date for each event.

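library(dplyr)

df %>%
  # as.Date() accepts a factor; the format must match the month/day/year strings
  mutate(`Release Date` = as.Date(`Release Date`, format = '%m/%d/%Y')) %>%
  group_by(Event) %>%
  # earliest and latest release date within each event
  summarise(first_date = min(`Release Date`), 
            last_date = max(`Release Date`))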

#  Event                       first_date last_date 
#  <fct>                       <date>     <date>    
#1 Labour Costs YoY            2020-03-17 2020-12-16
#2 Unemployment Change (000's) 2012-06-14 2015-12-17
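
If you would rather avoid dplyr, the same idea can be sketched in base R. This is only a minimal illustration under some assumptions: the data frame `toy` and its column names `release` and `event` are hypothetical stand-ins for the question's data, and the dates are assumed to always be written as month/day/four-digit year. It also shows why the numeric conversion in the question produced NA: the text "3/17/2020" is not a number, so it has to be parsed with a date format instead.

```r
# Hypothetical toy data mirroring the shape of the question's sample
toy <- data.frame(
  release = factor(c("3/17/2020", "12/16/2020", "6/14/2012", "12/17/2015")),
  event   = factor(c("A", "A", "B", "B"))
)

# Converting the date text to a number cannot work: the result is NA
as_number <- suppressWarnings(as.numeric(as.character(toy$release)))

# Factor -> character -> Date, with a format matching month/day/year
toy$release <- as.Date(as.character(toy$release), format = "%m/%d/%Y")

# Split the dates by event; drop = TRUE skips factor levels that never occur
by_event <- split(toy$release, toy$event, drop = TRUE)

# One row per event; do.call(c, ...) keeps the Date class intact
result <- data.frame(
  event      = names(by_event),
  first_date = do.call(c, unname(lapply(by_event, min))),
  last_date  = do.call(c, unname(lapply(by_event, max)))
)

print(result)
```

The `drop = TRUE` argument is probably worth keeping for data like the sample above, where the Event factor carries levels (such as the stray header value "Event") that never appear in any row; without it, `min` and `max` would be called on empty groups.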
