Working with date data that has badly formatted dates - R

I have the following data frame containing a single column of dates.

date <- structure(list(Date = c("09/09/202109/09", "09/12/202109/12", 
"10/12/202110/12", "11/12/202111/12", "01/12/202201/12", "08/12/202108/12"
)), row.names = c(NA, 6L), class = "data.frame")

> print(date)
             Date
1 09/09/202109/09
2 09/12/202109/12
3 10/12/202110/12
4 11/12/202111/12
5 01/12/202201/12
6 08/12/202108/12

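For row 1 (09/09/202109/09), the date is 09/09/2021. My original plan was to use case_when/mutate and change each date individually, but I wanted to see whether there is a faster way.

Is it possible to strip the last 4 characters from every row in the column?

My desired output looks like this:
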
> print(date)
             Date
1 09/09/2021
2 09/12/2021
3 10/12/2021
4 11/12/2021
5 01/12/2022
6 08/12/2021

I'm assuming your dates are month/day/year. If they are day/month/year instead, just change "%m/%d/%Y" to "%d/%m/%Y".

date <- structure(list(Date = c("09/09/202109/09", "09/12/202109/12", 
"10/12/202110/12", "11/12/202111/12", "01/12/202201/12", "08/12/202108/12"
)), row.names = c(NA, 6L), class = "data.frame")

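# The repeated suffix (e.g. "09/09") is 5 characters long, not 4,
# so drop the last 5 characters to keep exactly "mm/dd/yyyy".
clean_date_strings <- substr(date$Date, 1, nchar(date$Date) - 5)

# Parse the cleaned strings into Date objects.
as.Date(clean_date_strings, format = "%m/%d/%Y")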

Output:

r$> as.Date(clean_date_strings, format = "%m/%d/%Y")
[1] "2021-09-09" "2021-09-12" "2021-10-12" "2021-11-12" "2022-01-12" "2021-08-12"
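
To get the data frame itself into the shape shown in the question, the cleaned strings can be written back into the column. The following is only a minimal sketch, assuming every value begins with a 10-character mm/dd/yyyy date; the `Parsed` column is a hypothetical name added for illustration and is not part of the original question.

```r
# Same sample data as in the question
date <- data.frame(
  Date = c("09/09/202109/09", "09/12/202109/12", "10/12/202110/12",
           "11/12/202111/12", "01/12/202201/12", "08/12/202108/12"),
  stringsAsFactors = FALSE
)

# Assumption: each value starts with a 10-character "mm/dd/yyyy" date,
# so keeping the first 10 characters drops whatever was appended after it.
date$Date <- substr(date$Date, 1, 10)

# Optional: a real Date column for sorting or date arithmetic
# (Parsed is a hypothetical column name).
date$Parsed <- as.Date(date$Date, format = "%m/%d/%Y")

print(date)
```

Keeping a fixed number of leading characters, rather than removing a fixed number of trailing ones, should still work if the appended suffix varies in length, provided the leading date is always zero-padded.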