Convert character column to date with mixed numbers and date in R

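I have to combine a lot of analysts' spreadsheets from Excel, and it seems that every week they find a new way to infuriate me. Here is my latest problem: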

data <- tibble(date_column = c(44673, 44674, "2022-04-25"))

# A tibble: 3 x 1
  date_column
  <chr>      
1 44673      
2 44674      
3 2022-04-25 

I have tried several ways of testing whether a value can be numeric, converting it to a date one way if it can and another way if it cannot, like this:

library(tidyverse)
library(lubridate)

data %>% 
  mutate(date_column = case_when(
    !is.na(as.numeric(date_column)) ~ as.Date(date_column, origin = "1899-12-30"),
    is.na(as.numeric(date_column)) ~ parse_date_time(date_column, orders = c("mdY","Ymd T", "Ymd"))))

# which throws an error:

Error: Problem with `mutate()` column `date_column`.
i `date_column = case_when(...)`.
x character string is not in a standard unambiguous format
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Problem with `mutate()` column `date_column`.
i `date_column = case_when(...)`.
i NAs introduced by coercion 

I feel like this must be a common problem that has been solved before. Any help is much appreciated.

A possible solution, based on openxlsx::convertToDate:

library(tidyverse)
library(openxlsx)

data <- tibble(date_column = c(44673, 44674, "2022-04-25"))

coalesce(suppressWarnings(convertToDate(data$date_column)) %>% 
   as.character(), data$date_column)

#> [1] "2022-04-22" "2022-04-23" "2022-04-25"
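
If you would rather avoid the extra dependency and get a Date column back instead of character, the same idea can be sketched in base R. This is only a sketch under some assumptions: parse_mixed_date is a hypothetical helper name, the serials are assumed to come from Excel's default 1900 date system (origin "1899-12-30"), and every non-numeric entry is assumed to be in ISO "YYYY-MM-DD" format. Each branch is applied only to the elements it matches, so date strings are never passed to the numeric conversion.

```r
# Hypothetical helper (not from any package): parse a character vector that
# mixes Excel serial numbers and ISO date strings into a Date vector.
parse_mixed_date <- function(x) {
  x <- as.character(x)
  # Entries made up only of digits are treated as Excel serial day counts
  is_serial <- !is.na(x) & grepl("^[0-9]+$", x)
  out <- rep(as.Date(NA), length(x))
  # Assumption: Excel's 1900 date system, which counts days from 1899-12-30
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
  # Assumption: everything else is "YYYY-MM-DD"; other formats become NA
  out[!is_serial] <- as.Date(x[!is_serial], format = "%Y-%m-%d")
  out
}

parse_mixed_date(c("44673", "44674", "2022-04-25"))
#> [1] "2022-04-22" "2022-04-23" "2022-04-25"
```

Inside a pipeline this would be used as data %>% mutate(date_column = parse_mixed_date(date_column)). If the spreadsheets contain other date formats, the second branch would need to be extended.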

Try janitor's convert_to_date:

library(tidyverse)
library(janitor)
#> 
#> Attaching package: 'janitor'
#> The following objects are masked from 'package:stats':
#> 
#>     chisq.test, fisher.test

tribble(~date,
        "44673",
        "44674",
        "2022-04-25") |> 
  mutate(clean_date = convert_to_date(date))
#> # A tibble: 3 × 2
#>   date       clean_date
#>   <chr>      <date>    
#> 1 44673      2022-04-22
#> 2 44674      2022-04-23
#> 3 2022-04-25 2022-04-25

Created on 2022-05-05 by the reprex package (v2.0.1)