anytime producing NA dates only for dates before the 10th
I am trying to read dates from a CSV file. Sample:
Date User 1 User 2
8/1/2019 IN IN
8/2/2019 IN Out
8/3/2019 IN IN
8/4/2019 IN IN
8/5/2019 IN IN
8/6/2019 IN IN
8/7/2019 IN IN
8/8/2019 IN IN
8/9/2019 IN IN
8/10/2019 IN IN
8/11/2019 IN IN
I thought I had found a good way to read these dates correctly:
Vacation <- read.csv("Vacation.csv", stringsAsFactors = FALSE)
Vacation$Date <- anydate(Vacation$Date)
However, for some reason, once I convert to dates, only the days before the 10th of each month come out as NA.
[1] NA NA NA NA NA NA
[7] NA NA NA "2019-08-10" "2019-08-11" "2019-08-12"
[13] "2019-08-13" "2019-08-14" "2019-08-15" "2019-08-16" "2019-08-17" "2019-08-18"
[19] "2019-08-19" "2019-08-20" "2019-08-21" "2019-08-22" "2019-08-23" "2019-08-24"
[25] "2019-08-25" "2019-08-26" "2019-08-27" "2019-08-28" "2019-08-29" "2019-08-30"
[31] "2019-08-31" NA NA NA NA NA
[37] NA NA NA NA "2019-09-10" "2019-09-11"
[43] "2019-09-12" "2019-09-13" "2019-09-14" "2019-09-15" "2019-09-16" "2019-09-17"
[49] "2019-09-18" "2019-09-19" "2019-09-20" "2019-09-21" "2019-09-22" "2019-09-23"
[55] "2019-09-24" "2019-09-25" "2019-09-26" "2019-09-27" "2019-09-28" "2019-09-29"
In base R:
as.Date(strptime(d$Date, "%m/%d/%Y"))
or
lubridate::mdy(d$Date)
#[1] "2019-08-01" "2019-08-02" "2019-08-03" "2019-08-04" "2019-08-05" "2019-08-06" "2019-08-07"
#[8] "2019-08-08" "2019-08-09" "2019-08-10" "2019-08-11"
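For anyone who wants to try this without the CSV file, here is a minimal self-contained sketch. The data frame `d` is rebuilt by hand from a few of the sample rows in the question (an assumption for illustration, not the original file), and the explicit format string is passed directly to `as.Date()`:

```r
# Sample dates as they appear in the question (month/day/year, no zero padding).
# This small data frame is a stand-in for the real CSV.
d <- data.frame(
  Date = c("8/1/2019", "8/9/2019", "8/10/2019", "8/11/2019"),
  stringsAsFactors = FALSE
)

# An explicit format avoids any guessing about the layout of the string.
parsed <- as.Date(d$Date, format = "%m/%d/%Y")
print(parsed)

# Fail loudly if any value was silently turned into NA.
stopifnot(!anyNA(parsed))
```

The `stopifnot()` check is optional, but it catches exactly the kind of silent NA seen in the question.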
It is also worth noting that read_csv() from the tidyverse is, in my experience, faster than read.csv() and copes with more non-ideal date formats.
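library(tidyverse)
library(lubridate)  # provides mdy(); older tidyverse versions do not attach it
# read_csv() has no stringsAsFactors argument; strings are read as character
Vacation <- read_csv("Vacation.csv") %>%
  mutate(Date = mdy(Date))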
This will be fixed in the next release: I (belatedly) realized that the Boost date_time library has a different format argument, which I am now adding.
Example:
R> library(anytime) # unreleased version, on CRAN in a few weeks
R> inp <- gsub("-0", "-", format(anydate(20190801) + 0:12))
R> inp ## note the single digits
[1] "2019-8-1" "2019-8-2" "2019-8-3" "2019-8-4" "2019-8-5" "2019-8-6"
[7] "2019-8-7" "2019-8-8" "2019-8-9" "2019-8-10" "2019-8-11" "2019-8-12"
[13] "2019-8-13"
R>
R> anytime(inp)
[1] "2019-08-01 CDT" "2019-08-02 CDT" "2019-08-03 CDT" "2019-08-04 CDT"
[5] "2019-08-05 CDT" "2019-08-06 CDT" "2019-08-07 CDT" "2019-08-08 CDT"
[9] "2019-08-09 CDT" "2019-08-10 CDT" "2019-08-11 CDT" "2019-08-12 CDT"
[13] "2019-08-13 CDT"
R>
R> anydate(inp)
[1] "2019-08-01" "2019-08-02" "2019-08-03" "2019-08-04" "2019-08-05" "2019-08-06"
[7] "2019-08-07" "2019-08-08" "2019-08-09" "2019-08-10" "2019-08-11" "2019-08-12"
[13] "2019-08-13"
R>
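Until that release is available, one possible stopgap is to zero-pad the month and day before parsing. The output in the question suggests that only single-digit days fail. The helper `pad_mdy()` below is hypothetical (it is not part of anytime), and whether `anydate()` accepts the padded strings in your installed version is an assumption you should check. The sketch itself uses only base R:

```r
# Hypothetical helper: zero-pad "m/d/Y" strings to "mm/dd/YYYY".
pad_mdy <- function(x) {
  # Month: "8/1/2019" -> "08/1/2019"
  x <- sub("^([0-9])/", "0\\1/", x)
  # Day: "08/1/2019" -> "08/01/2019"
  sub("^([0-9]{2})/([0-9])/", "\\1/0\\2/", x)
}

raw <- c("8/1/2019", "8/10/2019", "12/3/2019")
padded <- pad_mdy(raw)
print(padded)

# The padded strings parse in base R with an explicit format.
# Passing `padded` to anytime::anydate() instead is the untested assumption.
print(as.Date(padded, format = "%m/%d/%Y"))
```

If you do not otherwise need anytime, the explicit-format `as.Date()` call works on the unpadded strings as well, so the padding step only matters when you want to keep using `anydate()`.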