`as.POSIXct` gives an error with the `"%Y-%m-%d %H:%M:%S"` format

dates <- seq(1626629937,1626629944)

# CORRECT

## #1
as.POSIXct(dates,                    tz="Asia/Shanghai",origin="1970-01-01")
#> [1] "2021-07-19 01:38:57 CST" "2021-07-19 01:38:58 CST" "2021-07-19 01:38:59 CST" "2021-07-19 01:39:00 CST"
#> [5] "2021-07-19 01:39:01 CST" "2021-07-19 01:39:02 CST" "2021-07-19 01:39:03 CST" "2021-07-19 01:39:04 CST"

## #2
as.POSIXct(dates,                    tz="Asia/Shanghai",origin="1970-01-01",optional = FALSE)
#> [1] "2021-07-19 01:38:57 CST" "2021-07-19 01:38:58 CST" "2021-07-19 01:38:59 CST" "2021-07-19 01:39:00 CST"
#> [5] "2021-07-19 01:39:01 CST" "2021-07-19 01:39:02 CST" "2021-07-19 01:39:03 CST" "2021-07-19 01:39:04 CST"


# DIFFERENT RESULT

## #3
as.POSIXct(dates,"%Y-%m-%d %H:%M:%S"                   ,origin="1970-01-01")
#> [1] "2021-07-18 17:38:57" "2021-07-18 17:38:58" "2021-07-18 17:38:59" "2021-07-18 17:39:00" "2021-07-18 17:39:01"
#> [6] "2021-07-18 17:39:02" "2021-07-18 17:39:03" "2021-07-18 17:39:04"


# NAs

## #4
as.POSIXct(dates,"%Y-%m-%d %H:%M:%S",tz="Asia/Shanghai",origin="1970-01-01")
#> [1] NA NA NA NA NA NA NA NA

## #5
as.POSIXct(dates,"%Y-%m-%d %H:%M:%S",tz="Asia/Shanghai",origin="1970-01-01",optional = FALSE)
#> [1] NA NA NA NA NA NA NA NA


# ERROR

## #6
as.POSIXct(dates,"%Y-%m-%d %H:%M:%S"                                       ,optional = FALSE)
#>  Error in as.POSIXct.numeric(as.integer(.), "%Y-%m-%d %H:%M:%S", optional = FALSE) : 
#>   'origin' must be supplied 

As the output of the R script above shows, combining the format "%Y-%m-%d %H:%M:%S" with the tz, origin, and optional arguments produces NA.

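Where does the problem come from?

The easy ones first:

  • optional = FALSE is the default, so #1 == #2 and #4 == #5.
  • #6 needs no explanation: as the error states, you need to supply the origin = argument.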
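  • #3 returns a different result because of the time zone (the tz= argument). Here the format string is the second positional argument and so ends up as tz; since it is not a valid time zone name, the times are displayed in UTC, 8 hours earlier than in Asia/Shanghai.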
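As a small illustration of the time zone point, the sketch below converts the same numbers once with tz = "UTC" and once with tz = "Asia/Shanghai". The variable names utc and sh are made up for this example; the stored instants are the same and only the displayed clock time differs.

```r
dates <- seq(1626629937, 1626629944)

# Same seconds since the epoch, two different display time zones
utc <- as.POSIXct(dates, tz = "UTC",           origin = "1970-01-01")
sh  <- as.POSIXct(dates, tz = "Asia/Shanghai", origin = "1970-01-01")

# The underlying numeric values are identical
all(as.numeric(utc) == as.numeric(sh))
#> [1] TRUE

# Only the printed clock time differs, by 8 hours
format(utc[1], "%Y-%m-%d %H:%M:%S")
#> [1] "2021-07-18 17:38:57"
format(sh[1], "%Y-%m-%d %H:%M:%S")
#> [1] "2021-07-19 01:38:57"
```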

Now, the real problem is #4 and #5 (which are the same, as noted above):

as.POSIXct(dates,"%Y-%m-%d %H:%M:%S",tz="Asia/Shanghai",origin="1970-01-01")
#> [1] NA NA NA NA NA NA NA NA

To understand how this works, look at the function as.POSIXct: when it is called with a numeric x (as in this case), it dispatches to the method as.POSIXct.numeric.

as.POSIXct.numeric

#> function (x, tz = "", origin, ...) 
#> {
#>     if (missing(origin)) {
#>         if (!length(x)) 
#>             return(.POSIXct(numeric(), tz))
#>         if (!any(is.finite(x))) 
#>             return(.POSIXct(x, tz))
#>         stop("'origin' must be supplied")
#>     }
#>     .POSIXct(as.POSIXct(origin, tz = "GMT", ...) + x, tz)
#> }
#> <bytecode: 0x55df7f23b390>
#> <environment: namespace:base>

Focus on this line:

#> .POSIXct(as.POSIXct(origin, tz = "GMT", ...) + x, tz)

In particular:

as.POSIXct(origin, tz = "GMT", ...) + x

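As you can see, the function converts origin to a date-time and then adds your numeric input to it. Every additional argument you supply ends up in the ... argument.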
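The "convert origin, then add x" step can be reproduced by hand. This is only a sketch of the idea, not the actual base R source; origin_time, manual, and builtin are names invented for the example, and the time zone is set through the tzone attribute rather than the internal helper.

```r
dates <- seq(1626629937, 1626629944)

# Step 1: turn the origin string into a date-time in GMT
origin_time <- as.POSIXct("1970-01-01", tz = "GMT")

# Step 2: add the numeric input (seconds) to the origin
manual <- origin_time + dates

# Step 3: attach the display time zone
attr(manual, "tzone") <- "Asia/Shanghai"

# Compare with the one-step call from #1
builtin <- as.POSIXct(dates, tz = "Asia/Shanghai", origin = "1970-01-01")
all(as.numeric(manual) == as.numeric(builtin))
#> [1] TRUE
```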

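The function tries to convert 1970-01-01 to a date-time using the format you supplied: %Y-%m-%d %H:%M:%S. Because the origin 1970-01-01 is in the format %Y-%m-%d, the function cannot convert origin from a string to POSIX, so it returns NA. (This is where the NAs are generated!)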
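The failing conversion can be seen in isolation by parsing the origin string directly. In this sketch the format is passed by name (format =) for clarity, which is assumed to have the same effect as the unnamed argument in the original call; fmt, bad, and good are illustrative names.

```r
fmt <- "%Y-%m-%d %H:%M:%S"

# The origin has no time part, so it does not match the format
bad <- as.POSIXct("1970-01-01", tz = "GMT", format = fmt)
is.na(bad)
#> [1] TRUE

# With a time part the same format parses fine
good <- as.POSIXct("1970-01-01 00:00:00", tz = "GMT", format = fmt)
is.na(good)
#> [1] FALSE

# NA plus any number is still NA, which is what #4 and #5 print
is.na(bad + 1626629937)
#> [1] TRUE
```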

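When you convert a number to POSIX, the format you pass as an argument applies neither to the output (which is always POSIX) nor to the input, but to origin. So origin and format need to match.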
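If the goal was to control how the result is displayed, one option is to convert first and then call format() on the POSIXct value. This is a suggestion rather than something the original call was doing; x is an illustrative name.

```r
# Convert the number without any format argument
x <- as.POSIXct(1626629937, tz = "Asia/Shanghai", origin = "1970-01-01")

# Then choose the display format separately; the result is a character string
format(x, "%Y/%m/%d %H:%M")
#> [1] "2021/07/19 01:38"
format(x, "%H:%M:%S")
#> [1] "01:38:57"
```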

To solve your problem, origin needs to be in the format %Y-%m-%d %H:%M:%S, like this:

as.POSIXct(dates,"%Y-%m-%d %H:%M:%S",tz="Asia/Shanghai",origin="1970-01-01 00:00:00")
#> [1] "2021-07-19 01:38:57 CST" "2021-07-19 01:38:58 CST" "2021-07-19 01:38:59 CST" "2021-07-19 01:39:00 CST"
#> [5] "2021-07-19 01:39:01 CST" "2021-07-19 01:39:02 CST" "2021-07-19 01:39:03 CST" "2021-07-19 01:39:04 CST"

Or you need to use the format %Y-%m-%d, like this:

as.POSIXct(dates,"%Y-%m-%d",tz="Asia/Shanghai",origin="1970-01-01")
#> [1] "2021-07-19 01:38:57 CST" "2021-07-19 01:38:58 CST" "2021-07-19 01:38:59 CST" "2021-07-19 01:39:00 CST"
#> [5] "2021-07-19 01:39:01 CST" "2021-07-19 01:39:02 CST" "2021-07-19 01:39:03 CST" "2021-07-19 01:39:04 CST"

The result is the same as #1 and #2.
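A quick way to check that claim is to compare the variants programmatically. The sketch below passes format by name, which is assumed to behave like the unnamed argument used above; ref, fix_a, and fix_b are names chosen for the example.

```r
dates <- seq(1626629937, 1626629944)

# Reference: call #1, with no format at all
ref <- as.POSIXct(dates, tz = "Asia/Shanghai", origin = "1970-01-01")

# Fix A: origin written to match the long format
fix_a <- as.POSIXct(dates, format = "%Y-%m-%d %H:%M:%S",
                    tz = "Asia/Shanghai", origin = "1970-01-01 00:00:00")

# Fix B: format shortened to match the origin
fix_b <- as.POSIXct(dates, format = "%Y-%m-%d",
                    tz = "Asia/Shanghai", origin = "1970-01-01")

all(ref == fix_a)
#> [1] TRUE
all(ref == fix_b)
#> [1] TRUE
```

Since all three agree, leaving the format out entirely, as in #1, is the simplest choice when origin is the plain date 1970-01-01.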