POSIXct 和时区转换

POSIXct and timezone conversion

再一次,我为 R 中的日期和时间而苦恼。 我有这样的数据集:

> dput(df)
structure(list(event.time.utc = structure(c(1471407324, 1489129025, 
1480714809, 1471111613, 1472965336, 1484421419, 1475607466, 1475476528, 
1473041225, 1487378311), class = c("POSIXct", "POSIXt"), tzone = ""), 
    time.verify = c("15:15:24", "16:57:05", "07:40:09", "05:06:53", 
    "16:02:16", "05:16:59", "05:57:46", "17:35:28", "13:07:05", 
    "10:38:31")), class = "data.frame", .Names = c("event.time.utc", 
"time.verify"), row.names = c(NA, -10L))

> str(df)
'data.frame':   10 obs. of  2 variables:
 $ event.time.utc: POSIXct, format: "2016-08-16 22:15:24" "2017-03-09 23:57:05" "2016-12-02 14:40:09" "2016-08-13 12:06:53" ...
 $ time.verify   : chr  "15:15:24" "16:57:05" "07:40:09" "05:06:53" ...

我想将 event.time.utc 转换为我当地的时区:America/Edmonton (UTC-07:00)。变量time.verify只是为了验证结果。首先,我确认我的时区存在:

> tzfile <- "F:/PortableApps/R/R-3.4.0/share/zoneinfo/zone.tab"
> tzones <- read.delim(tzfile, row.names = NULL, header = FALSE,
+                      col.names = c("country", "coords", "name", "comments"),
+                      as.is = TRUE, fill = TRUE, comment.char = "#")

> tzones %>% filter(str_detect(name, "Edmonton"))
  country      coords             name                      comments
1      CA +5333-11328 America/Edmonton Mountain - AB; BC (E); SK (W)

我还验证了我的默认时区:

> Sys.timezone()
[1] "America/Edmonton"

对我的 OS:

做同样的事情
C:\Users\tspeidel>tzutil /g
Mountain Standard Time

好的,到目前为止,还不错。现在我想将 event.time.utc 转换为我的时区:

> df$new.time <- as.POSIXct(df$event.time.utc, tz = "America/Edmonton")
> head(df)
       event.time.utc time.verify            new.time
1 2016-08-16 22:15:24    15:15:24 2016-08-16 22:15:24
2 2017-03-09 23:57:05    16:57:05 2017-03-09 23:57:05
3 2016-12-02 14:40:09    07:40:09 2016-12-02 14:40:09
4 2016-08-13 12:06:53    05:06:53 2016-08-13 12:06:53
5 2016-09-03 23:02:16    16:02:16 2016-09-03 23:02:16
6 2017-01-14 12:16:59    05:16:59 2017-01-14 12:16:59

这并没有产生我所期望的结果。我做错了什么?

更新 1 只是根据其中一条评论中链接的 post 对此进行跟进:

> df$time.edmonton <- as.POSIXct(as.integer(df$event.time.utc), origin="1970-01-01", tz="America/Edmonton")
> 
> df$time.la <- as.POSIXct(as.integer(df$event.time.utc), origin="1970-01-01", tz="America/Los_Angeles")
> 
> df
        event.time.utc time.verify       time.edmonton             time.la
1  2016-08-16 22:15:24    15:15:24 2016-08-16 22:15:24 2016-08-16 21:15:24
2  2017-03-09 23:57:05    16:57:05 2017-03-09 23:57:05 2017-03-09 22:57:05
3  2016-12-02 14:40:09    07:40:09 2016-12-02 14:40:09 2016-12-02 13:40:09
4  2016-08-13 12:06:53    05:06:53 2016-08-13 12:06:53 2016-08-13 11:06:53
5  2016-09-03 23:02:16    16:02:16 2016-09-03 23:02:16 2016-09-03 22:02:16
6  2017-01-14 12:16:59    05:16:59 2017-01-14 12:16:59 2017-01-14 11:16:59
7  2016-10-04 12:57:46    05:57:46 2016-10-04 12:57:46 2016-10-04 11:57:46
8  2016-10-03 00:35:28    17:35:28 2016-10-03 00:35:28 2016-10-02 23:35:28
9  2016-09-04 20:07:05    13:07:05 2016-09-04 20:07:05 2016-09-04 19:07:05
10 2017-02-17 17:38:31    10:38:31 2017-02-17 17:38:31 2017-02-17 16:38:31

更新 2 通过在日期时间向量上强制设置时区,我似乎更接近了:

df$event.time.utc <- force_tz(df$event.time.utc, tzone = "UTC")
df$time.edmonton <- as.POSIXct(as.integer(df$event.time.utc), origin="1970-01-01", tz="America/Edmonton")
df
        event.time.utc time.verify       time.edmonton
1  2016-08-16 22:15:24    15:15:24 2016-08-16 16:15:24
2  2017-03-09 23:57:05    16:57:05 2017-03-09 16:57:05
3  2016-12-02 14:40:09    07:40:09 2016-12-02 07:40:09
4  2016-08-13 12:06:53    05:06:53 2016-08-13 06:06:53
5  2016-09-03 23:02:16    16:02:16 2016-09-03 17:02:16
6  2017-01-14 12:16:59    05:16:59 2017-01-14 05:16:59
7  2016-10-04 12:57:46    05:57:46 2016-10-04 06:57:46
8  2016-10-03 00:35:28    17:35:28 2016-10-02 18:35:28
9  2016-09-04 20:07:05    13:07:05 2016-09-04 14:07:05
10 2017-02-17 17:38:31    10:38:31 2017-02-17 10:38:31

更新 3 非常接近,除了 DST 问题:

> df$time.edmonton <- format(df$event.time.utc, tz="America/Edmonton",usetz=TRUE)
> df
        event.time.utc time.verify           time.edmonton
1  2016-08-16 22:15:24    15:15:24 2016-08-16 16:15:24 MDT
2  2017-03-09 23:57:05    16:57:05 2017-03-09 16:57:05 MST
3  2016-12-02 14:40:09    07:40:09 2016-12-02 07:40:09 MST
4  2016-08-13 12:06:53    05:06:53 2016-08-13 06:06:53 MDT
5  2016-09-03 23:02:16    16:02:16 2016-09-03 17:02:16 MDT
6  2017-01-14 12:16:59    05:16:59 2017-01-14 05:16:59 MST
7  2016-10-04 12:57:46    05:57:46 2016-10-04 06:57:46 MDT
8  2016-10-03 00:35:28    17:35:28 2016-10-02 18:35:28 MDT
9  2016-09-04 20:07:05    13:07:05 2016-09-04 14:07:05 MDT
10 2017-02-17 17:38:31    10:38:31 2017-02-17 10:38:31 MST

您可以看出数据结构中的问题。您需要定义时区,否则它将是您的系统时区。所以这就是你需要先在 df 上做的;

#This is how I originally made the tzone to be UTC
df$event.time.utc <- as.POSIXlt(df$event.time.utc)
attr(df$event.time.utc, "tzone") <- "UTC"
df$event.time.utc <- as.POSIXct(df$event.time.utc)

#Your solution using lubridate package and force_tz function also works
require(lubridate)
df$event.time.utc <- force_tz(df$event.time.utc, tzone = "UTC")

如果您在数据结构中定义会改变其值的时区;但使用这种方法,您只需更改时区而不是时间本身。

关于 time.verifytime.edmonton 之间几行的差异,您需要查看加拿大和英国之间的夏令时差异。

这个 post 可以对此有所帮助。