Time zone from a POSIXct date sequence lost when coercing to data.frame in R
I want to keep the CET/CEST part of the date sequence generated below.
seq(as.POSIXct("2018-10-01"), as.POSIXct("2018-10-02"), "hour")
myvector <- seq(as.POSIXct("2018-10-01"), as.POSIXct("2018-10-02"), "hour")
myvector
mydf <- as.data.frame(myvector)
In the console this looks like:
> head(seq(...))
[1] "2018-10-01 00:00:00 CEST" "2018-10-01 01:00:00 CEST" "2018-10-01 02:00:00 CEST" "2018-10-01 03:00:00 CEST" "2018-10-01 04:00:00 CEST" "2018-10-01 05:00:00 CEST"
> head(myvector)
[1] "2018-10-01 00:00:00 CEST" "2018-10-01 01:00:00 CEST" "2018-10-01 02:00:00 CEST" "2018-10-01 03:00:00 CEST" "2018-10-01 04:00:00 CEST" "2018-10-01 05:00:00 CEST"
> head(mydf)
myvector
1 2018-10-01 00:00:00
2 2018-10-01 01:00:00
3 2018-10-01 02:00:00
4 2018-10-01 03:00:00
5 2018-10-01 04:00:00
6 2018-10-01 05:00:00
>
When I coerce it to a data.frame, that part is lost. I don't know how to keep it; I tried something like:
attr(mydf$myvector, "tzone") <- attr(myvector, "tzone")
but tzone isn't really an attribute, so it doesn't work.
What is the CEST/CET in a POSIXct? How can I keep it when coercing to a data.frame?
Thanks
You need to apply as.POSIXlt to the POSIXct column before you can get the time zone from it:
#Extract timezone from POSIXct column of a dataframe
mydf$timezone <- attr(as.POSIXlt(mydf$myvector), "tzone")[1]
head(mydf)
# myvector timezone
#1 2018-10-01 00:00:00 Europe/Berlin
#2 2018-10-01 01:00:00 Europe/Berlin
#3 2018-10-01 02:00:00 Europe/Berlin
#4 2018-10-01 03:00:00 Europe/Berlin
#5 2018-10-01 04:00:00 Europe/Berlin
#6 2018-10-01 05:00:00 Europe/Berlin
Sample data:
myvector <- seq(as.POSIXct("2018-10-01"), as.POSIXct("2018-10-02"), "hour")
head(myvector)
#[1] "2018-10-01 00:00:00 CEST" "2018-10-01 01:00:00 CEST" "2018-10-01 02:00:00 CEST"
#[4] "2018-10-01 03:00:00 CEST" "2018-10-01 04:00:00 CEST" "2018-10-01 05:00:00 CEST"
mydf <- as.data.frame(myvector)
head(mydf$myvector)
#[1] "2018-10-01 00:00:00 CEST" "2018-10-01 01:00:00 CEST" "2018-10-01 02:00:00 CEST"
#[4] "2018-10-01 03:00:00 CEST" "2018-10-01 04:00:00 CEST" "2018-10-01 05:00:00 CEST"
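As the sample data suggests, the column inside the data.frame is still a POSIXct vector; it is only the data.frame print method that leaves the zone out. The sketch below illustrates this under one assumption that is not in the original post: the zone is passed explicitly as tz = "Europe/Berlin" rather than taken from the session, so that the tzone attribute is non-empty and the result does not depend on the machine it runs on.

```r
# Assumption: an explicit zone is used instead of the session default.
myvector <- seq(as.POSIXct("2018-10-01", tz = "Europe/Berlin"),
                as.POSIXct("2018-10-02", tz = "Europe/Berlin"),
                by = "hour")
mydf <- data.frame(myvector = myvector)

# The column keeps its class and its tzone attribute after coercion.
class(mydf$myvector)
attr(mydf$myvector, "tzone")

# Printing the data.frame omits the zone; formatting the column shows it.
head(format(mydf$myvector, usetz = TRUE), 3)
```

If the sequence is built without tz, the attribute is an empty string, which means "use the current session zone"; that is probably why copying it over did not appear to change anything.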
Alternative approach: if you only care about showing CET or CEST in the output:
mydf$timezone <- gsub("^.*\\s", "", format(mydf$myvector, usetz = TRUE))
head(mydf)
# myvector timezone
#1 2018-10-01 00:00:00 CEST
#2 2018-10-01 01:00:00 CEST
#3 2018-10-01 02:00:00 CEST
#4 2018-10-01 03:00:00 CEST
#5 2018-10-01 04:00:00 CEST
#6 2018-10-01 05:00:00 CEST
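A possibly simpler variant of the same idea is to ask format() for the abbreviation directly with the %Z conversion, which avoids the regular expression. This is only a sketch: the column names time, abbr and offset are made up for illustration, the zone is set explicitly, and the start time is chosen to straddle the end of daylight saving time in 2018 so that both CEST and CET appear.

```r
# Assumption: explicit zone; hourly steps across the DST change on 2018-10-28.
x <- seq(as.POSIXct("2018-10-27 22:00", tz = "Europe/Berlin"),
         by = "hour", length.out = 6)
df <- data.frame(time = x)

# %Z gives the zone abbreviation, %z the signed offset from UTC.
df$abbr <- format(df$time, "%Z")
df$offset <- format(df$time, "%z")
df
```

Because CET/CEST is derived per timestamp from the zone rules, the abbreviation switches from CEST to CET within the same column once the clocks go back.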