Convert a string in a Spark data frame to a timestamp
I am trying to convert a string column in a Spark data frame to a timestamp. I tried the following, without success. Any help resolving this is appreciated.
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
time_stamp <- c("2017-12-06T20:08:56.000", "2017-11-08T12:09:37.000")
time_tbl <- copy_to(sc, tibble(timestamp=time_stamp))
# time_tbl %>% mutate(times = strptime(timestamp, "%Y-%m-%dT%H:%M:%S"))
# 'strptime'. This function is neither a registered temporary function nor a
# permanent function registered in the database
# time_tbl %>% mutate(times = lubridate::ymd_hms(timestamp))
time_tbl %>%
  mutate(times = unix_timestamp(timestamp))
time_tbl %>%
  mutate(times = unix_timestamp(timestamp, "yyyy-MM-dd%THH:mm:ss"))
time_tbl %>%
  mutate(times = to_timestamp(timestamp, "yyyy-MM-dd%THH:mm:ss%.000"))
# # Source: spark<?> [?? x 2]
# timestamp times
# <chr> <dbl>
# 1 2017-12-06T20:08:56.000 NaN
# 2 2017-11-08T12:09:37.000 NaN
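Following the link mentioned by @MichaelChirico, we can use unix_timestamp, to_timestamp, and to_utc_timestamp to convert the strings to date-time values. The key change is the pattern: Spark expects Java-style date patterns, so the literal T is quoted as 'T' and the milliseconds are written as .SSS, rather than using strptime-style % codes.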
time_stamp <- c("2017-12-06T20:08:56.000", "2017-11-08T12:09:37.000",
"2017-12-06T20:08:56.123", "2017-11-08T12:09:37.456")
time_tbl <- copy_to(sc, tibble(timestamp=time_stamp))
# > time_tbl
# # Source: spark<?> [?? x 1]
# timestamp
# <chr>
# 1 2017-12-06T20:08:56.000
# 2 2017-11-08T12:09:37.000
# 3 2017-12-06T20:08:56.123
# 4 2017-11-08T12:09:37.456
time_tbl %>%
  mutate(ctimes = unix_timestamp(timestamp, "yyyy-MM-dd'T'HH:mm:ss.SSS"))
# # Source: spark<?> [?? x 2]
# timestamp ctimes
# <chr> <dbl>
# 1 2017-12-06T20:08:56.000 1512554936
# 2 2017-11-08T12:09:37.000 1510106977
# 3 2017-12-06T20:08:56.123 1512554936
# 4 2017-11-08T12:09:37.456 1510106977
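The ctimes values above are epoch seconds, a plain number, which is why the column type is <dbl> and the milliseconds are gone. As a minimal sketch in base R (no Spark connection needed, and the variable names are only illustrative), the first value can be turned back into a date-time like this:

```r
# Epoch seconds taken from the first row of the output above
epoch_secs <- 1512554936

# Interpret the number as seconds since 1970-01-01 00:00:00 UTC
utc_time <- as.POSIXct(epoch_secs, origin = "1970-01-01", tz = "UTC")

format(utc_time, "%Y-%m-%d %H:%M:%S")
# "2017-12-06 10:08:56"
```

The result is ten hours behind the input string, which presumably means the strings were parsed in a session time zone ten hours ahead of UTC.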
time_tbl %>%
  mutate(ctimes = to_timestamp(timestamp, "yyyy-MM-dd'T'HH:mm:ss.SSS"))
# # Source: spark<?> [?? x 2]
# timestamp ctimes
# <chr> <dttm>
# 1 2017-12-06T20:08:56.000 2017-12-06 10:08:56
# 2 2017-11-08T12:09:37.000 2017-11-08 02:09:37
# 3 2017-12-06T20:08:56.123 2017-12-06 10:08:56
# 4 2017-11-08T12:09:37.456 2017-11-08 02:09:37
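# Caution: the second argument of Spark's to_utc_timestamp is a time zone ID,
# not a format pattern.
time_tbl %>%
  mutate(ctimes = to_utc_timestamp(timestamp, "yyyy-MM-dd'T'HH:mm:ss.SSS"))
# # Source: spark<?> [?? x 2]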
# timestamp ctimes
# <chr> <dttm>
# 1 2017-12-06T20:08:56.000 2017-12-06 10:08:56
# 2 2017-11-08T12:09:37.000 2017-11-08 02:09:37
# 3 2017-12-06T20:08:56.123 2017-12-06 10:08:56
# 4 2017-11-08T12:09:37.456 2017-11-08 02:09:37
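Spark's to_utc_timestamp takes a time zone ID as its second argument rather than a format pattern. The following is a hedged sketch of how the two steps might be separated: parse with to_timestamp first, then shift to UTC. It assumes a local Spark installation that provides to_timestamp with a format argument; the table name "time_tbl", the column names local_ts and utc_ts, and the zone "America/New_York" are placeholders chosen for illustration, not values from the original post.

```r
library(sparklyr)
library(dplyr)

# Assumes Spark is installed locally (e.g. via spark_install())
sc <- spark_connect(master = "local")

time_tbl <- copy_to(
  sc,
  tibble(timestamp = c("2017-12-06T20:08:56.000", "2017-11-08T12:09:37.456")),
  name = "time_tbl",
  overwrite = TRUE
)

time_tbl %>%
  # Step 1: parse the string using a Java-style pattern
  mutate(local_ts = to_timestamp(timestamp, "yyyy-MM-dd'T'HH:mm:ss.SSS")) %>%
  # Step 2: treat local_ts as wall-clock time in the given zone, convert to UTC
  # ("America/New_York" is a hypothetical source time zone)
  mutate(utc_ts = to_utc_timestamp(local_ts, "America/New_York"))

spark_disconnect(sc)
```

Both functions are passed through by dplyr to Spark SQL untranslated, so their behaviour depends on the Spark version in use.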