date_format doesn't handle timestamps with `00:00:00`
It formats a value of type timestamp,
2020-01-27 00:00:00,
as 2020-01-27 12:00:00
instead of 2020-01-27 00:00:00.
import spark.sqlContext.implicits._
import java.sql.Timestamp
import org.apache.spark.sql.functions.typedLit
scala> val stamp = typedLit(new Timestamp(1580105949000L))
stamp: org.apache.spark.sql.Column = TIMESTAMP('2020-01-27 00:19:09.0')
scala> var df_test = Seq(5).toDF("seq").select(
| stamp.as("unixtime"),
| date_trunc("HOUR", stamp).as("date_trunc"),
| date_format(date_trunc("HOUR", stamp), "yyyy-MM-dd hh:mm:ss").as("hour")
| )
df_test: org.apache.spark.sql.DataFrame = [unixtime: timestamp, date_trunc: timestamp ... 1 more field]
scala> df_test.show
+-------------------+-------------------+-------------------+
|           unixtime|         date_trunc|               hour|
+-------------------+-------------------+-------------------+
|2020-01-27 00:19:09|2020-01-27 00:00:00|2020-01-27 12:00:00|
+-------------------+-------------------+-------------------+
Your pattern should be yyyy-MM-dd HH:mm:ss.
date_format, according to its documentation, uses the specifiers supported by java.text.SimpleDateFormat:
Converts a date/timestamp/string to a value of string in the format specified by the date format given by the second argument.
See SimpleDateFormat for valid date and time format patterns.
The SimpleDateFormat Javadoc lists the supported pattern letters.
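hh is "Hour in am/pm (1-12)". On a 12-hour clock midnight is 12 AM, which is why 00:00:00 is printed as 12:00:00. You are looking for the hour-in-day specifier, HH (0-23).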
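The difference between the two specifiers can be reproduced without Spark. The following is a minimal sketch that calls java.text.SimpleDateFormat directly; the object name `HourPatternDemo` and its fields are made up for illustration, and it assumes the JVM's default time zone has no clock change at midnight on that date.

```scala
import java.text.SimpleDateFormat
import java.util.{Calendar, GregorianCalendar, Locale}

// Hypothetical demo object, not part of Spark or of the question's code.
object HourPatternDemo {
  // Midnight on 2020-01-27 in the JVM's default time zone.
  val midnight: java.util.Date =
    new GregorianCalendar(2020, Calendar.JANUARY, 27, 0, 0, 0).getTime

  // hh = hour in am/pm (1-12): midnight is 12 AM, so the hour prints as 12.
  val twelveHour: String =
    new SimpleDateFormat("yyyy-MM-dd hh:mm:ss", Locale.ROOT).format(midnight)

  // HH = hour in day (0-23): midnight prints as 00.
  val twentyFourHour: String =
    new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT).format(midnight)

  def main(args: Array[String]): Unit = {
    println(twelveHour)     // 2020-01-27 12:00:00
    println(twentyFourHour) // 2020-01-27 00:00:00
  }
}
```

Applied to the question's query, the only change needed is the pattern string: date_format(date_trunc("HOUR", stamp), "yyyy-MM-dd HH:mm:ss").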