Spark DataFrame: convert a column's data type from String to Date
I have the following data and schema:
scala> df2.printSchema()
root
|-- RowID: integer (nullable = true)
|-- Order Date: string (nullable = true)
scala> df2.show(5)
+-----+----------+
|RowID|Order Date|
+-----+----------+
|    1|   4/10/15|
|   49|   4/10/15|
|   50|   4/10/15|
|   80|   4/10/15|
|   85|   4/10/15|
+-----+----------+
I want to convert the "Order Date" string column to the Date data type. I tried the following without success. Can anyone suggest a better approach?
scala> df2.select(df2.col("RowID"), df2.col("Order Date"), date_format(df2.col("Order Date"), "M/dd/yy")).show(5)
+-----+----------+-------------------------------+
|RowID|Order Date|date_format(Order Date,M/dd/yy)|
+-----+----------+-------------------------------+
|    1|   4/10/15|                           null|
|   49|   4/10/15|                           null|
|   50|   4/10/15|                           null|
|   80|   4/10/15|                           null|
|   85|   4/10/15|                           null|
+-----+----------+-------------------------------+
I did manage to convert the column to a Unix epoch timestamp, and I think it should be simple from here:
scala> df.select(df.col("RowID"), df.col("Order Date"), unix_timestamp(df.col("Order Date"), "M/d/yy")).show(5)
+-----+----------+--------------------------------+
|RowID|Order Date|unixtimestamp(Order Date,M/d/yy)|
+-----+----------+--------------------------------+
|    1|   4/10/15|                      1428604200|
|   49|   4/10/15|                      1428604200|
|   50|   4/10/15|                      1428604200|
|   80|   4/10/15|                      1428604200|
|   85|   4/10/15|                      1428604200|
+-----+----------+--------------------------------+
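One possible way to finish the conversion, offered as a sketch rather than a definitive answer. `date_format` turns a date or timestamp into a string, so it is likely returning null here because "4/10/15" cannot be implicitly cast to a date first. The example below parses the string with the same `M/d/yy` pattern that worked for `unix_timestamp`. Assumptions: the small `df2` built here is only a stand-in for the real data, the names `parsed` and `viaEpoch` are made up for illustration, and the two-argument `to_date` requires Spark 2.2 or later.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date, unix_timestamp}

// In spark-shell, getOrCreate() returns the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("order-date-to-date")
  .getOrCreate()

// Stand-in for df2 (assumption: same two columns as in the question).
val df2 = spark
  .createDataFrame(Seq((1, "4/10/15"), (49, "4/10/15"), (50, "4/10/15")))
  .toDF("RowID", "Order Date")

// Option 1 (assumes Spark 2.2+): parse the string directly with a pattern.
val parsed = df2.withColumn("Order Date", to_date(col("Order Date"), "M/d/yy"))

// Option 2 (for older releases): epoch seconds -> timestamp -> date.
val viaEpoch = df2.withColumn(
  "Order Date",
  unix_timestamp(col("Order Date"), "M/d/yy").cast("timestamp").cast("date"))

parsed.printSchema() // "Order Date" should now be reported as date
parsed.show(5)
```

Option 2 builds on the `unix_timestamp` call that already works above, so it may be the safer choice on a release that predates the pattern-taking `to_date`.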