How to convert an integer into a date using Apache Spark / Python on Databricks
This feels like it should be a simple problem. I'm trying to convert an integer column holding epoch time into a date formatted as MM/DD/YYYY.
For example, convert 881250949 --> 12/04/1997.
Any suggestions?
Using the from_unixtime and date_format functions, we can get the required result:
SPARK_SCALA
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._
spark.sparkContext.setLogLevel("ERROR")

// Sample dataframe with one epoch-seconds value
val df = Seq(881250949).toDF("col")

// from_unixtime converts epoch seconds to a timestamp string,
// and date_format renders it as MM/dd/yyyy
df.withColumn("col", date_format(from_unixtime('col), "MM/dd/yyyy"))
  .show(false)
+----------+
|col |
+----------+
|12/04/1997|
+----------+
PYSPARK
from pyspark.sql import SparkSession
from pyspark.sql.functions import date_format, from_unixtime

spark = SparkSession.builder.master("local").getOrCreate()

# Sample dataframe with one epoch-seconds value
df = spark.createDataFrame([(1, 881250949)], "id int, date int")

# from_unixtime converts epoch seconds to a timestamp string,
# and date_format renders it as MM/dd/yyyy
df.withColumn("date", date_format(from_unixtime("date"), "MM/dd/yyyy")) \
  .show()
+---+----------+
| id|      date|
+---+----------+
|  1|12/04/1997|
+---+----------+
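Note that date_format returns a string column, not a date. If an actual DateType column is preferred (for example, for date arithmetic or range filters), a minimal sketch of one alternative, assuming the integer holds epoch seconds, is to cast the column to a timestamp and then apply to_date:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = SparkSession.builder.master("local").getOrCreate()

df = spark.createDataFrame([(1, 881250949)], "id int, date int")

# Casting an integer column to "timestamp" interprets the value as epoch seconds;
# to_date then truncates that timestamp down to a DateType value
df = df.withColumn("date", to_date(col("date").cast("timestamp")))
df.printSchema()  # date: date (nullable = true)
df.show()         # |  1|1997-12-04|

The resulting column displays in Spark's default yyyy-MM-dd format; date_format remains the right tool when the MM/dd/yyyy string rendering itself is the goal.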