Map hours to day intervals - Spark Scala

I have a DataFrame with a string column of hours:

+-------+
|DepTime|
+-------+
|  13:43|
|  11:25|
|  20:09|
|  09:03|
|  14:23|
|  20:24|
|  17:53|
|  06:22|
|  19:44|
|  14:53|
+-------+

I want to transform this column according to the following intervals:

From 06:00 to 11:59 -> Morning
From 12:00 to 17:00 -> Afternoon
From 17:01 to 20:00 -> Evening 
From 20:01 to 05:59 -> Night

Expected output:

+------------+
|DepTime     |
+------------+
|   Afternoon|
|     Morning|
|       Night|
|     Morning|
|   Afternoon|
|       Night|
|     Evening|
|     Morning|
|     Evening|
|   Afternoon|
+------------+

I have done a similar string conversion using functions such as rlike and lit:

df = df.withColumn("DayOfWeek",
                when(col("DayOfWeek").rlike("1"),lit("Monday"))
                .when(col("DayOfWeek").rlike("2"),lit("Tuesday"))
                .when(col("DayOfWeek").rlike("3"),lit("Wednesday"))
                .when(col("DayOfWeek").rlike("4"),lit("Thursday"))
                .when(col("DayOfWeek").rlike("5"),lit("Friday"))
                .when(col("DayOfWeek").rlike("6"),lit("Saturday"))
                .when(col("DayOfWeek").rlike("7"),lit("Sunday"))
                )

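For this case, I am thinking of using if/when conditions (possibly with the < and > operators) together with otherwise, but I don't know how to form the groups (ranges), because the hours have a special order: the Night interval wraps around midnight.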

Any help is appreciated. Thanks in advance.

Try this:

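import org.apache.spark.sql.functions.{col, date_format, when}

data
  // Normalize to zero-padded "HH:mm" so that plain string comparison orders the times correctly
  .withColumn("Time", date_format(col("DepTime"), "HH:mm"))
  .withColumn("PeriodOfTime",
    when(col("Time") >= "06:00" && col("Time") < "12:00", "Morning")       // 06:00 to 11:59
      .when(col("Time") >= "12:00" && col("Time") <= "17:00", "Afternoon") // 12:00 to 17:00
      .when(col("Time") > "17:00" && col("Time") <= "20:00", "Evening")    // 17:01 to 20:00
      .otherwise("Night"))                                                  // 20:01 to 05:59, wraps past midnight
  .drop("Time")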

Output (tested):

+-------+------------+
|DepTime|PeriodOfTime|
+-------+------------+
|  13:43|   Afternoon|
|  11:25|     Morning|
|  20:09|       Night|
|  09:03|     Morning|
|  14:23|   Afternoon|
|  20:24|       Night|
|  17:53|     Evening|
|  06:22|     Morning|
|  19:44|     Evening|
|  14:53|   Afternoon|
+-------+------------+