Increment date by one in such a way that it lands on the next working/week day

I am working with Spark DataFrames. I have a use case where I need to increment a date by one day. If the incremented date falls on a weekend, I need to move it forward to the next week/working day.

import org.apache.spark.sql.functions._

val df = Seq(
  ("50312", "2021-12-01", "0.9992019"),
  ("50312", "2021-12-02", "0.20171201"),
  ("50312", "2021-12-03", "2.9992019")
).toDF("id", "some_date", "item_value")
  .withColumn("nextworking_day", date_add(col("some_date"), 1))

The next working day should never be a weekend date. How can I do this?

Writing a UDF that inspects the date should solve the problem. Below is sample code in PySpark that runs; it does not handle holidays, but you can build a list or enum and add conditions for your region, as sketched after the code block.

import pyspark.sql.functions as f
from pyspark.sql.types import TimestampType
from datetime import datetime, timedelta


@f.udf(returnType=TimestampType())
def get_convert_date_udf(date_column):
    # Parse the incoming date string and move it forward by one day.
    datetime_object = datetime.strptime(date_column, "%Y-%m-%d")
    new_datetime_object = datetime_object + timedelta(days=1)
    # If the incremented date lands on a weekend, push it to Monday.
    day = new_datetime_object.strftime("%A")
    if day == "Sunday":
        new_datetime_object += timedelta(days=1)
    elif day == "Saturday":
        new_datetime_object += timedelta(days=2)
    return new_datetime_object


df = df.withColumn("next_working_date",
                   get_convert_date_udf(f.col("some_date")))
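To also skip holidays, a minimal sketch along the same lines (the holidays set below is a hypothetical placeholder; fill it with your region's calendar) could keep advancing the date until it is neither a weekend day nor a listed holiday:

import pyspark.sql.functions as f
from pyspark.sql.types import DateType
from datetime import datetime, timedelta

# Hypothetical holiday list -- replace with the dates for your region.
holidays = {"2021-12-25", "2022-01-01"}

@f.udf(returnType=DateType())
def next_working_day_udf(date_column):
    # Start from the day after the input date.
    d = datetime.strptime(date_column, "%Y-%m-%d").date() + timedelta(days=1)
    # weekday() is 5 for Saturday and 6 for Sunday; keep moving forward
    # while the candidate is a weekend day or a listed holiday.
    while d.weekday() >= 5 or d.strftime("%Y-%m-%d") in holidays:
        d += timedelta(days=1)
    return d

df = df.withColumn("next_working_date", next_working_day_udf(f.col("some_date")))

Returning DateType here (rather than TimestampType) keeps the column as a plain date, matching the string format of some_date.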

You can use dayofweek to get the day of the week, then add 2 if it is a Saturday or 3 if it is a Friday, instead of the usual 1.

import org.apache.spark.sql.functions._

// dayofweek: 1 = Sunday ... 6 = Friday, 7 = Saturday
val day = dayofweek(col("some_date"))
// Add 1 day normally, 3 days on a Friday (6) and 2 days on a Saturday (7).
val nextworkday = col("some_date") + when(day > 5, -day + 9).otherwise(1)

val df = Seq(
  ("50312", "2021-12-01", "0.9992019"),
  ("50312", "2021-12-02", "0.20171201"),
  ("50312", "2021-12-03", "2.9992019")
).toDF("id", "some_date", "item_value")
  .withColumn("some_date", col("some_date").cast("date"))
  .withColumn("nextworking_day", nextworkday)

df.show()
+-----+----------+----------+---------------+
|   id| some_date|item_value|nextworking_day|
+-----+----------+----------+---------------+
|50312|2021-12-01| 0.9992019|     2021-12-02|
|50312|2021-12-02|0.20171201|     2021-12-03|
|50312|2021-12-03| 2.9992019|     2021-12-06|
+-----+----------+----------+---------------+