How to move data to the next column in Scala
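I'm having trouble updating a table. I receive a new date and a new state, and I also want to update the history of dates and states, dropping the last pair (date2, state2) if it is filled. I need some kind of loop or function to do this properly instead of hard-coding every column.
My main idea failed because I had misunderstood the concept. I tried to build an ordered list of dates so I could place each one in the right column, but I got stuck when I realised the states have to move too.
I have run out of ideas and still haven't produced any meaningful code. I can't even get a basic version working. Any help would be appreciated.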
Input
val historic = Seq(("Alice", "2022-01-02", "2", "2021-04-06", "3", "2020-01-01", "1")).toDF("name", "currentDate", "currentState", "date1", "state1", "date2", "state2")
historic.show()
+-----+-----------+------------+----------+------+----------+------+
| name|currentDate|currentState|     date1|state1|     date2|state2|
+-----+-----------+------------+----------+------+----------+------+
|Alice| 2022-01-02|           2|2021-04-06|     3|2020-01-01|     1|
+-----+-----------+------------+----------+------+----------+------+
val newData = Seq(("Alice", "2022-02-02", "s1")).toDF("name", "date", "state")
newData.show()
+-----+----------+-----+
| name|      date|state|
+-----+----------+-----+
|Alice|2022-02-02|   s1|
+-----+----------+-----+
Desired output
val expected = Seq(("Alice", "2022-02-02", "s1", "2022-01-02", "2", "2021-04-06", "3")).toDF("name", "currentDate", "currentState", "date1", "state1", "date2", "state2")
expected.show()
+-----+-----------+------------+----------+------+----------+------+
| name|currentDate|currentState|     date1|state1|     date2|state2|
+-----+-----------+------------+----------+------+----------+------+
|Alice| 2022-02-02|          s1|2022-01-02|     2|2021-04-06|     3|
+-----+-----------+------------+----------+------+----------+------+
Thanks
This should do what you need:
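// The 'column symbol syntax and toDF both need the session implicits in scope.
import spark.implicits._

val expected = historic.join(newData, Seq("name")).select(
  // Park the old values under temporary names so they do not clash with the new ones.
  'name,
  'currentDate as "oldDate",
  'currentState as "oldState",
  'date as "newDate",
  'state,
  'date1 as "oldD1",
  'state1 as "oldS1"
).select(
  // Shift every pair one slot to the right; the old date2/state2 are not selected, so they are dropped.
  'name,
  'newDate as "currentDate",
  'state as "currentState",
  'oldDate as "date1",
  'oldState as "state1",
  'oldD1 as "date2",
  'oldS1 as "state2"
)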
It joins the new data to the old data on the name column (assuming it is unique), then renames the columns to give the desired structure.
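Since the question asks for a way to avoid hard-coding each column, the same join-and-rename idea can be sketched with a generated select list. This is only a sketch under assumptions: it assumes the history columns follow the date1/state1 ... dateN/stateN naming pattern from the question, and `ShiftHistory`, `shiftHistory` and the `depth` parameter are hypothetical names introduced here for illustration.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ShiftHistory {
  // Hypothetical helper: moves every (date, state) pair one slot back and
  // drops the oldest pair. Assumes columns named date1..dateN / state1..stateN,
  // where N = depth.
  def shiftHistory(historic: DataFrame, newData: DataFrame, depth: Int): DataFrame = {
    // Alias both sides so columns can be referenced unambiguously after the join.
    val h = historic.alias("h")
    val n = newData.alias("n")

    // The incoming row becomes the new "current" pair.
    val current: Seq[Column] = Seq(
      col("h.name").as("name"),
      col("n.date").as("currentDate"),
      col("n.state").as("currentState")
    )

    // Slot i is filled from slot i - 1; slot 1 is filled from the old current pair.
    val shifted: Seq[Column] = (1 to depth).flatMap { i =>
      val (srcDate, srcState) =
        if (i == 1) ("currentDate", "currentState")
        else (s"date${i - 1}", s"state${i - 1}")
      Seq(col(s"h.$srcDate").as(s"date$i"), col(s"h.$srcState").as(s"state$i"))
    }

    // Inner join on name, as in the answer above.
    h.join(n, col("h.name") === col("n.name")).select((current ++ shifted): _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("shift-history").getOrCreate()
    import spark.implicits._

    val historic = Seq(("Alice", "2022-01-02", "2", "2021-04-06", "3", "2020-01-01", "1"))
      .toDF("name", "currentDate", "currentState", "date1", "state1", "date2", "state2")
    val newData = Seq(("Alice", "2022-02-02", "s1")).toDF("name", "date", "state")

    shiftHistory(historic, newData, depth = 2).show()
    spark.stop()
  }
}
```

With `depth = 2` this should yield the row from the desired output. Because the join is an inner join, names in `historic` that have no row in `newData` are dropped; if those rows need to be kept unchanged, a different join type and some null handling would be needed.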