How to expand a list column in Scala into multiple rows

I would like to take the following DataFrame:

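import org.apache.spark.sql.types._
// createDF is not part of core Spark; it is assumed here to come from the spark-daria library
import com.github.mrpowers.spark.daria.sql.SparkSessionExt._

val articledDF = spark.createDF(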
  List(
    ("article 1", Array("topic 1", "topic 2")),
    ("article 2", Array("topic 1", "topic 3")),
    ("article 3", Array("topic 2"))
  ), List(
    ("article", StringType, true),
    ("topics", ArrayType(StringType, true), true)
  )
)

which results in:

+---------+---------------------+
|article  |topics               |
+---------+---------------------+
|article 1|   [topic 1, topic 2]|
|article 2|   [topic 1, topic 3]|
|article 3|            [topic 2]|
+---------+---------------------+

and expand the topics column so that each topic gets its own row:

+---------+-----------+
|article  |topic      |
+---------+-----------+
|article 1|   topic 1 |
|article 1|   topic 2 |
|article 2|   topic 1 |
|article 2|   topic 3 |
|article 3|   topic 2 |
+---------+-----------+

I would appreciate any pointers on how to do this.

Use explode, which emits one output row per element of the array column:

import org.apache.spark.sql.functions._
import spark.implicits._

articledDF.select($"article", explode($"topics") as "topic")
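
For anyone who wants to try this without the createDF helper, here is a minimal self-contained sketch. It assumes a local SparkSession and builds the same data with the standard toDF method; the object name ExplodeTopics and the variable names are made up for illustration. It also shows explode_outer (available since Spark 2.2), which may be useful if some articles have no topics, because plain explode drops rows whose array is null or empty.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{explode, explode_outer}

// Hypothetical standalone example; in spark-shell a `spark` session already exists.
object ExplodeTopics {
  def main(args: Array[String]): Unit = {
    // Local session, for illustration only
    val spark = SparkSession.builder()
      .appName("explode-topics")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Same data as in the question, built with the standard toDF helper
    val articlesDF = Seq(
      ("article 1", Seq("topic 1", "topic 2")),
      ("article 2", Seq("topic 1", "topic 3")),
      ("article 3", Seq("topic 2"))
    ).toDF("article", "topics")

    // One output row per array element; rows with a null or empty array are dropped
    val exploded = articlesDF.select($"article", explode($"topics").as("topic"))
    exploded.show(false)

    // Same idea, but rows with a null or empty array are kept with a null topic
    val explodedOuter = articlesDF.select($"article", explode_outer($"topics").as("topic"))
    explodedOuter.show(false)

    spark.stop()
  }
}
```

With the sample data both queries should return the same five rows, since every article has at least one topic.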