Filter array of struct fields in case class

I have a dataset with the data structure shown below.

case class AddressData(
  addressId: String,
  customerId: String,
  address: String,
  number: Option[Int],
  road: Option[String],
  city: Option[String],
  country: Option[String]
)

case class CustomerDocument(
  customerId: String,
  forename: String,
  surname: String,
  address: Seq[AddressData]
)

Schema:

root
 |-- customerId: string (nullable = true)
 |-- forename: string (nullable = true)
 |-- surname: string (nullable = true)
 |-- accounts: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- customerId: string (nullable = true)
 |    |    |-- accountId: string (nullable = true)
 |    |    |-- balance: long (nullable = true)
 |-- address: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- addressId: string (nullable = true)
 |    |    |-- customerId: string (nullable = true)
 |    |    |-- address: string (nullable = true)
 |    |    |-- number: integer (nullable = true)
 |    |    |-- road: string (nullable = true)
 |    |    |-- city: string (nullable = true)
 |    |    |-- country: string (nullable = true)

Sample data:

customerId forename surname address
IND0222 Charles Piper [[ADR285,IND0222,424, Lexington Avenue, New York, United States of America]]

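I need to check the list of addresses for a given country (e.g. Canada) and create a new column whose value is 'True' if that country is present in any of the addresses, or 'False' if it is not.

I am not sure how to apply a filter condition inside an array of structs to achieve this. Any guidance would be appreciated. Thanks.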

The code below helped me extract the country field from the array of structs and check it.

val countryFlag = df.withColumn("isPresent", array_contains($"address.country", "Canada"))
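
For anyone who wants to try this end to end, here is a minimal sketch rather than a definitive answer. It assumes Spark 3.0 or later (for the `exists` higher-order function), a local `SparkSession`, and a made-up second customer (`CAN0001`) that is not part of the data above. The object name `CountryFlagSketch` and the helper `withCountryFlag` are hypothetical names chosen for illustration. Both `array_contains` and `exists` can return null instead of false when the array is null or holds a null country, so the sketch wraps them in `coalesce(..., lit(false))` to get a strict true/false column.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array_contains, coalesce, col, exists, lit}

case class AddressData(
  addressId: String,
  customerId: String,
  address: String,
  number: Option[Int],
  road: Option[String],
  city: Option[String],
  country: Option[String]
)

case class CustomerDocument(
  customerId: String,
  forename: String,
  surname: String,
  address: Seq[AddressData]
)

object CountryFlagSketch {

  // Hypothetical helper: adds two flag columns computed in two equivalent ways.
  def withCountryFlag(df: DataFrame, country: String): DataFrame =
    df
      // address.country projects the struct array to an array of country strings.
      .withColumn(
        "isPresent",
        coalesce(array_contains(col("address.country"), country), lit(false)))
      // exists (Spark 3.0+) takes an arbitrary predicate over each struct element.
      .withColumn(
        "isPresentViaExists",
        coalesce(
          exists(col("address"), a => a.getField("country") === country),
          lit(false)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("country-flag-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative rows only: the first loosely follows the sample row,
    // the second is invented so that one row matches "Canada".
    val df = Seq(
      CustomerDocument("IND0222", "Charles", "Piper", Seq(
        AddressData("ADR285", "IND0222", "424, Lexington Avenue, New York",
          Some(424), Some("Lexington Avenue"), Some("New York"),
          Some("United States of America")))),
      CustomerDocument("CAN0001", "Jane", "Doe", Seq(
        AddressData("ADR001", "CAN0001", "1, Main Street, Toronto",
          Some(1), Some("Main Street"), Some("Toronto"), Some("Canada")),
        AddressData("ADR002", "CAN0001", "unknown", None, None, None, None)))
    ).toDF()

    withCountryFlag(df, "Canada")
      .select("customerId", "isPresent", "isPresentViaExists")
      .show(truncate = false)

    spark.stop()
  }
}
```

The `array_contains` form is enough for a plain equality check. The `exists` form is worth considering when the condition involves more than one field of the struct, for example country and city together.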