Modifying an element in a nested array of structs

I have a nested array of structs, and I want to rename one of the struct fields as shown in the example below.

Source format:

 |-- HelloWorld: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- version: string (nullable = true)
 |    |    |-- abc-version: string (nullable = true) -----> This part needs to be renamed
 |    |    |-- again_something: array (nullable = true)
 |    |    |    |-- element: map (containsNull = true)
 |    |    |    |    |-- key: string
 |    |    |    |    |-- value: string (valueContainsNull = true)

The output format should look like this:

 |-- HelloWorld: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- version: string (nullable = true)
 |    |    |-- abc_version: string (nullable = true) -----> This part has changed
 |    |    |-- again_something: array (nullable = true)
 |    |    |    |-- element: map (containsNull = true)
 |    |    |    |    |-- key: string
 |    |    |    |    |-- value: string (valueContainsNull = true)

I tried different approaches with withField and F.expr to rename the field, but none of them worked.

Please help.

I would rebuild each struct inside transform, aliasing the field to its new name and casting it to the same dtype (string) while doing so. Note that the struct must list every field you want to keep, in the order you want them:

 df3 = df.withColumn("HelloWorld", F.expr("transform(HelloWorld, x -> struct(x.version as version, cast(x['abc-version'] as string) as abc_version, x.again_something as again_something))"))


root
 |-- HelloWorld: array (nullable = true)
 |    |-- element: struct (containsNull = false)
 |    |    |-- version: string (nullable = true)
 |    |    |-- abc_version: string (nullable = true)
 |    |    |-- again_something: array (nullable = true)
 |    |    |    |-- element: map (containsNull = true)
 |    |    |    |    |-- key: string
 |    |    |    |    |-- value: string (valueContainsNull = true)
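
If the struct has many fields, typing the expression by hand gets error-prone, so it may be easier to generate it. Below is a minimal sketch of that idea in plain Python; `build_rename_expr` is a hypothetical helper name of my own, not part of PySpark, and it assumes you simply want every field copied in order with some of them renamed. Field names are wrapped in backticks so that names containing a hyphen still parse in Spark SQL.

```python
def build_rename_expr(array_col, fields, renames, var="x"):
    """Build a Spark SQL transform() expression string that rebuilds each
    struct in `array_col`, keeping `fields` in order and renaming the ones
    listed in `renames` (old name -> new name)."""
    parts = []
    for name in fields:
        new_name = renames.get(name, name)
        # Backticks allow names with hyphens or other special characters.
        parts.append(f"{var}.`{name}` as `{new_name}`")
    return f"transform({array_col}, {var} -> struct({', '.join(parts)}))"


# Field list taken from the schema in the question.
fields = ["version", "abc-version", "again_something"]
expr = build_rename_expr("HelloWorld", fields, {"abc-version": "abc_version"})
print(expr)

# Intended use (needs a Spark session and the DataFrame `df` from the question):
#   from pyspark.sql import functions as F
#   df3 = df.withColumn("HelloWorld", F.expr(expr))
# The field list could also be read from the schema instead of hard-coded:
#   fields = df.schema["HelloWorld"].dataType.elementType.names
```

This only builds the string; the rename itself still happens in `F.expr`, exactly as in the one-liner above, and no cast is needed because each field keeps its original type.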