Modifying an element in a nested array of structs
I have a nested array of structs and I want to rename one of the fields, as shown in the example below.
Source format
|-- HelloWorld: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- version: string (nullable = true)
| | |-- abc-version: string (nullable = true) ----->This part needs to be renamed
| | |-- again_something: array (nullable = true)
| | | |-- element: map (containsNull = true)
| | | | |-- key: string
| | | | |-- value: string (valueContainsNull = true)
The output format should look like this.
|-- HelloWorld: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- version: string (nullable = true)
| | |-- abc_version: string (nullable = true) ----->This part has changed
| | |-- again_something: array (nullable = true)
| | | |-- element: map (containsNull = true)
| | | | |-- key: string
| | | | |-- value: string (valueContainsNull = true)
I tried different approaches with withField and F.expr to rename the field, but none of them worked.
Please help.
I would rebuild the struct and cast the field to the same dtype while renaming it:
from pyspark.sql import functions as F
df3 = df.withColumn("HelloWorld", F.expr("transform(HelloWorld, x -> struct(cast(x['abc-version'] as string) as abc_version, x.version, x.again_something))"))
root
|-- HelloWorld: array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- abc_version: string (nullable = true)
| | |-- version: string (nullable = true)
| | |-- again_something: array (nullable = true)
| | | |-- element: map (containsNull = true)
| | | | |-- key: string
| | | | |-- value: string (valueContainsNull = true)
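If the struct has many fields, typing each one into the expression by hand gets error-prone. A minimal sketch of generating the same kind of transform expression from a list of field names follows. The helper name build_rename_expr and its parameters are my own invention for illustration, not part of PySpark, and the field list is assumed to match the source schema in the question.

```python
def build_rename_expr(array_col, fields, renames):
    """Build a Spark SQL `transform` expression string that rebuilds each
    struct in `array_col`, renaming the fields listed in `renames`.

    array_col : name of the array<struct> column (hypothetical parameter)
    fields    : struct field names, in the desired output order
    renames   : mapping from old field name to new field name
    """
    parts = []
    for name in fields:
        new_name = renames.get(name, name)
        # x['name'] is used instead of x.name so that names containing
        # characters such as '-' are still accepted by the SQL parser.
        parts.append(f"x['{name}'] as `{new_name}`")
    return f"transform({array_col}, x -> struct({', '.join(parts)}))"


# Field names taken from the source schema in the question.
expr = build_rename_expr(
    "HelloWorld",
    ["version", "abc-version", "again_something"],
    {"abc-version": "abc_version"},
)
print(expr)
```

The resulting string can then be passed to F.expr, for example df.withColumn("HelloWorld", F.expr(expr)). Because no cast is applied, each field should keep its original dtype, and the fields come out in the order given in the list.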