How to switch names between columns in Delta Table - Databricks?

What is the most efficient way to swap the names of two columns in Delta Lake? Suppose I have the following columns:

Address |   Name

I want to swap the names so that I have:

Name    |   Address

First, I renamed both columns to temporary names:

spark.read.table("table") \
  .withColumnRenamed("address", "name1") \
  .withColumnRenamed("name", "address1") \
  .write \
  .format("delta") \
  .mode("overwrite") \
  .option("overwriteSchema", "true") \
  .saveAsTable("table")

Then I renamed the temporary columns to their final names:

spark.read.table("table") \
  .withColumnRenamed("name1", "name") \
  .withColumnRenamed("address1", "address") \
  .write \
  .format("delta") \
  .mode("overwrite") \
  .option("overwriteSchema", "true") \
  .saveAsTable("table")

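What about simply using the toDF function on the DataFrame to set the new names in place of the existing ones? toDF assigns names by position, so list every column in its existing order: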

spark.read.table("table") \
  .toDF("name", "address") \
  .write....

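If you have more columns, you can adapt this slightly by using a mapping between the existing and new names to generate the correct list of columns: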

mapping = {"address":"name", "name":"address"}
df = spark.read.table("table")
new_cols = [mapping.get(cl, cl) for cl in df.columns]
df.toDF(*new_cols).write....
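
The mapping step above can be checked without a Spark session. The following is a minimal sketch in plain Python, not part of the original answer: the helper name `swapped_column_names` and the third column `city` are hypothetical, and the helper assumes names are matched exactly (case-sensitively), as `mapping.get` does. It also fails early if a mapped column is missing or if the result would contain duplicate names.

```python
def swapped_column_names(columns, mapping):
    """Return the new column names, in order, after applying `mapping`.

    `columns` is the existing column list (for example df.columns) and
    `mapping` maps an existing name to its replacement. Names that are
    not in `mapping` are kept unchanged. Matching is case-sensitive.
    """
    # Catch typos or case mismatches that would otherwise be silently ignored.
    missing = [name for name in mapping if name not in columns]
    if missing:
        raise ValueError(f"columns not found: {missing}")

    new_cols = [mapping.get(name, name) for name in columns]

    # A one-sided mapping such as {"address": "name"} would create duplicates.
    if len(set(new_cols)) != len(new_cols):
        raise ValueError(f"renaming would produce duplicate names: {new_cols}")
    return new_cols


# Example with a hypothetical third column, "city".
columns = ["address", "name", "city"]
mapping = {"address": "name", "name": "address"}
new_cols = swapped_column_names(columns, mapping)
print(new_cols)  # ['name', 'address', 'city']

# With a Spark DataFrame the result would be passed to toDF (not executed here):
#   df = spark.read.table("table")
#   df.toDF(*swapped_column_names(df.columns, mapping))
```

Because every new name is computed from the original list in a single pass, the swap needs no temporary names and only one write.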