How to update the values of partition columns in Delta?

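I'd like to know whether it is possible to update the values of a partitioned table's partition column.

The table is partitioned on a specific column, and I now want to update the values in that column. Can I do that?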


(Found on Slack)

Use the replaceWhere option.

Quoting the official documentation on Replace table schema:

By default, overwriting the data in a table does not overwrite the schema. When overwriting a table using mode("overwrite") without replaceWhere, you may still want to overwrite the schema of the data being written. You replace the schema and partitioning of the table by setting the overwriteSchema option to true.
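As a minimal sketch of what the quoted passage describes, a full overwrite with overwriteSchema set to true could look like the following in PySpark. The helper name `overwrite_with_partitioning`, the table path, and the column names are hypothetical. It assumes `df` already holds the complete table contents with the corrected partition column values, since this approach rewrites the whole table rather than a single partition.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Sequence

if TYPE_CHECKING:  # used for type hints only; nothing is imported at runtime
    from pyspark.sql import DataFrame


def overwrite_with_partitioning(
    df: DataFrame, path: str, partition_cols: Sequence[str]
) -> None:
    """Overwrite the whole Delta table at `path`, replacing schema and partitioning.

    Hypothetical helper: `df` must contain ALL rows the table should hold
    afterwards, because mode("overwrite") without replaceWhere replaces everything.
    """
    (
        df.write.format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")  # also replace schema and partitioning
        .partitionBy(*partition_cols)
        .save(path)
    )
```

With a Delta-enabled SparkSession this might be called as `overwrite_with_partitioning(updated_df, "/tmp/delta/events", ["country"])`, where `updated_df` is the table read back with the partition column recomputed.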

Quoting the article Selectively updating Delta partitions with replaceWhere:

Delta makes it easy to update certain disk partitions with the replaceWhere option.

replaceWhere is particularly useful when you have to run a computationally expensive algorithm, but only on certain partitions.
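A selective overwrite with replaceWhere can be sketched in PySpark as follows. The helper name `overwrite_matching`, the path, and the predicate are hypothetical, and the sketch assumes that the rows in `df` all satisfy the predicate being replaced, since Delta is expected to reject rows that fall outside it unless that check is disabled.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # used for type hints only; nothing is imported at runtime
    from pyspark.sql import DataFrame


def overwrite_matching(df: DataFrame, path: str, predicate: str) -> None:
    """Replace only the rows of the Delta table at `path` that match `predicate`.

    Hypothetical helper: `predicate` is a SQL expression over the table's
    columns, typically the partition column, e.g. "country = 'Argentina'".
    Rows outside the predicate are left untouched.
    """
    (
        df.write.format("delta")
        .mode("overwrite")
        .option("replaceWhere", predicate)  # limit the overwrite to matching rows
        .save(path)
    )
```

For example, after reading the table, filtering it to `country = 'Argentina'`, and recomputing the other columns, `overwrite_matching(recomputed_df, "/tmp/delta/events", "country = 'Argentina'")` would rewrite just that partition. Because the written rows have to match the predicate, this pattern rewrites the contents of a partition; moving rows to a different partition value would still need a broader overwrite.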