What is the PySpark equivalent of MERGE INTO for Databricks Delta Lake?
The Databricks documentation describes how to perform a merge into a Delta table. In SQL syntax,
MERGE INTO [db_name.]target_table [AS target_alias]
USING [db_name.]source_table [<time_travel_version>] [AS source_alias]
ON <merge_condition>
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ]
can be used. Is there a Python equivalent?
With the help of Alexandros Biratsis I managed to find the documentation, which can be found here. It gives an example of such a merge:
from delta.tables import DeltaTable

# Obtain a handle to the target Delta table (path is illustrative)
deltaTable = DeltaTable.forPath(spark, "/delta/events")

deltaTable.alias("events").merge(
    source = updatesDF.alias("updates"),
    condition = "events.eventId = updates.eventId"
).whenMatchedUpdate(set =
    {
        "data": "updates.data",
        "count": "events.count + 1"
    }
).whenNotMatchedInsert(values =
    {
        "date": "updates.date",
        "eventId": "updates.eventId",
        "data": "updates.data",
        "count": "1"
    }
).execute()
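To make the matched/not-matched semantics concrete, here is a plain-Python sketch (no Spark required, hypothetical data) of what the merge above does: rows from the source that match on `eventId` update the target and bump `count`, while unmatched rows are inserted with `count = 1`.

```python
# Target table keyed by eventId, like the "events" Delta table
target = {
    1: {"date": "2024-01-01", "data": "old", "count": 3},
}

# Source rows, like the "updates" DataFrame
updates = [
    {"date": "2024-01-02", "eventId": 1, "data": "new"},
    {"date": "2024-01-03", "eventId": 2, "data": "fresh"},
]

for row in updates:
    key = row["eventId"]
    if key in target:
        # WHEN MATCHED THEN UPDATE: overwrite data, increment count
        target[key]["data"] = row["data"]
        target[key]["count"] += 1
    else:
        # WHEN NOT MATCHED THEN INSERT: add the row with count = 1
        target[key] = {"date": row["date"], "data": row["data"], "count": 1}
```

After the loop, event 1 has `data = "new"` and `count = 4`, and event 2 has been inserted with `count = 1`, mirroring the `whenMatchedUpdate` and `whenNotMatchedInsert` clauses.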