Postgres: upsert a row and update a primary key column

Question

Suppose I have two tables in my Postgres database:

create table transactions
(
    id bigint primary key,
    doc_id bigint not null,
    -- lots of other columns...
    amount numeric not null
);

-- same columns
create temporary table updated_transactions
(
    id bigint primary key,
    doc_id bigint not null,
    -- lots of other columns...
    amount numeric not null
);

Both tables have only a primary key and no unique indexes.

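I need to upsert rows from updated_transactions into transactions using the following rules:

The id column values in transactions and updated_transactions do not match
The other columns, such as doc_id (everything except amount), should match
When a matching row is found, update both the amount and id columns
When no matching row is found, insert it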

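The id values in updated_transactions are taken from a sequence. A business object just populates updated_transactions and then merges the new or updated rows from it into transactions using an upsert query. So my old, unchanged transactions keep their ids intact, while the updated ones are assigned new ids.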

In MSSQL and Oracle, this would be a merge statement similar to this:

merge into transactions t
using updated_transactions ut on t.doc_id = ut.doc_id, ...
when matched then
    update set t.id = ut.id, t.amount = ut.amount
when not matched then
    insert (t.id, t.doc_id, ..., t.amount)
    values (ut.id, ut.doc_id, ..., ut.amount);

In PostgreSQL, I suppose it should be something like this:

insert into transactions(id, doc_id, ..., amount)
select coalesce(t.id, ut.id), ut.doc_id, ..., ut.amount
from updated_transactions ut
left join transactions t on t.doc_id = ut.doc_id, ....
    on conflict
    on constraint transactions_pkey
    do update
    set amount = excluded.amount, id = excluded.id

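The problem is with the do update clause: excluded.id is the old value coming from the transactions table, whereas I need the new value from updated_transactions.

The ut.id value is not accessible in the do update clause; the only thing I can use is the excluded row. But the excluded row only carries the coalesce(t.id, ut.id) expression, which returns the old id values for the existing rows.

Is it possible to update both the id and amount columns using an upsert query?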
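
The behaviour described above can be reproduced with a reduced sketch. The table names t_demo and ut_demo and their three-column layout are hypothetical, chosen only to keep the example short; the constraint name relies on PostgreSQL's default `<table>_pkey` naming.

```sql
-- Hypothetical, cut-down stand-ins for transactions and updated_transactions.
create temp table t_demo  (id bigint primary key, doc_id bigint not null, amount numeric not null);
create temp table ut_demo (id bigint primary key, doc_id bigint not null, amount numeric not null);

insert into t_demo  values (1, 1, 10);  -- existing row, old id 1
insert into ut_demo values (6, 1, 11);  -- updated row, new id 6

-- The row proposed for insertion is (1, 1, 11), because coalesce picks t.id.
-- It conflicts with the existing row on the primary key, so excluded.id is 1.
insert into t_demo (id, doc_id, amount)
select coalesce(t.id, ut.id), ut.doc_id, ut.amount
from ut_demo ut
left join t_demo t on t.doc_id = ut.doc_id
on conflict on constraint t_demo_pkey
do update set amount = excluded.amount, id = excluded.id
returning id, amount;
```

The amount changes to 11, but the row keeps id 1: assigning excluded.id writes back the same old value, and the new id 6 never reaches the do update clause.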

Answer 1

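Create a unique index on the columns that serve as the key and pass its name in the upsert expression, so that it uses that index instead of the pkey. If no match is found, the row is inserted with the id from updated_transactions. If a match is found, you can use excluded.id to get the id from updated_transactions.

I think the left join to transactions then becomes redundant.

So it would look something like this: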

insert into transactions(id, doc_id, ..., amount)
select ut.id, ut.doc_id, ..., ut.amount
from updated_transactions ut
    on conflict
    on constraint transactions_multi_column_unique_index
    do update
    set amount = excluded.amount, id = excluded.id
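
One detail worth noting: on conflict on constraint expects the name of a constraint, so an index created with a plain create unique index cannot be named there. A minimal sketch of two ways around this, assuming purely for illustration that (doc_id, unit_id) is the full set of matching columns and using the hypothetical constraint name transactions_doc_unit_key:

```sql
-- Assumption: (doc_id, unit_id) is the complete matching key.
-- transactions_doc_unit_key is a hypothetical name.
alter table transactions
    add constraint transactions_doc_unit_key unique (doc_id, unit_id);

insert into transactions (id, doc_id, unit_id, amount)
select ut.id, ut.doc_id, ut.unit_id, ut.amount
from updated_transactions ut
-- A column list lets PostgreSQL infer the matching unique index or constraint;
-- "on conflict on constraint transactions_doc_unit_key" would work here as well.
on conflict (doc_id, unit_id)
do update
set id = excluded.id, amount = excluded.amount;
```

This approach only applies when the matching columns really are unique in transactions; otherwise the unique constraint cannot be created in the first place.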

Answer 2

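It looks like the task can be accomplished using writable CTEs instead of a plain upsert.

First, I'll post the simpler version of the query, which answers the question as originally asked. This solution assumes that the doc_id, unit_id columns form a candidate key, but it doesn't require a unique index on these columns.

Test data: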

create temp table transactions
(
    id bigint primary key,
    doc_id bigint,
    unit_id bigint,
    amount numeric
);

create temp table updated_transactions
(
    id bigint primary key,
    doc_id bigint,
    unit_id bigint,
    amount numeric
); 

insert into transactions(id, doc_id, unit_id, amount)
values (1, 1, 1, 10), (2, 1, 2, 15), (3, 1, 3, 10);

insert into updated_transactions(id, doc_id, unit_id, amount)
values (6, 1, 1, 11), (7, 1, 2, 15), (8, 1, 4, 20);

The query that merges updated_transactions into transactions:

with new_values as
(
    select ut.id new_id, t.id old_id, ut.doc_id, ut.unit_id, ut.amount 
    from updated_transactions ut
    left join transactions t 
        on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id
),
updated as
(
    update transactions tr
    set id = nv.new_id, amount = nv.amount
    from new_values nv
    where id = nv.old_id
    returning tr.*
)
insert into transactions(id, doc_id, unit_id, amount)
select ut.new_id, ut.doc_id, ut.unit_id, ut.amount
from new_values ut
where ut.new_id not in (select id from updated);

The results:

select * from transactions

-- id | doc_id | unit_id | amount
------+--------+---------+-------
--  3 |   1    |    3    |  10    -- not changed
--  6 |   1    |    1    |  11    -- updated
--  7 |   1    |    2    |  15    -- updated 
--  8 |   1    |    4    |  20    -- inserted
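
For comparison with the merge statement from the question, here is a sketch of the same operation written as a native MERGE. It assumes PostgreSQL 15 or later, where MERGE is available, and that (doc_id, unit_id) matches at most one row on each side, as in the test data above. It is an alternative to the writable CTE query, not something to run in addition to it.

```sql
-- Assumes PostgreSQL 15+ and a unique (doc_id, unit_id) pair in both tables.
merge into transactions t
using updated_transactions ut
    on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id
when matched then
    -- target columns in SET are written without the table alias
    update set id = ut.id, amount = ut.amount
when not matched then
    insert (id, doc_id, unit_id, amount)
    values (ut.id, ut.doc_id, ut.unit_id, ut.amount);
```

Unlike on conflict, this form needs no unique index on the matching columns, but it raises an error if more than one source row matches the same target row.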

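In my real-world application, doc_id, unit_id are not always unique, so they don't represent a candidate key. To match rows, I take into account the row number, computed over rows ordered by id within each (doc_id, unit_id) group. So here is my second solution.

Test data: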

-- the tables are the same as above
insert into transactions(id, doc_id, unit_id, amount)
values (1, 1, 1, 10), (2, 1, 1, 15), (3, 1, 3, 10);

insert into updated_transactions(id, doc_id, unit_id, amount)
values (6, 1, 1, 11), (7, 1, 1, 15), (8, 1, 4, 20);

The merge query:

with trans as
(
    select id, doc_id, unit_id, amount,
        row_number() over(partition by doc_id, unit_id order by id) row_num
    from transactions
),
updated_trans as
(
    select id, doc_id, unit_id, amount,
        row_number() over(partition by doc_id, unit_id order by id) row_num
    from updated_transactions
),
new_values as
(
    select ut.id new_id, t.id old_id, ut.doc_id, ut.unit_id, ut.amount 
    from updated_trans ut
    left join trans t 
        on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id and t.row_num = ut.row_num
),
updated as
(
    update transactions tr
    set id = nv.new_id, amount = nv.amount
    from new_values nv
    where id = nv.old_id
    returning tr.*
)
insert into transactions(id, doc_id, unit_id, amount)
select ut.new_id, ut.doc_id, ut.unit_id, ut.amount
from new_values ut
where ut.new_id not in (select id from updated);

The results:

select * from transactions;

-- id | doc_id | unit_id | amount
------+--------+---------+-------
--  3 |   1    |    3    | 10     -- not changed
--  6 |   1    |    1    | 11     -- updated
--  7 |   1    |    1    | 15     -- updated
--  8 |   1    |    4    | 20     -- inserted
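
As a quick sanity check after either merge query, the following sketch lists any rows of updated_transactions that did not make it into transactions unchanged. It assumes the four-column test tables used above.

```sql
-- Rows present in updated_transactions but absent from transactions.
-- An empty result means every new or updated row was merged.
select id, doc_id, unit_id, amount from updated_transactions
except
select id, doc_id, unit_id, amount from transactions;
```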

