How to use an aggregation key with null values and UPSERT?

Question

I am having a problem using UPSERT in PostgreSQL 9.5.

I have a table with 50 columns, and my aggregation key consists of 20 columns, 15 of which are nullable.

So this is my table:

CREATE TABLE public.test
(
  id serial NOT NULL,
  stamp timestamp without time zone,
  foo_id integer,
  bar_id integer,
  ...
  CONSTRAINT id_pk PRIMARY KEY (id),
  CONSTRAINT test_agg_key_unique UNIQUE (stamp, foo_id, bar_id, ...)
);

After that I create an index over my aggregation key. But first I need to create a unique constraint, because not all of the key columns are NOT NULL:

alter table public.test ADD CONSTRAINT test_agg_key_unique UNIQUE (stamp, foo_id, bar_id, ...);

Then:

CREATE UNIQUE INDEX test_agg_key on public.test (coalesce(stamp, '1980-01-01 01:01:01'), coalesce(foo_id, -1), coalesce(bar_id, -1), ...);

Now I can run my UPSERT:

INSERT INTO public.test as t (id, stamp, foo_id, bar_id, ...)
VALUES (RANDOM_ID, '2016-01-01 01:01:01', 1, 1, ...)
ON CONFLICT (stamp, foo_id, bar_id, ...)
  do update set another_column = t.another_column  + 1
    where t.stamp = '2016-01-01 01:01:01' and t.foo_id = 1 and t.bar_id= 1 and ...;

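So if the aggregation key already exists, it updates the row; otherwise it inserts a new row. But when I run the same query with one or more null values, I get this exception: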

ERROR:  duplicate key value violates unique constraint "test_agg_key_unique"

Because of this exception, it never reaches the do update.

Another good example: https://dba.stackexchange.com/questions/151431/postgresql-upsert-issue-with-null-values

Answer 1

Handle the nullable columns with coalesce (PostgreSQL's counterpart to isnull) and give them default values, for example:

INSERT INTO public.test as t (id, stamp, foo_id, bar_id, ...)
VALUES (RANDOM_ID, '2016-01-01 01:01:01', 1, 1, ...)
ON CONFLICT (stamp, foo_id, bar_id, ...)
do update set another_column = coalesce(t.another_column, 0) + 1
where t.stamp = '2016-01-01 01:01:01' and t.foo_id = 1 and t.bar_id= 1 and ...;

Answer 2

The only way I can see is to use a trigger that makes the columns non-nullable in practice while they formally remain nullable.

Test table:

create table test
(
    id serial not null,
    stamp timestamp without time zone,
    foo_id integer,
    bar_id integer,
    another_column integer,
    constraint id_pk primary key (id),
    constraint test_agg_key_unique unique (stamp, foo_id, bar_id)
);

Trigger:

create or replace function before_insert_on_test()
returns trigger language plpgsql as $$
begin
    new.stamp:= coalesce(new.stamp, '1980-01-01 01:01:01');
    new.foo_id:= coalesce(new.foo_id, -1);
    new.bar_id:= coalesce(new.bar_id, -1);
    return new;
end $$;

create trigger before_insert_on_test
before insert on test
for each row
execute procedure before_insert_on_test();

You do not need the additional unique index:

insert into test values (default, null, 1, null, 0)
on conflict (stamp, foo_id, bar_id) do
    update set another_column = test.another_column+ 1
returning *;

 id |        stamp        | foo_id | bar_id | another_column 
----+---------------------+--------+--------+----------------
  1 | 1980-01-01 01:01:01 |      1 |     -1 |              0

insert into test values (default, null, 1, null, 0)
on conflict (stamp, foo_id, bar_id) do
    update set another_column = test.another_column+ 1
returning *;

 id |        stamp        | foo_id | bar_id | another_column 
----+---------------------+--------+--------+----------------
  1 | 1980-01-01 01:01:01 |      1 |     -1 |              1

Note that you do not need a where clause, because the update only touches the conflicting row.
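One gap worth noting: the trigger above fires only on insert, so a later update could still write a null into a key column. A minimal sketch of one way to close that gap, assuming the function before_insert_on_test() defined above is reused as-is; the trigger name before_update_on_test is hypothetical and not part of the original answer:

```sql
-- Hypothetical extra trigger: reuse the same function for updates so that
-- key columns set to null by an UPDATE are replaced by the same sentinel values.
create trigger before_update_on_test
before update on test
for each row
execute procedure before_insert_on_test();

-- With both triggers in place, this update no longer stores a null in bar_id:
update test set bar_id = null where id = 1;
select id, stamp, foo_id, bar_id from test where id = 1;
```

The function only reads and rewrites new, so it should work unchanged for both events; whether updates of key columns happen at all depends on the application.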

Update: alternative solution

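The problem stems from the fact that a composite unique index containing nullable elements is generally not a good idea. You should give up this approach and base all the logic on the trigger.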
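To illustrate why nullable columns undermine a composite unique constraint, here is a minimal sketch. The table null_demo and its columns are hypothetical and exist only for this demonstration. In a unique constraint PostgreSQL does not treat two nulls as equal, so ON CONFLICT never sees a conflict for rows that differ only by null:

```sql
-- Hypothetical scratch table, not part of the question's schema.
create table null_demo
(
    a integer,
    b integer,
    constraint null_demo_unique unique (a, b)
);

-- Both rows are inserted: (1, null) does not conflict with (1, null).
insert into null_demo values (1, null) on conflict (a, b) do nothing;
insert into null_demo values (1, null) on conflict (a, b) do nothing;

-- Without nulls the second insert is a conflict and is skipped.
insert into null_demo values (1, 2) on conflict (a, b) do nothing;
insert into null_demo values (1, 2) on conflict (a, b) do nothing;

-- (1, null) is stored twice, (1, 2) only once.
select a, b, count(*) from null_demo group by a, b;
```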

Drop the unique index and create a trigger:

create or replace function before_insert_on_test()
returns trigger language plpgsql as $$
declare
    found_id integer;
begin
    select id
    from test
    where
        coalesce(stamp, '1980-01-01 01:01:01') = coalesce(new.stamp, '1980-01-01 01:01:01')
        and coalesce(foo_id, -1) = coalesce(new.foo_id, -1)
        and coalesce(bar_id, -1) = coalesce(new.bar_id, -1)
    into found_id;
    if found then
        update test
        set another_column = another_column+ 1
        where id = found_id;
        return null;  -- abandon insert
    end if;
    return new;
end $$; 

create trigger before_insert_on_test
before insert on test
for each row
execute procedure before_insert_on_test();

Use only insert, without on conflict.

You can try to speed up the trigger with a (non-unique) index:

create index on test(coalesce(stamp, '1980-01-01 01:01:01'), coalesce(foo_id, -1), coalesce(bar_id, -1));

Answer 3

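After reading this question (here), I found the solution.

Thanks to Erwin Brandstetter: https://dba.stackexchange.com/a/151438/107395

Solution:

I need to create a unique index that contains all the key columns and wraps each nullable column in COALESCE.

For example, COALESCE(test_field, '') for a text column or COALESCE(test_field, -1) for a numeric one: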

CREATE UNIQUE INDEX test_upsert_solution_idx
    ON test_upsert (name, status, COALESCE(test_field, ''), COALESCE(test_field2, '')...);

Then, in the UPSERT, remove the WHERE from DO UPDATE and add the COALESCE expressions to ON CONFLICT:

INSERT INTO test_upsert as tu(name, status, test_field, identifier, count) 
VALUES ('test', 1, null, 'ident', 11)
ON CONFLICT (name, status, COALESCE(test_field, '')) 
 DO UPDATE  -- match expr. index
  SET count = COALESCE(tu.count + EXCLUDED.count, EXCLUDED.count, tu.count);
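
The SET expression is written so that a null on either side does not wipe out the counter. As a small worked example of how it evaluates, the query below uses an inline VALUES list; the column names stored and incoming are illustrative stand-ins for tu.count and EXCLUDED.count, not names from the original answer:

```sql
select stored,
       incoming,
       coalesce(stored + incoming, incoming, stored) as new_count
from (values
        (5, 11),        -- both present: 5 + 11 = 16
        (null, 11),     -- stored is null: falls back to incoming, 11
        (5, null),      -- incoming is null: falls back to stored, 5
        (null, null)    -- both null: the result stays null
     ) as v(stored, incoming);
```

If a null counter should count as zero instead, coalesce(stored, 0) + coalesce(incoming, 0) would be a simpler alternative; which behavior is wanted depends on the data.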
