How to use an aggregation key with null values and UPSERT?

I have a problem using UPSERT in PostgreSQL 9.5.

I have a table with 50 columns, and my aggregation key consists of 20 columns, 15 of which are nullable.

So this is my table:

CREATE TABLE public.test
(
  id serial NOT NULL,
  stamp timestamp without time zone,
  foo_id integer,
  bar_id integer,
  ...
  CONSTRAINT id_pk PRIMARY KEY (id),
  CONSTRAINT test_agg_key_unique UNIQUE (stamp, foo_id, bar_id, ...)
);

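Next I want to create a unique index on my aggregation key (an expression index that wraps the nullable columns in COALESCE). Before that I need a unique constraint, because not all of the key columns are NOT NULL: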

alter table public.test ADD CONSTRAINT test_agg_key_unique UNIQUE (stamp, foo_id, bar_id, ...);

Then:

CREATE UNIQUE INDEX test_agg_key on lvl1_conversion.conversion (coalesce(stamp, '1980-01-01 01:01:01'), coalesce(foo_id, -1), coalesce(bar_id, -1), ...);

Now I can run my UPSERT:

INSERT INTO public.test as t (id, stamp, foo_id, bar_id, ...)
VALUES (RANDOM_ID, '2016-01-01 01:01:01', 1, 1, ...)
ON CONFLICT (stamp, foo_id, bar_id, ...)
  do update set another_column = t.another_column  + 1
    where t.stamp = '2016-01-01 01:01:01' and t.foo_id = 1 and t.bar_id= 1 and ...;

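So if the aggregation key already exists, the row is updated; otherwise a new row is inserted. But when I run the same query with one or more null values, I get this exception: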

ERROR:  duplicate key value violates unique constraint "test_agg_key_unique"

Because of this exception, the DO UPDATE branch is never reached.

Another good example: https://dba.stackexchange.com/questions/151431/postgresql-upsert-issue-with-null-values
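For context, here is a minimal sketch of the behaviour that appears to be behind the error. The table null_demo, its columns, and the index name are hypothetical and only serve the illustration. A plain unique constraint treats two NULLs as distinct, so ON CONFLICT on the plain column list sees no conflict, the insert goes ahead, and the separate COALESCE index then rejects the row.

```sql
-- Hypothetical demo table, not part of the schema in the question.
CREATE TEMP TABLE null_demo (
    a integer,
    b integer,
    n integer DEFAULT 0,
    CONSTRAINT null_demo_uq UNIQUE (a, b)
);

-- Both inserts succeed: in a plain unique constraint, NULLs never compare as equal.
INSERT INTO null_demo (a, b) VALUES (1, NULL);
INSERT INTO null_demo (a, b) VALUES (1, NULL);
SELECT count(*) FROM null_demo;  -- 2

-- Start over and add a unique expression index that maps NULL to -1.
TRUNCATE null_demo;
CREATE UNIQUE INDEX null_demo_expr_uq ON null_demo (a, COALESCE(b, -1));
INSERT INTO null_demo (a, b) VALUES (1, NULL);

-- Naming the index expressions in the conflict target makes the expression
-- index the arbiter, so this statement updates the existing row (n becomes 1).
INSERT INTO null_demo (a, b) VALUES (1, NULL)
ON CONFLICT (a, COALESCE(b, -1)) DO UPDATE SET n = null_demo.n + 1;

-- Naming the plain columns selects null_demo_uq as the arbiter instead.
-- It reports no conflict for NULLs, so the insert proceeds and then fails
-- with a duplicate key error raised by null_demo_expr_uq.
INSERT INTO null_demo (a, b) VALUES (1, NULL)
ON CONFLICT (a, b) DO UPDATE SET n = null_demo.n + 1;
```

The last statement is expected to fail; it is placed at the end so that the statements before it can run as a script.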

Handle the nullable columns with COALESCE (PostgreSQL has no isnull function) and give them a default value, for example:

INSERT INTO public.test as t (id, stamp, foo_id, bar_id, ...)
VALUES (RANDOM_ID, '2016-01-01 01:01:01', 1, 1, ...)
ON CONFLICT (stamp, foo_id, bar_id, ...)
do update set another_column = coalesce(t.another_column, 0) + 1
where t.stamp = '2016-01-01 01:01:01' and t.foo_id = 1 and t.bar_id= 1 and ...;

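The only way I can see is to use a trigger that makes the columns effectively non-nullable, while formally leaving them nullable.

Test table: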

create table test
(
    id serial not null,
    stamp timestamp without time zone,
    foo_id integer,
    bar_id integer,
    another_column integer,
    constraint id_pk primary key (id),
    constraint test_agg_key_unique unique (stamp, foo_id, bar_id)
);

Trigger:

create or replace function before_insert_on_test()
returns trigger language plpgsql as $$
begin
    new.stamp:= coalesce(new.stamp, '1980-01-01 01:01:01');
    new.foo_id:= coalesce(new.foo_id, -1);
    new.bar_id:= coalesce(new.bar_id, -1);
    return new;
end $$;

create trigger before_insert_on_test
before insert on test
for each row
execute procedure before_insert_on_test();

You do not need the additional unique index:

insert into test values (default, null, 1, null, 0)
on conflict (stamp, foo_id, bar_id) do
    update set another_column = test.another_column+ 1
returning *;

 id |        stamp        | foo_id | bar_id | another_column 
----+---------------------+--------+--------+----------------
  1 | 1980-01-01 01:01:01 |      1 |     -1 |              0

insert into test values (default, null, 1, null, 0)
on conflict (stamp, foo_id, bar_id) do
    update set another_column = test.another_column+ 1
returning *;

 id |        stamp        | foo_id | bar_id | another_column 
----+---------------------+--------+--------+----------------
  1 | 1980-01-01 01:01:01 |      1 |     -1 |              1

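Note that you do not need the where clause, because the update only touches the conflicting row.


Update: an alternative solution

The problem stems from the fact that a composite unique index containing nullable elements is generally not a good idea. You could abandon this approach and move all of the logic into a trigger.

Drop the unique index and create the trigger: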

create or replace function before_insert_on_test()
returns trigger language plpgsql as $$
declare
    found_id integer;
begin
    select id
    from test
    where
        coalesce(stamp, '1980-01-01 01:01:01') = coalesce(new.stamp, '1980-01-01 01:01:01')
        and coalesce(foo_id, -1) = coalesce(new.foo_id, -1)
        and coalesce(bar_id, -1) = coalesce(new.bar_id, -1)
    into found_id;
    if found then
        update test
        set another_column = another_column+ 1
        where id = found_id;
        return null;  -- abandon insert
    end if;
    return new;
end $$; 

create trigger before_insert_on_test
before insert on test
for each row
execute procedure before_insert_on_test();

Use plain insert only, without on conflict.
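As a rough usage sketch, assuming the test table and the lookup trigger defined above are in place and the table is otherwise empty, two identical inserts with nulls in the key could look like this. The drop constraint statement is my reading of "drop the unique index" for this test table.

```sql
-- Remove the unique constraint from the test table; the trigger now
-- decides whether a row counts as a duplicate.
alter table test drop constraint test_agg_key_unique;

-- First insert: the trigger finds no matching row, so the row is inserted.
insert into test (stamp, foo_id, bar_id, another_column)
values (null, 1, null, 0);

-- Second insert: the trigger finds the row, increments another_column,
-- and returns null, so no new row is inserted.
insert into test (stamp, foo_id, bar_id, another_column)
values (null, 1, null, 0);

-- A single row remains, with the nulls preserved and another_column = 1.
select stamp, foo_id, bar_id, another_column from test;
```

One caveat with this design: without a unique index, nothing in the sketch prevents two concurrent sessions from both missing the existing row and both inserting, so it may need extra locking under concurrent writes.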

You can try to speed up the trigger with a (non-unique) index:

create index on test(coalesce(stamp, '1980-01-01 01:01:01'), coalesce(foo_id, -1), coalesce(bar_id, -1));

I found the solution after reading another question.

Thanks to Erwin Brandstetter: https://dba.stackexchange.com/a/151438/107395

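Solution:

I need to create a unique index that contains all of the key columns and wraps every nullable column in COALESCE.

So for a text column use COALESCE(test_field, ''), and for a numeric column use COALESCE(test_field, -1):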

CREATE UNIQUE INDEX test_upsert_solution_idx
    ON test_upsert (name, status, COALESCE(test_field, ''), COALESCE(test_field2, '')...);

Then, in the UPSERT, remove the WHERE from DO UPDATE and add the COALESCE expressions to ON CONFLICT:

INSERT INTO test_upsert as tu(name, status, test_field, identifier, count) 
VALUES ('test', 1, null, 'ident', 11)
ON CONFLICT (name, status, COALESCE(test_field, '')) 
 DO UPDATE  -- match expr. index
  SET count = COALESCE(tu.count + EXCLUDED.count, EXCLUDED.count, tu.count);
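To see the pieces working together, here is a small end-to-end sketch. The column types and the temporary table definition are assumptions, since the original table is not shown here; only the index expression and the UPSERT follow the statements above.

```sql
-- Assumed table definition; the real column types are not given above.
CREATE TEMP TABLE test_upsert (
    id         serial PRIMARY KEY,
    name       text NOT NULL,
    status     integer NOT NULL,
    test_field text,          -- nullable part of the key
    identifier text,
    count      integer
);

-- Unique expression index: a NULL test_field is indexed as ''.
CREATE UNIQUE INDEX test_upsert_solution_idx
    ON test_upsert (name, status, COALESCE(test_field, ''));

-- First run inserts the row.
INSERT INTO test_upsert AS tu (name, status, test_field, identifier, count)
VALUES ('test', 1, NULL, 'ident', 11)
ON CONFLICT (name, status, COALESCE(test_field, ''))
DO UPDATE SET count = COALESCE(tu.count + EXCLUDED.count, EXCLUDED.count, tu.count);

-- Second run hits the expression index and takes the DO UPDATE branch.
INSERT INTO test_upsert AS tu (name, status, test_field, identifier, count)
VALUES ('test', 1, NULL, 'ident', 11)
ON CONFLICT (name, status, COALESCE(test_field, ''))
DO UPDATE SET count = COALESCE(tu.count + EXCLUDED.count, EXCLUDED.count, tu.count);

-- One row remains, with count = 22.
SELECT name, status, test_field, count FROM test_upsert;
```

A side effect worth keeping in mind: with this index, a NULL test_field and an empty string are treated as the same key, so the replacement value should be one that cannot occur in real data.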