Check if NULL exists in Postgres array

Similar to this question, how can I determine whether a NULL value exists in an array?

Here are some attempts.

SELECT num, ar, expected,
  ar @> ARRAY[NULL]::int[] AS test1,
  NULL = ANY (ar) AS test2,
  array_to_string(ar, ', ') <> array_to_string(ar, ', ', '(null)') AS test3
FROM (
  SELECT 1 AS num, '{1,2,NULL}'::int[] AS ar, true AS expected
  UNION SELECT 2, '{1,2,3}'::int[], false
) td ORDER BY num;

 num |     ar     | expected | test1 | test2 | test3
-----+------------+----------+-------+-------+-------
   1 | {1,2,NULL} | t        | f     |       | t
   2 | {1,2,3}    | f        | f     |       | f
(2 rows)

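Only the array_to_string trick shows the expected value. Is there a better way to test this?

PostgreSQL's unnest() function is a better choice. You can write a simple function like the one below to check for NULL values in an array.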

create or replace function NULL_EXISTS(val anyarray) returns boolean as
$$
select exists (
    select 1 from unnest(val) arr(el) where el is null
);
$$
language sql;

For example:

SELECT NULL_EXISTS(array [1,2,NULL])
      ,NULL_EXISTS(array [1,2,3]);

Result:

null_exists null_exists 
----------- -------------- 
t           f     

So you can use the NULL_EXISTS() function in your query as shown below.

SELECT num, ar, expected,NULL_EXISTS(ar)
FROM (
  SELECT 1 AS num, '{1,2,NULL}'::int[] AS ar, true AS expected
  UNION SELECT 2, '{1,2,3}'::int[], false
) td ORDER BY num;

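PostgreSQL 9.5 (I know you specified 9.1, but anyway) has the array_position() function, which does what you want without resorting to the very inefficient unnest() for something as trivial as this (see test4):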

patrick@puny:~$ psql -d test
psql (9.5.0)
Type "help" for help.

test=# SELECT num, ar, expected,
  ar @> ARRAY[NULL]::int[] AS test1,
  NULL = ANY (ar) AS test2,
  array_to_string(ar, ', ') <> array_to_string(ar, ', ', '(null)') AS test3,
  coalesce(array_position(ar, NULL::int), 0) > 0 AS test4
FROM (
  SELECT 1 AS num, '{1,2,NULL}'::int[] AS ar, true AS expected
  UNION SELECT 2, '{1,2,3}'::int[], false
) td ORDER BY num;
 num |     ar     | expected | test1 | test2 | test3 | test4
-----+------------+----------+-------+-------+-------+-------
   1 | {1,2,NULL} | t        | f     |       | t     | t
   2 | {1,2,3}    | f        | f     |       | f     | f
(2 rows)

Postgres 9.5 or later

Use array_position(). Basically:

SELECT array_position(arr, NULL) IS NOT NULL AS array_has_null

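See the demo below.

Postgres 9.3 or later

You can test with the built-in functions array_remove() or array_replace().

Postgres 9.1 or any version

If you know a single element that can never exist in your arrays, you can use this fast expression. Say you have an array of positive numbers, and -1 can never be in it: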
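
A minimal sketch of how these two tests could look, assuming a one-dimensional int[] column named arr. The replacement value 0 passed to array_replace() is only an illustrative choice. Both expressions compare the modified array with the original, so a difference means at least one NULL element was present. For arr IS NULL both yield NULL.

```sql
SELECT arr
     , array_remove(arr, NULL) <> arr      AS has_null_remove   -- differs only if a NULL element was removed
     , array_replace(arr, NULL, 0) <> arr  AS has_null_replace  -- differs only if a NULL element was replaced
FROM  (
   VALUES ('{1,2,NULL}'::int[])  -- contains a NULL element
        , ('{1,2,3}')            -- contains no NULL element
   ) t(arr);
```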

-1 = ANY(arr) IS NULL

Related answer with detailed explanation:

  • Is array all NULLs in PostgreSQL
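
The expression relies on SQL's three-valued logic: when no element matches, ANY returns NULL if the array contains a NULL element and false otherwise, while an actual match returns true regardless of NULLs. A small sketch that makes the intermediate value visible (the column aliases are illustrative, and the parentheses are added only for readability):

```sql
SELECT arr
     , -1 = ANY(arr)            AS any_result  -- true, false, or NULL
     , (-1 = ANY(arr)) IS NULL  AS has_null    -- true only when any_result is NULL
FROM  (
   VALUES ('{1,2,NULL}'::int[])  -- no match, NULL present: any_result is NULL
        , ('{1,2,3}')            -- no match, no NULL:      any_result is false
        , ('{-1,NULL,2}')        -- match found:            any_result is true, so the NULL goes unnoticed
   ) t(arr);
```

The third row is the reason the chosen value must never occur in the array.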

If you cannot be absolutely sure, you can fall back to one of the expensive but safe methods with unnest(). Like:

(SELECT bool_or(x IS NULL) FROM unnest(arr) x)

Or:

EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL)

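But you can have it both fast and safe with a CASE expression. Use an unlikely number and fall back to the safe method if it should exist after all. You may want to treat the case arr IS NULL separately. See the demo below.

Demo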

SELECT num, arr, expect
     , -1 = ANY(arr) IS NULL                                    AS t_1   --  50 ms
     , (SELECT bool_or(x IS NULL) FROM unnest(arr) x)           AS t_2   -- 754 ms
     , EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL)     AS t_3   -- 521 ms
     , CASE -1 = ANY(arr)
         WHEN FALSE THEN FALSE
         WHEN TRUE THEN EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL)
         ELSE NULLIF(arr IS NOT NULL, FALSE)  -- catch arr IS NULL       --  55 ms
      -- ELSE TRUE  -- simpler for columns defined NOT NULL              --  51 ms
       END                                                      AS t_91
     , array_replace(arr, NULL, 0) <> arr                       AS t_93a --  99 ms
     , array_remove(arr, NULL) <> arr                           AS t_93b --  96 ms
     , cardinality(array_remove(arr, NULL)) <> cardinality(arr) AS t_94  --  81 ms
     , COALESCE(array_position(arr, NULL::int), 0) > 0          AS t_95a --  49 ms
     , array_position(arr, NULL) IS NOT NULL                    AS t_95b --  45 ms
     , CASE WHEN arr IS NOT NULL
            THEN array_position(arr, NULL) IS NOT NULL END      AS t_95c --  48 ms
FROM  (
   VALUES (1, '{1,2,NULL}'::int[], true)     -- extended test case
        , (2, '{-1,NULL,2}'      , true)
        , (3, '{NULL}'           , true)
        , (4, '{1,2,3}'          , false)
        , (5, '{-1,2,3}'         , false)
        , (6, NULL               , null)
   ) t(num, arr, expect);

Result:

 num |  arr        | expect | t_1    | t_2  | t_3 | t_91 | t_93a | t_93b | t_94 | t_95a | t_95b | t_95c
-----+-------------+--------+--------+------+-----+------+-------+-------+------+-------+-------+-------
   1 | {1,2,NULL}  | t      | t      | t    | t   | t    | t     | t     | t    | t     | t     | t
   2 | {-1,NULL,2} | t      | f --!! | t    | t   | t    | t     | t     | t    | t     | t     | t
   3 | {NULL}      | t      | t      | t    | t   | t    | t     | t     | t    | t     | t     | t
   4 | {1,2,3}     | f      | f      | f    | f   | f    | f     | f     | f    | f     | f     | f
   5 | {-1,2,3}    | f      | f      | f    | f   | f    | f     | f     | f    | f     | f     | f
   6 | NULL        | NULL   | t --!! | NULL | f   | NULL | NULL  | NULL  | NULL | f     | f     | NULL

Note that array_remove() and array_position() are not allowed for multi-dimensional arrays. All expressions to the right of t_93a only work for one-dimensional arrays.
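
For multi-dimensional arrays, the unnest() variants remain usable because unnest() expands every element regardless of dimensions. A sketch with an illustrative two-dimensional literal:

```sql
-- unnest() flattens the 2-D array into one row per element,
-- so the EXISTS test sees the NULL element in the first sub-array.
SELECT EXISTS (
   SELECT 1
   FROM   unnest('{{1,NULL},{3,4}}'::int[]) x
   WHERE  x IS NULL
   ) AS has_null_2d;
```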

db<>fiddle here - Postgres 13, with more tests
sqlfiddle

Benchmark setup

The added times are from a benchmark test with 200k rows in Postgres 9.5. This is my setup:

CREATE TABLE t AS
SELECT row_number() OVER() AS num
     , array_agg(elem) AS arr
     , bool_or(elem IS NULL) AS expected
FROM  (
   SELECT CASE WHEN random() > .95 THEN NULL ELSE g END AS elem  -- 5% NULL VALUES
        , count(*) FILTER (WHERE random() > .8)
                   OVER (ORDER BY g) AS grp  -- avg 5 element per array
   FROM   generate_series (1, 1000000) g  -- increase for big test case
   ) sub
GROUP  BY grp;

Function wrapper

For repeated use, I would create a function in Postgres 9.5 like this:

CREATE OR REPLACE FUNCTION f_array_has_null (anyarray)
  RETURNS bool
  LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
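 'SELECT array_position($1, NULL) IS NOT NULL';

PARALLEL SAFE is only available in Postgres 9.6 or later.

With a polymorphic input type, this works for any array type, not just int[].

Making it IMMUTABLE allows performance optimization and index expressions.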
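
A short usage sketch, assuming f_array_has_null() has been created as shown above. The array literals are illustrative only:

```sql
SELECT f_array_has_null('{1,2,NULL}'::int[])              AS int_arr   -- int[] with a NULL element
     , f_array_has_null('{a,NULL}'::text[])               AS text_arr  -- text[] with a NULL element
     , f_array_has_null('{2020-01-01,2020-01-02}'::date[]) AS date_arr; -- date[] without a NULL element
```

The same function resolves for each array type because the untyped NULL passed to array_position() takes on the element type of the array argument.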

  • Does PostgreSQL support "accent insensitive" collations?
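
As one possible illustration of the indexing point, an IMMUTABLE function can appear in the predicate of a partial index. The sketch below assumes the table t(num, arr) from the benchmark setup and the f_array_has_null() function above. The index name is hypothetical.

```sql
-- Hypothetical index name; covers only rows whose array contains a NULL element.
CREATE INDEX t_arr_has_null_idx ON t (num) WHERE f_array_has_null(arr);

-- A query repeating the same predicate may be able to use the partial index.
SELECT num, arr
FROM   t
WHERE  f_array_has_null(arr);
```

Whether the planner actually picks the index depends on table size and statistics.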

But don't make it STRICT, which would disable "function inlining" and impair performance, because array_position() is not STRICT itself. See:

  • Function executes faster without STRICT modifier?

If you need to catch the case arr IS NULL:

CREATE OR REPLACE FUNCTION f_array_has_null (anyarray)
  RETURNS bool
  LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
 'SELECT CASE WHEN $1 IS NOT NULL
              THEN array_position($1, NULL) IS NOT NULL END';

For Postgres 9.1 use the t_91 expression from above. The rest applies unchanged.

Closely related:

  • How to determine if NULL is contained in an array in Postgres?

I use this:

select 
    array_position(array[1,null], null) is not null

array_position - returns the subscript of the first occurrence of the second argument in the array, starting at the element indicated by the third argument or at the first element (array must be one-dimensional)