Postgres - find min of array
Suppose I have a table like this:
  link_ids  |  length
------------+-----------
 {1,4}      | {1,2}
 {2,5}      | {0,1}
How can I find the min length for each link_ids?
So the final output looks something like:
  link_ids  |  length
------------+-----------
 {1,4}      | 1
 {2,5}      | 0
(I assume link_ids can contain duplicates, and since there is no id column we improvise one.)
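WITH r AS (
   SELECT row_number() OVER () AS id  -- surrogate id, since link_ids may repeat
        , link_ids
        , length
   FROM   Table1
   )
SELECT DISTINCT ON (id) link_ids, len AS length
FROM  (SELECT id, link_ids, unnest(length) AS len FROM r) sub
ORDER  BY id, len;  -- order by the unnested element, not by the whole array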
Assuming the table name is t and every value of link_ids is unique:
select link_ids, min(len)
from (select link_ids, unnest(length) as len from t) as t
group by link_ids;
link_ids | min
----------+-----
{2,5} | 0
{1,4} | 1
Assuming a table like:
CREATE TABLE tbl (
  link_ids int[] PRIMARY KEY  -- which is odd for a PK
, length   int[]
, CHECK (length <> '{}'::int[])  -- rules out empty arrays; NULL still passes unless the column is also NOT NULL
);
Query in Postgres 9.3 or later:
SELECT link_ids, min(len) AS min_length
FROM tbl t, unnest(t.length) len -- implicit LATERAL join
GROUP BY 1;
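One caveat with the implicit LATERAL join: unnest() returns no rows for an empty or NULL array, so such a row drops out of the result. If the CHECK constraint above is not in place and those rows should be kept, a LEFT JOIN LATERAL variant might look like the sketch below. The inline VALUES list is made-up sample data (the third row is not from the question) so that the snippet runs on its own.

```sql
-- Sketch: keep rows whose array is empty or NULL; min_length is NULL for them.
-- The VALUES list stands in for the real table and is illustrative only.
SELECT t.link_ids, min(u.len) AS min_length
FROM  (VALUES ('{1,4}'::int[], '{1,2}'::int[])
            , ('{2,5}'::int[], '{0,1}'::int[])
            , ('{3,6}'::int[], '{}'::int[])   -- empty array, would vanish with a plain comma join
      ) t(link_ids, length)
LEFT   JOIN LATERAL unnest(t.length) AS u(len) ON true
GROUP  BY t.link_ids;
```

With this sample data the row {3,6} is returned with a NULL min_length instead of being dropped.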
Or create a tiny function (Postgres 8.4+):
CREATE OR REPLACE FUNCTION arr_min(anyarray)
  RETURNS anyelement LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT min(i) FROM unnest($1) i';
Only add PARALLEL SAFE in Postgres 9.6 or later. Then:
SELECT link_ids, arr_min(length) AS min_length FROM tbl;
The function can be inlined and is fast.
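To try the function without any table, it can be called on array literals. The snippet below is a small sketch: it repeats the definition (with $1 as the parameter reference, and without PARALLEL SAFE so it also runs before Postgres 9.6), and the literal arrays are made-up test values.

```sql
-- Definition repeated so the snippet is self-contained.
CREATE OR REPLACE FUNCTION arr_min(anyarray)
  RETURNS anyelement LANGUAGE sql IMMUTABLE AS
'SELECT min(i) FROM unnest($1) i';

-- anyarray / anyelement make the function work for any element type that has min().
SELECT arr_min('{3,1,2}'::int[])  AS int_min    -- 1
     , arr_min('{b,a,c}'::text[]) AS text_min   -- a
     , arr_min('{}'::int[])       AS empty_min; -- NULL: no elements to aggregate
```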
Or, for integer arrays of trivial length, use the additional module intarray and its built-in sort() function (Postgres 8.3+):
SELECT link_ids, (sort(length))[1] AS min_length FROM tbl;
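The sort() used here is not part of core Postgres, so the intarray module has to be installed in the database first. A minimal sketch, assuming Postgres 9.1 or later (where CREATE EXTENSION exists) and sufficient privileges; the array literal is a made-up test value:

```sql
-- Install the module once per database.
CREATE EXTENSION IF NOT EXISTS intarray;

-- sort() returns the array in ascending order, so element 1 is the minimum.
SELECT (sort('{3,1,2}'::int[]))[1] AS min_length;  -- 1
```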
A small addition to Erwin's answer: sometimes a subquery with unnest can be even cheaper than a lateral join.
I used the table definition from Erwin's answer and filled it:
t=# insert into t select '{1}'::int[]||g,'{1}'::int[]||g from generate_series(1,9999,1) g;
INSERT 0 9999
t=# select * from t order by ctid desc limit 1;
link_ids | length
----------+----------
{1,9999} | {1,9999}
(1 row)
Then analyze the LATERAL JOIN:
t=# explain analyze select link_ids,max(r) from t, unnest(length) r where link_ids = '{1,9999}' group by 1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.29..4.81 rows=1 width=33) (actual time=0.030..0.030 rows=1 loops=1)
   ->  Nested Loop  (cost=0.29..4.30 rows=100 width=33) (actual time=0.025..0.027 rows=2 loops=1)
         ->  Index Scan using t_pkey on t  (cost=0.29..2.30 rows=1 width=58) (actual time=0.015..0.016 rows=1 loops=1)
               Index Cond: (link_ids = '{1,9999}'::integer[])
         ->  Function Scan on unnest r  (cost=0.00..1.00 rows=100 width=4) (actual time=0.007..0.007 rows=2 loops=1)
 Total runtime: 0.059 ms
(6 rows)
And try the subquery:
t=# explain analyze select link_ids, (select max(r) from unnest(length) r) from t where link_ids = '{1,9999}';
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using t_pkey on t  (cost=0.29..3.56 rows=1 width=58) (actual time=0.030..0.031 rows=1 loops=1)
   Index Cond: (link_ids = '{1,9999}'::integer[])
   SubPlan 1
     ->  Aggregate  (cost=1.25..1.26 rows=1 width=4) (actual time=0.011..0.011 rows=1 loops=1)
           ->  Function Scan on unnest r  (cost=0.00..1.00 rows=100 width=4) (actual time=0.008..0.008 rows=2 loops=1)
 Total runtime: 0.060 ms
(6 rows)
Finally, make sure the results are the same:
t=# select link_ids, (select max(r) from unnest(length) r)
from t
where link_ids = '{1,9999}';
link_ids | max
----------+------
{1,9999} | 9999
(1 row)
t=# select link_ids,max(r)
from t, unnest(length) r
where link_ids = '{1,9999}'
group by 1;
link_ids | max
----------+------
{1,9999} | 9999
(1 row)
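The session above measures max(); applied to the original question (min per row), the subquery form could be sketched as follows. The VALUES list reuses the two sample rows from the question so the snippet runs without a table. Since there is no GROUP BY, this form also works when link_ids contains duplicates.

```sql
-- Sketch: correlated subquery, one aggregate per row, no GROUP BY needed.
SELECT t.link_ids
     , (SELECT min(r) FROM unnest(t.length) r) AS min_length
FROM  (VALUES ('{1,4}'::int[], '{1,2}'::int[])
            , ('{2,5}'::int[], '{0,1}'::int[])
      ) t(link_ids, length);
```

For the sample rows this yields 1 for {1,4} and 0 for {2,5}, matching the output requested in the question.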
For the min of an array:
SELECT min(x) FROM unnest(array_name) AS x;