Multiple DISTINCT ON clauses in PostgreSQL
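Is it possible to select rows that are DISTINCT ON several separate, independent sets of columns?
Suppose I want all rows that are:
- distinct on (name, birth)
- distinct on (name, height)
In the following table, the rows marked with a red cross would then not count as distinct (the failing clause is shown next to each):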
name     birth  height
--------------------------
William  1976   1.82
James    1981   1.68
Mike     1976   1.68
Tom      1967   1.79
William  1976   1.74    ❌ (name, birth)
William  1981   1.82    ❌ (name, height)
Tom      1978   1.92
Mike     1963   1.68    ❌ (name, height)
Tom      1971   1.86
James    1981   1.77    ❌ (name, birth)
Tom      1971   1.89    ❌ (name, birth)
In the example above, if the clause had simply been DISTINCT ON (name, birth, height), all rows would have been considered distinct.
Tried, without success:
SELECT DISTINCT ON (name, birth) (name, height) ...
SELECT DISTINCT ON (name, birth), (name, height) ...
SELECT DISTINCT ON ((name, birth), (name, height)) ...
SELECT DISTINCT ON (name, birth) AND (name, height) ...
SELECT DISTINCT ON (name, birth) AND ON (name, height) ...
SELECT DISTINCT ON (name, birth) DISTINCT ON (name, height) ...
SELECT DISTINCT ON (name, birth), DISTINCT ON (name, height) ...
Use a derived table:
with my_table(name, birth, height) as (
values
('William', 1976, 1.82),
('James', 1981, 1.68),
('Mike', 1976, 1.68),
('Tom', 1967, 1.79),
('William', 1976, 1.74), -- ❌ (name, birth)
('William', 1981, 1.82), -- ❌ (name, height)
('Tom', 1978, 1.92),
('Mike', 1963, 1.68), -- ❌ (name, height)
('Tom', 1971, 1.86),
('James', 1981, 1.77), -- ❌ (name, birth)
('Tom', 1971, 1.89) -- ❌ (name, birth)
)
select distinct on (name, height) *
from (
select distinct on (name, birth) *
from my_table
) s
name | birth | height
---------+-------+--------
James | 1981 | 1.68
Mike | 1963 | 1.68
Tom | 1967 | 1.79
Tom | 1971 | 1.89
Tom | 1978 | 1.92
William | 1976 | 1.82
(6 rows)
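There is ambiguity in the question: the number of result rows can differ from call to call. If you are satisfied with arbitrary results, the derived-table query above is good enough. Otherwise, you need to define the requirements more precisely. For example:
Distinct on (name, birth), picking the smallest height first, then the smallest ID as tiebreaker
Or:
Distinct on (name, height), picking the earliest birth first, then the smallest ID as tiebreaker
Your table should have a primary key (or some other way to identify rows uniquely):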
CREATE TEMP TABLE tbl (
tbl_id serial PRIMARY KEY
, name text
, birth int
, height numeric);
INSERT INTO tbl (name, birth, height)
VALUES
('William', 1976, 1.82)
, ('James', 1981, 1.68)
, ('Mike', 1976, 1.68)
, ('Tom', 1967, 1.79)
, ('William', 1976, 1.74)
, ('William', 1981, 1.82)
, ('Tom', 1978, 1.92)
, ('Mike', 1963, 1.68)
, ('Tom', 1971, 1.86)
, ('James', 1981, 1.77)
, ('Tom', 1971, 1.89);
Query:
SELECT DISTINCT ON (name, height) *
FROM (
SELECT DISTINCT ON (name, birth) *
FROM tbl
ORDER BY name, birth, height, tbl_id -- pick smallest height, ID as tiebreaker
) sub
ORDER BY name, height, birth, tbl_id; -- pick earliest birth, ID as tiebreaker
tbl_id | name | birth | height
--------+---------+-------+--------
2 | James | 1981 | 1.68
8 | Mike | 1963 | 1.68
4 | Tom | 1967 | 1.79
9 | Tom | 1971 | 1.86
7 | Tom | 1978 | 1.92
5 | William | 1976 | 1.74
6 | William | 1981 | 1.82
(7 rows) -- !!!
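A DISTINCT ON query without a deterministic ORDER BY can return any row from each set of duplicates. Applied once, you still get a deterministic number of rows (with arbitrary picks among them). Applied repeatedly, the number of resulting rows becomes arbitrary as well. Related: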
- Select first row in each GROUP BY group?
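To illustrate how the row count depends on which row the inner DISTINCT ON keeps, here is a minimal sketch that reuses the temp table tbl from above. The only change from the query above is an assumption made for illustration: the inner level keeps the largest height per (name, birth) instead of the smallest.

```sql
-- Sketch only: same nesting as the query above, but the inner level
-- keeps the LARGEST height per (name, birth) group.
SELECT DISTINCT ON (name, height) *
FROM  (
   SELECT DISTINCT ON (name, birth) *
   FROM   tbl
   ORDER  BY name, birth, height DESC, tbl_id  -- pick largest height, ID as tiebreaker
   ) sub
ORDER  BY name, height, birth, tbl_id;         -- pick earliest birth, ID as tiebreaker
```

With the sample data, this variant should return six rows (tbl_id 10, 8, 4, 11, 7, 1) rather than seven. Both William rows that survive the inner step now share the height 1.82, so the outer step folds them into one. Either ORDER BY makes the result reproducible. Which one is right depends on how the requirement is defined.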