How to get columns which are not in GROUP BY?

Question

I have a PostgreSQL database with these two tables.

shipping_method:

id | name     | abbrev
---+----------+-------
1  | Standard | ST
2  | Express  | EX

shipping_details:

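id | shipping_method_id | estimated_time_min | estimated_time_max | price
---+--------------------+--------------------+--------------------+------
2  | 1                  | 02:00:00           | 04:00:00           | 230
3  | 2                  | 00:03:00           | 01:00:00           | 500
4  | 1                  | 02:00:00           | 04:00:00           | 1230
5  | 1                  | 02:00:00           | 04:00:00           | 850
6  | 2                  | 01:00:00           | 02:00:00           | 1785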

My goal is to fetch the most expensive shipping details per shipping method (for a specific product [not in OP]).

So far, I have written this query:

SELECT 
    sm.id, sm.name, MAX(sd.price) AS max_price
FROM 
    shipping_details AS sd 
LEFT JOIN 
    shipping_method AS sm ON sm.id = sd.shipping_method_id
GROUP BY 
    sm.id

Which returns:

id | name     | max_price
---+----------+---------
2  | Express  | 1785
1  | Standard | 1230

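With that query I can't get the shipping_details columns without putting them in the GROUP BY clause. What I mainly need is, for each shipping method, the shipping details row that has the highest price.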

How can I achieve that?

Answer 1

Use DISTINCT ON:

SELECT DISTINCT ON (sm.id) sm.id, sm.name, sd.price AS max_price
FROM shipping_details AS sd 
LEFT JOIN shipping_method AS sm
    ON sm.id = sd.shipping_method_id
ORDER BY sm.id, sd.price DESC;

The query above returns, for each shipping method, the row with the highest price.
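
If further columns from shipping_details are wanted, they can be added to the same SELECT list, because DISTINCT ON keeps the whole first row of each group. A minimal sketch using the column names from the question; the alias shipping_details_id and the trailing sd.id tiebreaker are additions made here for illustration, not part of the original answer:

```sql
-- One row per shipping method: the shipping_details row with the highest price.
SELECT DISTINCT ON (sm.id)
       sm.id, sm.name,
       sd.id AS shipping_details_id,          -- illustrative alias
       sd.estimated_time_min, sd.estimated_time_max,
       sd.price AS max_price
FROM   shipping_details AS sd
LEFT   JOIN shipping_method AS sm ON sm.id = sd.shipping_method_id
ORDER  BY sm.id, sd.price DESC, sd.id;        -- sd.id breaks ties between equal prices
```

With the sample data from the question, this should pick shipping_details row 4 for Standard (price 1230) and row 6 for Express (price 1785).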

Answer 2

Here is one way to do it using a window function:

select *
from shipping_method sm 
join (
      select *, row_number() over (partition by shipping_method_id order by price desc) rn  
      from shipping_details sd) t 
on sm.id = t.shipping_method_id
and rn = 1 ;

Answer 3

To get more columns from each row with the highest price in shipping_details, use DISTINCT ON:

SELECT sm.id, sm.name, sd.*
FROM   shipping_method sm
LEFT   JOIN (
   SELECT DISTINCT ON (shipping_method_id)
          shipping_method_id AS id, price AS max_price
      --  add more columns as you like
   FROM   shipping_details sd
   ORDER  BY sd.shipping_method_id DESC, sd.price DESC, sd.id  -- ①
   ) sd USING (id);

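As long as all rows of shipping_details are involved, it is typically fastest to aggregate first and join afterwards. (Not so when the table contains many additional rows that the join eliminates.)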

If price can be NULL, use ORDER BY ... price DESC NULLS LAST, otherwise NULL values sort on top in descending order. Be sure to match an existing index.
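
As a sketch of the NULL-safe variant, here is the inner query from above with NULLS LAST added. It assumes that price is nullable, which the question does not state:

```sql
-- Rows with a NULL price sort after all real prices, so they are only
-- picked when a shipping method has no non-NULL price at all.
SELECT DISTINCT ON (shipping_method_id)
       shipping_method_id AS id, price AS max_price
FROM   shipping_details
ORDER  BY shipping_method_id DESC, price DESC NULLS LAST, id;
```

An index meant to support this sort order would presumably need the same modifier, for example (shipping_method_id DESC, price DESC NULLS LAST).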

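① If the table shipping_details is big, an index on shipping_details (shipping_method_id, price) will make this fast. Or an index on (shipping_method_id DESC, price DESC). It is important that the sort order of both columns is in sync with the query. Postgres can scan an index forward or backward, but for multicolumn indexes the sort order of all columns needs to be in sync with the query. See: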

Optimizing queries on a range of timestamps (two columns)
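
The two index variants mentioned in ① could be created roughly as follows. The index names are made up for illustration, and either index alone should be enough:

```sql
-- Default (ascending) order on both columns; scanned backwards it yields
-- shipping_method_id DESC, price DESC.
CREATE INDEX shipping_details_method_price_idx
    ON shipping_details (shipping_method_id, price);

-- Both columns descending; a forward scan yields the same order.
CREATE INDEX shipping_details_method_price_desc_idx
    ON shipping_details (shipping_method_id DESC, price DESC);
```

Running EXPLAIN on the query shows whether the planner actually picks the index.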

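There is one more issue to be aware of: if there can be multiple shipping details with the highest price, you get an arbitrary pick, which can change with every execution, typically after writes to the rows involved. To get a stable, deterministic result, add more ORDER BY expressions as tiebreakers, like the sd.id I appended above. Then the smallest id is the winner, consistently.
If there are many such ties, it may even pay to add id to the index, like (shipping_method_id, price, id DESC). Note the opposite sort order for id!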
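
A sketch of such an index, again with a hypothetical name. Scanned backwards it yields shipping_method_id DESC, price DESC, id ASC, which is the sort order of the query above:

```sql
-- id is stored in descending order so that a backward scan returns it ascending.
CREATE INDEX shipping_details_method_price_id_idx
    ON shipping_details (shipping_method_id, price, id DESC);
```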
