PostgreSQL subquery COUNT fails when the subquery is joined more than once

I have 2 tables:

Table class:

id serial4 PRIMARY KEY
name varchar(64)
code varchar(64)

Table class_event, where I store events related to classes, such as "started" and "ended":

id serial4
class_id int4 NOT NULL  // ->  FK to the class table
event_type varchar(1) NOT NULL  // -> 's' for started, 'e' for ended.

I need a query that returns how many times each class has been started and ended. This works:

select
    c.code,
    c.name,
    count(started.id) "started"
from "class" c
left join (select id, class_id, event_type from "class_event" where event_type = 's') started
    on started.class_id = c.id
group by c.code, c.name
order by started desc;

But when I do exactly the same thing to also get the number of times each class has ended, it shows incorrect counts:

select
    c.code,
    c.name,
    count(started.id) "started",
    count(ended.id) "ended"
from "class" c
left join (select id, class_id, event_type from "class_event" where event_type = 's') started
    on started.class_id = c.id
left join (select id, class_id, event_type from "class_event" where event_type = 'e') ended
    on ended.class_id = c.id
group by c.code, c.name
order by started desc;

Also, the query takes significantly longer to execute. Is there anything I'm missing?

You can try conditional aggregation: join class_event once and count each event type separately.

select
    c.code,
    c.name,
    count(CASE WHEN ce.event_type = 's' THEN ce.id END) "started",
    count(CASE WHEN ce.event_type = 'e' THEN ce.id END) "ended"
from "class" c
left join "class_event" ce
    on ce.class_id = c.id
group by c.code, c.name
order by started desc;

Is there anything I'm missing?

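Yes, multiple joins multiply rows: every 's' row for a class is combined with every 'e' row for the same class, so both counts get inflated. It's exactly the same problem as discussed here: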

  • Two SQL LEFT JOINS produce incorrect result
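
To see the multiplication concretely, here is a minimal sketch with made-up sample data. The temp tables and the 'Algebra' row are assumptions for illustration, not taken from the question; the temp tables shadow any existing tables of the same name for the current session only.

```sql
-- Hypothetical sample data: one class with 2 'started' and 3 'ended' events.
CREATE TEMP TABLE class (
   id   serial PRIMARY KEY
 , name varchar(64)
 , code varchar(64)
);

CREATE TEMP TABLE class_event (
   id         serial PRIMARY KEY
 , class_id   int NOT NULL REFERENCES class
 , event_type varchar(1) NOT NULL
);

INSERT INTO class (name, code) VALUES ('Algebra', 'ALG');  -- gets id = 1

INSERT INTO class_event (class_id, event_type)
VALUES (1, 's'), (1, 's'), (1, 'e'), (1, 'e'), (1, 'e');

-- Each of the 2 'started' rows pairs with each of the 3 'ended' rows,
-- so the join produces 2 x 3 = 6 rows for this class.
SELECT c.code
     , count(*)             AS joined_rows       -- 6
     , count(s.id)          AS started_naive     -- 6, not 2
     , count(e.id)          AS ended_naive       -- 6, not 3
     , count(DISTINCT s.id) AS started_distinct  -- 2
     , count(DISTINCT e.id) AS ended_distinct    -- 3
FROM   class c
LEFT   JOIN class_event s ON s.class_id = c.id AND s.event_type = 's'
LEFT   JOIN class_event e ON e.class_id = c.id AND e.event_type = 'e'
GROUP  BY c.code;
```

count(DISTINCT ...) repairs the numbers, but the join still has to build all the multiplied rows first, which is also why the two-join query runs so much slower.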

While you query the whole table, it's typically cleaner and faster to aggregate first and join later. See:

  • Query with LEFT JOIN not returning rows for count of 0

This also avoids the original problem on principle, even with multiple joins (which we don't need here).

SELECT * 
FROM   class c
LEFT   JOIN (
   SELECT class_id AS id
        , count(*) FILTER (WHERE event_type = 's') AS started
        , count(*) FILTER (WHERE event_type = 'e') AS ended
   FROM   class_event
   GROUP  BY 1
   ) e  USING (id)
ORDER  BY e.started DESC NULLS LAST;

NULLS LAST because it's conceivable that some of the classes have no related rows in table class_event (yet), and the resulting NULL values surely shouldn't sort on top. See:

  • Sort by column ASC, but NULL values first?
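
A quick way to check the sort behavior, as a sketch on throwaway values (the VALUES list and the alias t are made up for illustration):

```sql
-- Default for DESC in PostgreSQL is NULLS FIRST: NULL, 3, 1
SELECT started
FROM  (VALUES (3), (NULL), (1)) AS t(started)
ORDER  BY started DESC;

-- With NULLS LAST: 3, 1, NULL
SELECT started
FROM  (VALUES (3), (NULL), (1)) AS t(started)
ORDER  BY started DESC NULLS LAST;
```

If you would rather display 0 than NULL for classes without events, wrapping the counts in COALESCE(e.started, 0) in the outer SELECT list is one option.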

About the aggregate FILTER clause:
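
As a small sketch of what it does: FILTER (available since PostgreSQL 9.4) restricts which rows are fed to a single aggregate, and for count it gives the same result as the CASE form from the other answer. The inline VALUES list is made up for illustration.

```sql
-- Both spellings count only the rows matching the condition.
SELECT count(*) FILTER (WHERE event_type = 's')     AS started_filter  -- 2
     , count(CASE WHEN event_type = 's' THEN 1 END) AS started_case    -- 2
     , count(*) FILTER (WHERE event_type = 'e')     AS ended_filter    -- 1
     , count(CASE WHEN event_type = 'e' THEN 1 END) AS ended_case      -- 1
FROM  (VALUES ('s'), ('s'), ('e')) AS t(event_type);
```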

Aside:

For just a handful of allowed values, I would consider the data type "char" instead of varchar(1) for event_type. See:

  • Any downsides of using data type "text" for storing strings?
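
A rough sketch of what that could look like. The table name class_event_sketch and the CHECK constraint are my own additions for illustration, not part of the question's schema. Note that "char" (with the double quotes) is a separate single-byte type, not the same thing as char(1).

```sql
-- Hypothetical variant of class_event using the 1-byte "char" type.
CREATE TEMP TABLE class_event_sketch (
   id         serial PRIMARY KEY
 , class_id   int    NOT NULL
 , event_type "char" NOT NULL
     CHECK (event_type IN ('s', 'e'))  -- assumption: only these two values are allowed
);

INSERT INTO class_event_sketch (class_id, event_type)
VALUES (1, 's'), (1, 'e');
```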