Multiple joins on the same table multiply counts

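When I run multiple joins on the same table, the count from the first join seems to be the only one that comes through.

For example, I get results like this: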

ID, NAME, 200, 200
ID, NAME, 150, 150
ID, NAME, 100, 100

Obviously, the number of tickets should differ from the number of time entries.

select 
    contact.aid aid,
    (contact.data ->> 'FirstName') || ' ' || (contact.data ->> 'LastName') username,
    count(ticket) tickets,
    count(time) entries
from caches contact
inner join caches ticket
    on ticket.name = 'Ticket' and (ticket.data ->> 'CreatorResourceID')::numeric = contact.aid
inner join caches time
    on time.name = 'TimeEntry' and (time.data ->> 'TicketID')::numeric = ticket.aid
where 
    contact.name='Contact'
group by
    contact.aid,
    username
order by 
    tickets desc
;

I should be getting results like this:

ID, NAME, 200, 421
ID, NAME, 150, 312
ID, NAME, 100, 152

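My guess is that you are joining along two different dimensions and therefore getting the wrong results.

If so, you can use count(distinct). This is a guess, but perhaps: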

count(distinct ticket) as tickets,
count(distinct time) as entries
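To make the effect concrete, here is a minimal, self-contained sketch for PostgreSQL. The sample rows in the CTE are made up for illustration, and the alias entry stands in for time; only the column layout (aid, name, data) is taken from the question. After the two joins there is one row per time entry, so a plain count() of tickets really counts time entries, while count(DISTINCT ...) counts each ticket once.

```sql
-- Hypothetical sample data: one contact, two tickets, three time entries.
WITH caches(aid, name, data) AS (
   VALUES
     (1,   'Contact',   '{"FirstName": "Ann", "LastName": "Lee"}'::jsonb)
   , (10,  'Ticket',    '{"CreatorResourceID": 1}')
   , (11,  'Ticket',    '{"CreatorResourceID": 1}')
   , (100, 'TimeEntry', '{"TicketID": 10}')
   , (101, 'TimeEntry', '{"TicketID": 10}')
   , (102, 'TimeEntry', '{"TicketID": 11}')
   )
SELECT contact.aid
     , count(ticket.aid)          AS tickets_naive     -- 3: one per joined row
     , count(DISTINCT ticket.aid) AS tickets_distinct  -- 2: each ticket once
     , count(entry.aid)           AS entries           -- 3
FROM   caches AS contact
JOIN   caches AS ticket ON ticket.name = 'Ticket'
                       AND (ticket.data ->> 'CreatorResourceID')::int = contact.aid
JOIN   caches AS entry  ON entry.name = 'TimeEntry'
                       AND (entry.data ->> 'TicketID')::int = ticket.aid
WHERE  contact.name = 'Contact'
GROUP  BY contact.aid;
```

Note that with inner joins a ticket without any time entries drops out of the result entirely, so even the distinct count only covers tickets that have at least one entry.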

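Can you check whether this works for you? This is the approach I use when a single table contains different, related record types.

If the first case column raises an error, cast to whatever type aid is (probably int or bigint) instead of numeric.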


select case name
         when 'Contact' then aid
         when 'Ticket' then (data->>'CreatorResourceID')::numeric
         when 'TimeEntry' then (data->>'TicketID')::numeric
       end as aid,
       max (
         case 
           when name = 'Contact' 
             then concat(
                    data->>'FirstName',
                    ' ',
                    data->>'LastName'
                  )
           else null
         end
       ) as username, 
       count(*) filter (where name = 'Ticket') as tickets,
       count(*) filter (where name = 'TimeEntry') as entries
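  from caches  -- the table from the question; "contact" is only an alias there
 group by 1    -- the CASE expression; a bare "aid" would resolve to the input column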
 order by tickets desc;
        

The core problem is the same as here:

  • Two SQL LEFT JOINS produce incorrect result

Your case is further obscured by values nested in a jsonb column, but the principle is the same.

Aggregate first, join later:

SELECT contact.aid
     , concat_ws(' ', contact.data->>'FirstName', contact.data->>'LastName') AS username
     , sum(ticket.tickets) AS tickets
     , sum(ticket.entries) AS entries
FROM   caches AS contact
CROSS  JOIN LATERAL (
   SELECT count(*)::int AS tickets
        , sum(entry.entries)::int AS entries
   FROM   caches AS ticket
   CROSS  JOIN LATERAL (
      SELECT count(*)::int AS entries
      FROM   caches AS entry
      WHERE  entry.name = 'TimeEntry' 
      AND   (entry.data ->> 'TicketID')::numeric = ticket.aid
      ) AS entry  -- was: "time"
   WHERE  ticket.name = 'Ticket'
   AND   (ticket.data ->> 'CreatorResourceID')::numeric = contact.aid  -- numeric?
   ) AS ticket
WHERE  contact.name = 'Contact'
GROUP  BY contact.aid, username
ORDER  BY tickets DESC;  -- output column; ticket.tickets is not in GROUP BY

Assuming aid, or at least (aid, username), is unique in the base table, we don't need the outer aggregation at all:

SELECT contact.aid
     , concat_ws(' ', contact.data->>'FirstName', contact.data->>'LastName') AS username
     , ticket.tickets
     , ticket.entries
FROM   caches AS contact
CROSS  JOIN LATERAL (
   SELECT count(*)::int AS tickets
        , sum(entry.entries)::int AS entries
   FROM   caches AS ticket
   CROSS  JOIN LATERAL (
      SELECT count(*)::int AS entries
      FROM   caches AS entry
      WHERE  entry.name = 'TimeEntry' 
      AND   (entry.data ->> 'TicketID')::numeric = ticket.aid
      ) AS entry  -- was: "time"
   WHERE  ticket.name = 'Ticket'
   AND   (ticket.data ->> 'CreatorResourceID')::numeric = contact.aid  -- numeric?
   ) AS ticket
WHERE  contact.name = 'Contact'
ORDER  BY ticket.tickets DESC;

This not only avoids the primary error of multiplied counts, it is also typically faster.

Related:
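As a quick check of the aggregate-first query, here is a sketch that runs it against made-up rows. The data in the CTE is hypothetical and the casts use int instead of numeric; the query body otherwise follows the version without the outer aggregation.

```sql
-- Hypothetical sample data: Ann has three tickets (one without time entries),
-- Bob has no tickets at all.
WITH caches(aid, name, data) AS (
   VALUES
     (1,   'Contact',   '{"FirstName": "Ann", "LastName": "Lee"}'::jsonb)
   , (2,   'Contact',   '{"FirstName": "Bob", "LastName": "Ray"}')
   , (10,  'Ticket',    '{"CreatorResourceID": 1}')
   , (11,  'Ticket',    '{"CreatorResourceID": 1}')
   , (12,  'Ticket',    '{"CreatorResourceID": 1}')
   , (100, 'TimeEntry', '{"TicketID": 10}')
   , (101, 'TimeEntry', '{"TicketID": 10}')
   , (102, 'TimeEntry', '{"TicketID": 11}')
   )
SELECT contact.aid
     , ticket.tickets
     , ticket.entries
FROM   caches AS contact
CROSS  JOIN LATERAL (
   SELECT count(*)::int AS tickets            -- tickets per contact
        , sum(entry.entries)::int AS entries  -- entries summed over those tickets
   FROM   caches AS ticket
   CROSS  JOIN LATERAL (
      SELECT count(*)::int AS entries         -- entries per ticket, 0 if none
      FROM   caches AS entry
      WHERE  entry.name = 'TimeEntry'
      AND   (entry.data ->> 'TicketID')::int = ticket.aid
      ) AS entry
   WHERE  ticket.name = 'Ticket'
   AND   (ticket.data ->> 'CreatorResourceID')::int = contact.aid
   ) AS ticket
WHERE  contact.name = 'Contact'
ORDER  BY ticket.tickets DESC;
-- aid 1: tickets = 3, entries = 3
-- aid 2: tickets = 0, entries = NULL
```

The ticket without time entries is still counted, and the contact without tickets is still listed. Its entries value is NULL because sum() over zero rows yields NULL; wrap it in COALESCE(..., 0) if a zero is preferred.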

  • Multiple array_agg() calls in a single query

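Your original query uses INNER JOIN, which should probably be LEFT JOIN ... ON true to avoid excluding users without valid entries. In my solution it is safe to use CROSS JOIN instead, because every subquery level is guaranteed to return exactly one row (aggregate functions without GROUP BY).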
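The one-row guarantee can be checked in isolation. The following sketch uses two made-up inline tables (c and t are hypothetical names, not part of the question's schema): the lateral subquery aggregates without GROUP BY, so it returns a row even when nothing matches, and CROSS JOIN does not drop the outer row.

```sql
SELECT c.id, sub.n
FROM  (VALUES (1), (2)) AS c(id)              -- two "contacts"
CROSS  JOIN LATERAL (
   SELECT count(*)::int AS n                  -- aggregate, no GROUP BY: always one row
   FROM  (VALUES (1), (1)) AS t(contact_id)   -- two rows, both for id 1
   WHERE  t.contact_id = c.id
   ) AS sub
ORDER  BY c.id;
-- id 1: n = 2
-- id 2: n = 0  (no matching rows, but the row is still returned)
```

With a GROUP BY in the subquery, or without an aggregate, the subquery could return zero rows for id 2, and then LEFT JOIN LATERAL ... ON true would be needed to keep that row.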

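The cast to integer (::int) in the subqueries is optional (and assumes that counts never exceed the integer range). It avoids escalation to numeric, which is more expensive to sum.

Why concat_ws()? See: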
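The type escalation can be inspected with pg_typeof(). This is a small sketch on generated numbers, unrelated to the caches table: count() returns bigint, sum() over bigint returns numeric, and sum() over integer stays at bigint.

```sql
SELECT pg_typeof(count(*))       AS count_type     -- bigint
     , pg_typeof(sum(n::bigint)) AS sum_of_bigint  -- numeric
     , pg_typeof(sum(n::int))    AS sum_of_int     -- bigint
FROM   generate_series(1, 3) AS g(n);
```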

  • How to concatenate columns in a Postgres SELECT?

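Do data ->> 'TicketID' and data ->> 'CreatorResourceID' have to be numeric? They look like they should be integer.

Aside: normalizing your data model (at least to some degree) would probably help. Joining tables on values nested in a jsonb column is comparatively expensive and can typically be made more efficient.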