SQL count distinct with a condition based on a different column

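I have a dataset containing a list of exams, the qualifications/units they are associated with, and whether each exam was passed or failed. The data looks something like this: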

candidate | qualification | unit | exam | exam_status
-----------------------------------------------------
 C1       | Q1            | U1   | E1   | Passed
 C1       | Q1            | U2   | E2   | NULL
 C1       | Q1            | U2   | E3   | Passed
 C1       | Q1            | U3   | E4   | Passed
 C1       | Q1            | U3   | E5   | Passed

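From this, I need to be able to calculate the total number of units for each qualification, and how many of those units the candidate has passed.

In theory there should be one exam per unit (although there may be multiple records if the candidate failed the exam the first time), so I should be able to use the following query to get the data I need: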

select
  candidate,
  qualification,
  count(distinct unit),
  count(
    case when exam_status = 'Passed' then 1 else null end
  )
from example_table
group by candidate, qualification

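However, for whatever reason, some candidates have passed the same exam more than once, which means my count of passed units is sometimes higher than the total number of units.

I would like to do something like this: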

count(distinct exam case when exam_status = 'Passed' then 1 else null end)

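to select only the unique exams that have been passed, but I have not been able to work out how.

Does anyone know how I can achieve this? Thanks in advance.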

You want a distinct count of exams, so I think that is:

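select candidate, qualification,
       count(distinct unit) as total_units,
       -- the case expression is NULL for rows that are not passed, and count(distinct ...) ignores NULLs
       count(distinct case when exam_status = 'Passed' then exam end) as passed_exams
from example_table
group by candidate, qualification;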
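
As a runnable check of the difference between counting passed rows, distinct passed exams, and distinct passed units, here is a minimal sketch using Python's built-in sqlite3 module. The sixth row (a repeat pass of E1) is a hypothetical addition, not part of the question's data, and `passed_units` is a variant that is not in the query above: it applies the same count(distinct case ...) pattern to `unit`, on the assumption that what is ultimately wanted is units passed rather than exams passed.

```python
import sqlite3

# Sample rows from the question, plus one hypothetical repeat pass of E1
# (an assumption, added only to reproduce the over-count).
rows = [
    ("C1", "Q1", "U1", "E1", "Passed"),
    ("C1", "Q1", "U2", "E2", None),
    ("C1", "Q1", "U2", "E3", "Passed"),
    ("C1", "Q1", "U3", "E4", "Passed"),
    ("C1", "Q1", "U3", "E5", "Passed"),
    ("C1", "Q1", "U1", "E1", "Passed"),  # hypothetical duplicate pass
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table example_table "
    "(candidate text, qualification text, unit text, exam text, exam_status text)"
)
conn.executemany("insert into example_table values (?, ?, ?, ?, ?)", rows)

query = """
select candidate, qualification,
       count(distinct unit) as total_units,
       -- naive count: every passed row, repeats included
       count(case when exam_status = 'Passed' then 1 end) as passed_rows,
       -- each passed exam counted once
       count(distinct case when exam_status = 'Passed' then exam end) as passed_exams,
       -- each unit with at least one passed exam counted once
       count(distinct case when exam_status = 'Passed' then unit end) as passed_units
from example_table
group by candidate, qualification
"""
result = conn.execute(query).fetchall()
print(result)
```

With these rows the single C1/Q1 result has 3 total units, 5 passed rows, 4 distinct passed exams and 3 distinct passed units, so only the unit-based count stays within the unit total.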

If you want to sum the units for the passed exams, this gets trickier. I would recommend window functions:

select candidate, qualification,
       count(distinct unit) as distinct_units,
       -- sum() assumes unit holds a numeric value; labels such as 'U1' cannot be summed
       sum(case when exam_status = 'Passed' then unit end) as total_units,
       count(distinct case when exam_status = 'Passed' then exam end) as passed_exams
from (select et.*,
             -- number the rows of each exam, passed attempts first
             row_number() over (partition by candidate, qualification, exam
                                order by (case when exam_status = 'Passed' then 1 else 2 end)
                               ) as seqnum
      from example_table et
     ) et
where seqnum = 1  -- keep one row per exam, so a repeated pass is counted once
group by candidate, qualification;
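
If the goal is to count units passed rather than to sum a numeric column, a variation on the same idea partitions by unit instead of exam, keeping one row per unit with a passed attempt preferred. This is a sketch under assumptions, not the query above: the U4/E6 'Failed' row is hypothetical, and row_number() needs SQLite 3.25 or later when run through Python's sqlite3 module.

```python
import sqlite3

# Sample rows from the question, plus one hypothetical failed unit (U4)
# so that total and passed counts differ.
rows = [
    ("C1", "Q1", "U1", "E1", "Passed"),
    ("C1", "Q1", "U2", "E2", None),
    ("C1", "Q1", "U2", "E3", "Passed"),
    ("C1", "Q1", "U3", "E4", "Passed"),
    ("C1", "Q1", "U3", "E5", "Passed"),
    ("C1", "Q1", "U4", "E6", "Failed"),  # hypothetical row
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table example_table "
    "(candidate text, qualification text, unit text, exam text, exam_status text)"
)
conn.executemany("insert into example_table values (?, ?, ?, ?, ?)", rows)

query = """
select candidate, qualification,
       count(*) as total_units,
       sum(case when exam_status = 'Passed' then 1 else 0 end) as passed_units
from (select et.*,
             -- one numbering per unit; a passed row, if any, gets seqnum = 1
             row_number() over (partition by candidate, qualification, unit
                                order by (case when exam_status = 'Passed' then 1 else 2 end)
                               ) as seqnum
      from example_table et
     ) et
where seqnum = 1
group by candidate, qualification
"""
result = conn.execute(query).fetchall()
print(result)
```

Because the subquery keeps exactly one row per unit, count(*) gives the unit total and the sum gives the number of units with at least one pass (4 and 3 for these rows), so the passed count cannot exceed the total.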