Conditional count of rows where at least one peer qualifies

Background

I'm a novice SQL user. Using PostgreSQL 13 on a local Windows 10 machine, I have a table t:

+--+---------+-------+
|id|treatment|outcome|
+--+---------+-------+
|a |1        |0      |
|a |1        |1      |
|b |0        |1      |
|c |1        |0      |
|c |0        |1      |
|c |1        |1      |
+--+---------+-------+
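
To reproduce the table, a minimal setup sketch follows. The column types (text and integer) are an assumption, since the question does not state them:

```sql
-- Assumed column types; the question only shows the data.
CREATE TABLE t (
   id        text
 , treatment integer
 , outcome   integer
);

-- The six rows shown in the table above.
INSERT INTO t (id, treatment, outcome) VALUES
   ('a', 1, 0)
 , ('a', 1, 1)
 , ('b', 0, 1)
 , ('c', 1, 0)
 , ('c', 0, 1)
 , ('c', 1, 1);
```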

Question

I didn't explain myself well initially, so I have rewritten my goal.

Desired result:

+-----------------------+-----+
|ever treated           |count|
+-----------------------+-----+
|0                      |1    |
|1                      |3    |
+-----------------------+-----+

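First, identify the ids that have ever been treated. Being "ever treated" means having any row with treatment = 1.

Second, count the rows with outcome = 1 for each of those two groups. In my original table, the ids that are "ever treated" have a total of 3 rows with outcome = 1, and the "never treated" ids, so to speak, have 1 row with outcome = 1.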

What I've tried

I thought I could get most of the way there with something like this:

select treatment, count(outcome)
from t
group by treatment;

But that only gets me this result:

+---------+-----+
|treatment|count|
+---------+-----+
|0        |2    |
|1        |4    |
+---------+-----+

For the updated question:

SELECT ever_treated, sum(outcome_ct) AS count
FROM  (
   SELECT id
        , max(treatment) AS ever_treated
        , count(*) FILTER (WHERE outcome = 1) AS outcome_ct
   FROM   t
   GROUP  BY 1
   ) sub
GROUP  BY 1;

 ever_treated | count 
--------------+-------
            0 |     1
            1 |     3


Read as:

  • For those who received no treatment at all (all treatment = 0), we see 1 x outcome = 1
  • For those who received any treatment (at least one treatment = 1), we see 3 x outcome = 1

This would be simpler and faster with proper boolean values instead of integer.
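
As a sketch of that boolean variant: the CTE tb below is hypothetical and simply restates the question's rows with treatment and outcome as boolean values. Under that assumption, bool_or() can take the place of max(), and the FILTER condition no longer needs a comparison:

```sql
-- Hypothetical boolean version of the question's data (tb is not in the original).
WITH tb(id, treatment, outcome) AS (
   VALUES ('a', true , false)
        , ('a', true , true )
        , ('b', false, true )
        , ('c', true , false)
        , ('c', false, true )
        , ('c', true , true )
)
SELECT ever_treated, sum(outcome_ct) AS count
FROM  (
   SELECT id
        , bool_or(treatment)              AS ever_treated  -- true if any row of the id is treated
        , count(*) FILTER (WHERE outcome) AS outcome_ct    -- rows where outcome is true
   FROM   tb
   GROUP  BY id
   ) sub
GROUP  BY ever_treated;
```

With these rows, the false group sums to 1 and the true group sums to 3, matching the integer version.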

(Answer to the updated question)

Here is an easy-to-follow subquery logic that works with integers:

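    select subq.ever_treated, sum(subq.count) as count
    from (select id, max(treatment) as ever_treated,
                 -- count outcome = 1 rows here instead of filtering in WHERE,
                 -- so max(treatment) still sees every row of the id
                 count(case when outcome = 1 then 1 end) as count
          from t
          group by id) as subq
    group by subq.ever_treated;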
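
One thing to watch with this pattern: the ever_treated flag should be computed over all rows of an id, before any restriction to outcome = 1. The sketch below uses a hypothetical id d (not in the question's table) that is treated only on a row with outcome = 0, and compares the flag computed over all rows with the flag computed over the outcome = 1 rows alone:

```sql
-- Hypothetical rows for illustration only; id 'd' does not appear in the question.
WITH t2(id, treatment, outcome) AS (
   VALUES ('d', 1, 0)   -- treated, outcome = 0
        , ('d', 0, 1)   -- not treated, outcome = 1
)
SELECT id
     , max(treatment)                            AS ever_treated       -- 1: considers both rows
     , max(treatment) FILTER (WHERE outcome = 1) AS flag_after_filter  -- 0: considers only the second row
     , count(*)       FILTER (WHERE outcome = 1) AS outcome_ct         -- 1
FROM   t2
GROUP  BY id;
```

For the six rows in the question both flags agree, so the difference only shows up with data like the hypothetical id d.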