PostgreSQL: filter groups by individual values
I have a query that returns data as shown below:
name | field | count_1 | count_2 |
-----|-------|---------|---------|
John | aaa | 3 | 3 |
John | bbb | 3 | 3 |
John | ccc | 3 | 3 |
John | ddd | 1 | 1 |
Dave | aaa | 3 | 3 |
Dave | bbb | 3 | 3 |
Dave | ccc | 3 | 3 |
Dave | ddd | 3 | 3 |
-----|-------|---------|---------|
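I need to filter this data based on count_1 and count_2 both being equal to 3. In the case above, John's counts on field ddd do not meet the condition, so regardless of the other conditions John meets, the query should return only Dave and his fields. How can I achieve this?
As soon as a person fails a single count on any given field, that person should be filtered out entirely.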
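To try the queries in the answers against this sample, the result can be loaded into a table first. This is only a sketch: the table name the_data is borrowed from one of the answers, and the column types are assumptions, since the original query and schema were not posted.

```sql
-- Assumed schema; the real data comes from a query that was not shown.
create temporary table the_data (
    name    text,
    field   text,
    count_1 integer,
    count_2 integer
);

-- Sample rows copied from the table in the question.
insert into the_data (name, field, count_1, count_2) values
    ('John', 'aaa', 3, 3),
    ('John', 'bbb', 3, 3),
    ('John', 'ccc', 3, 3),
    ('John', 'ddd', 1, 1),
    ('Dave', 'aaa', 3, 3),
    ('Dave', 'bbb', 3, 3),
    ('Dave', 'ccc', 3, 3),
    ('Dave', 'ddd', 3, 3);
```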
If I understand you correctly, NOT EXISTS might help you.
SELECT *
FROM (<your query>) x
WHERE NOT EXISTS (SELECT *
                  FROM (<your query>) y
                  WHERE y.name = x.name
                    AND (y.count_1 <> 3
                         OR y.count_2 <> 3));
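Replace <your query> with the query that produces the result you posted (or use a CTE for it, but be aware that this can cause performance problems in Postgres).
There may be a more elegant solution that "short cuts" into your query itself, but finding it would require more information about your schema and current query.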
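The CTE variant mentioned above might look like the following sketch. The inline VALUES list only stands in for <your query> so that the statement runs on its own, and the alias q is made up for illustration. As far as the performance caveat goes: before PostgreSQL 12 a CTE is always materialized, and from version 12 on a CTE that is referenced more than once (as it is here) is still materialized by default.

```sql
-- q stands in for the original query; the VALUES rows are the sample data.
WITH q (name, field, count_1, count_2) AS (
    VALUES ('John', 'aaa', 3, 3),
           ('John', 'bbb', 3, 3),
           ('John', 'ccc', 3, 3),
           ('John', 'ddd', 1, 1),
           ('Dave', 'aaa', 3, 3),
           ('Dave', 'bbb', 3, 3),
           ('Dave', 'ccc', 3, 3),
           ('Dave', 'ddd', 3, 3)
)
SELECT x.*
FROM q x
-- Keep a row only if no row for the same name violates either count.
WHERE NOT EXISTS (SELECT 1
                  FROM q y
                  WHERE y.name = x.name
                    AND (y.count_1 <> 3
                         OR y.count_2 <> 3));
```

With the sample data this should return only Dave's four rows, because John's ddd row disqualifies every row for John.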
I think you want:
with t as (
      <your query here>
     )
select t.*
from (select t.*,
             count(*) filter (where count_1 <> 3) over (partition by name) as cnt_1_3,
             count(*) filter (where count_2 <> 3) over (partition by name) as cnt_2_3
      from t
     ) t
where cnt_1_3 = 0 and cnt_2_3 = 0;
If you don't want the original rows, I would use aggregation:
select name
from t
group by name
having min(count_1) = max(count_1) and min(count_1) = 3 and
min(count_2) = max(count_2) and min(count_2) = 3;
Or you could phrase it as:
having sum( (count_1 <> 3)::int ) = 0 and
sum( (count_2 <> 3)::int ) = 0
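Note that all of the above assume the counts are not NULL (which seems reasonable for something called a count). If NULL values are possible, you can use a NULL-safe comparison (is distinct from).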
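As an illustration of the NULL-safe form, here is a sketch of the aggregation with is distinct from. The Anna row is hypothetical and was added only to show a missing count; the VALUES list again stands in for the real query. With a plain <> comparison the NULL row would evaluate to NULL and would not be counted as a violation by the filter, whereas is distinct from treats NULL as different from 3.

```sql
with t (name, field, count_1, count_2) as (
    values ('John', 'aaa', 3, 3),
           ('John', 'ddd', 1, 1),
           ('Dave', 'aaa', 3, 3),
           ('Dave', 'ddd', 3, 3),
           ('Anna', 'aaa', 3, null)   -- hypothetical row with a missing count
)
select name
from t
group by name
-- Count violating rows per name; NULL counts as a violation here.
having count(*) filter (where count_1 is distinct from 3) = 0
   and count(*) filter (where count_2 is distinct from 3) = 0;
```

On this sample only Dave should come back: John fails on ddd, and Anna fails because her NULL count_2 is distinct from 3.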
Use the boolean aggregate bool_and() in a having clause to get the names that satisfy the condition:
select name
from the_data
group by 1
having bool_and(count_1 = 3 and count_2 = 3)
name
------
Dave
(1 row)
You can use the above as a subquery to filter and return the original rows (if you need them):
select *
from the_data
where name in (
    select name
    from the_data
    group by 1
    having bool_and(count_1 = 3 and count_2 = 3)
    )
name | field | count_1 | count_2
------+-------+---------+---------
Dave | aaa | 3 | 3
Dave | bbb | 3 | 3
Dave | ccc | 3 | 3
Dave | ddd | 3 | 3
(4 rows)