In a grouped SQL query, efficiently discard groups that have or do not have some values in a column

Question

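I'm trying to build a query that aggregates the records in a table while filtering groups based on whether certain values are present in one of the columns. Here is some sample data: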

CREATE TABLE test (
person_id smallint,
position_id smallint
);

INSERT INTO test 
VALUES (1, 30), (1, 99), (1, 98), (2, 98), (2, 99), (3, 30), (3, 28);

SELECT * FROM test;

+-----------+-------------+
| person_id | position_id |
+-----------+-------------+
|         1 |          30 |
|         1 |          99 |
|         1 |          98 |
|         2 |          98 |
|         2 |          99 |
|         3 |          30 |
|         3 |          28 |
+-----------+-------------+

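I want to aggregate this by person_id, but only for people who have position 30 and do not have position 28 (for example). The correct query result would be: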

+-----------+------------+
| person_id | positions  |
+-----------+------------+
|         1 | 30, 99, 98 |
+-----------+------------+

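The question is how to do this efficiently. The actual table I'll be running this on is much larger.

I have two working queries that return the correct result: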

SELECT person_id, Group_concat(position_id SEPARATOR ', ') AS positions
FROM test
GROUP BY person_id 
HAVING Sum(CASE WHEN position_id = 30 THEN 1 ELSE 0 END) > 0
AND Sum(CASE WHEN position_id = 28 THEN 1 ELSE 0 END) = 0;

SELECT person_id, Group_concat(position_id SEPARATOR ', ') AS positions
FROM test
GROUP BY person_id 
HAVING Max(position_id = 30) = 1
AND Max(position_id = 28) = 0;

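However, it seems to me that it shouldn't be necessary to perform a full aggregation over each group (with Sum() or Max()) as these queries do, and that it would be more efficient to rephrase this with a logical 'any' condition. For example:

the first time a position_id of 30 is encountered, the first condition is satisfied;
the first time a position_id of 28 is encountered, the group fails the second condition;

after which there is no need to go through the rest of that group's position_id values. However, I'm not sure how to do this, and perhaps I'm on the wrong track altogether.

This is using MySQL 8.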

Answer 1

You could try using subqueries to determine whether the person meets your constraints.

SELECT person_id, Group_concat(position_id SEPARATOR ', ') AS positions
FROM test
WHERE 28 NOT IN (
  SELECT position_id
  FROM test AS ti
  WHERE ti.person_id = test.person_id
) AND 30 IN (
  SELECT position_id
  FROM test AS ti
  WHERE ti.person_id = test.person_id
)
GROUP BY person_id;

However, until you analyze the query execution plan, any performance improvement is only a guess.
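One way to check the plan rather than guess is sketched below. It assumes MySQL 8.0.18 or later, where EXPLAIN ANALYZE is available, and it uses the first HAVING query from the question as the statement to inspect; any of the other variants can be substituted.

```sql
-- Assumes MySQL 8.0.18+. EXPLAIN ANALYZE actually executes the query and
-- reports estimated versus actual row counts and timings for each plan step.
EXPLAIN ANALYZE
SELECT person_id, GROUP_CONCAT(position_id SEPARATOR ', ') AS positions
FROM test
GROUP BY person_id
HAVING SUM(CASE WHEN position_id = 30 THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN position_id = 28 THEN 1 ELSE 0 END) = 0;
```

Running the same statement against each candidate query on realistically sized data makes it possible to compare the plans directly. Because EXPLAIN ANALYZE runs the query, it takes as long as the query itself.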

Answer 2

You can try EXISTS and NOT EXISTS:

SELECT person_id, Group_concat(position_id SEPARATOR ', ') AS positions
FROM test t
WHERE EXISTS (
  SELECT person_id
  FROM test t1
  WHERE t.person_id = t1.person_id
  AND t1.position_id = 30
) AND NOT EXISTS (
  SELECT person_id
  FROM test t2
  WHERE t.person_id = t2.person_id
  AND t2.position_id = 28
)
GROUP BY person_id;

Result:

person_id positions
    1     30, 99, 98
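Whether the EXISTS / NOT EXISTS form can stop at the first matching row, as the question hopes, depends largely on indexing. The sketch below assumes the table has no suitable index yet; the index name idx_test_person_position is made up for illustration.

```sql
-- Hypothetical index name. The column order (person_id, position_id) is
-- deliberate: each subquery matches on person_id and then on position_id,
-- so a single index lookup can answer "does this person have position N?".
CREATE INDEX idx_test_person_position ON test (person_id, position_id);
```

With such an index in place, each existence check can in principle be answered by an index lookup instead of a scan of the person's rows. The optimizer may still choose a different strategy, so the execution plan should be checked to confirm what it actually does.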

