Find rows that have the same value in one column and other values in another column?
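I have a PostgreSQL database that stores users in a users table and the conversations they take part in in a conversation table. Since each user can take part in multiple conversations and each conversation can involve multiple users, I have a conversation_user linking table to track which users participate in each conversation: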
# conversation_user
 id | conversation_id | user_id
----+-----------------+---------
  1 |               1 |      32
  2 |               1 |       3
  3 |               2 |      32
  4 |               2 |       3
  5 |               2 |       4
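For anyone who wants to reproduce this, a minimal setup might look like the following. The column types are an assumption (the post does not show the table definition), and the foreign keys to users and conversation are left out for brevity.

```sql
-- Assumed schema: integer columns and a serial id are guesses, not from the post.
CREATE TABLE conversation_user (
    id              serial PRIMARY KEY,
    conversation_id integer NOT NULL,
    user_id         integer NOT NULL
);

-- The five sample rows shown in the table above.
INSERT INTO conversation_user (conversation_id, user_id) VALUES
    (1, 32),
    (1, 3),
    (2, 32),
    (2, 3),
    (2, 4);
```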
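In the table above, user 32 has one conversation with just user 3 and another with both user 3 and user 4. How would I write a query that shows there is a conversation between just user 32 and user 3?
I have tried the following: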
SELECT conversation_id AS cid,
       user_id
FROM conversation_user
GROUP BY cid HAVING count(*) = 2
   AND (user_id = 32
    OR user_id = 3);

SELECT conversation_id AS cid,
       user_id
FROM conversation_user
GROUP BY (cid HAVING count(*) = 2
   AND (user_id = 32
    OR user_id = 3));

SELECT conversation_id AS cid,
       user_id
FROM conversation_user
WHERE (user_id = 32)
   OR (user_id = 3)
GROUP BY cid HAVING count(*) = 2;
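These queries throw an error saying that user_id must appear in the GROUP BY clause or be used in an aggregate function. Putting it in an aggregate function (e.g. MIN or MAX) doesn't sound appropriate. I thought my first two attempts did put it in the GROUP BY clause.
What am I doing wrong?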
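This is a case of relational division. We have assembled an arsenal of techniques under this related question:
- How to filter SQL results in a has-many-through relation
The special difficulty here is excluding additional users. There are basically 4 techniques for that:
- Select rows which are not present in other table
I suggest LEFT JOIN / IS NULL: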
SELECT cu1.conversation_id
FROM   conversation_user cu1
JOIN   conversation_user cu2 USING (conversation_id)
LEFT   JOIN conversation_user cu3 ON cu3.conversation_id = cu1.conversation_id
                                 AND cu3.user_id NOT IN (3,32)
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND    cu3.conversation_id IS NULL;
Or NOT EXISTS:
SELECT cu1.conversation_id
FROM   conversation_user cu1
JOIN   conversation_user cu2 USING (conversation_id)
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND    NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3,32)
   );
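Neither query depends on a UNIQUE constraint on (conversation_id, user_id), which may or may not be in place. That is, the queries still work if user_id 32 (or 3) is listed more than once for the same conversation. You would get duplicate rows in the result, though, and would need to apply DISTINCT or GROUP BY.
The only condition is the one you formulated: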
... a query that would show that there is a conversation between just user 32 and user 3?
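If duplicate (conversation_id, user_id) rows are possible, the duplicate result rows mentioned above can be collapsed with DISTINCT. A minimal sketch, reusing the NOT EXISTS variant and assuming the conversation_user table from the question:

```sql
-- NOT EXISTS variant with DISTINCT added, so each matching
-- conversation is reported once even if a participant is listed twice.
SELECT DISTINCT cu1.conversation_id
FROM   conversation_user cu1
JOIN   conversation_user cu2 USING (conversation_id)
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND    NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3, 32)   -- any third participant disqualifies
   );
```

With the five sample rows from the question, this should return only conversation 1, because conversation 2 also includes user 4.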
Audited query
The query you linked in the comment wouldn't work. You forgot to exclude other participants. It should be something like:
SELECT *  -- or whatever you want to return
FROM   conversation_user cu1
WHERE  cu1.user_id = 32
AND    EXISTS (
   SELECT 1
   FROM   conversation_user cu2
   WHERE  cu2.conversation_id = cu1.conversation_id
   AND    cu2.user_id = 3
   )
AND    NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3,32)
   );
This is similar to the other two queries, except that it does not return multiple rows if user_id = 3 is linked multiple times.
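You can use conditional aggregation to select every conversation_id that has exactly the 2 specified participants:
select conversation_id
from conversation_user
group by conversation_id
having count(*) = 2
   and count(case when user_id not in (32,3) then 1 end) = 0
If (conversation_id, user_id) is not unique, replace having count(*) = 2 with having count(distinct user_id) = 2.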
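Written out in full, the variant for a table without a unique (conversation_id, user_id) pair might look like the sketch below. It uses the column name conversation_id from the question's table and is only one way to phrase it.

```sql
-- Conditional aggregation that tolerates duplicate participant rows:
-- exactly two distinct users, and none of them outside (32, 3).
SELECT conversation_id
FROM   conversation_user
GROUP  BY conversation_id
HAVING count(DISTINCT user_id) = 2
AND    count(CASE WHEN user_id NOT IN (32, 3) THEN 1 END) = 0;
```

The CASE expression yields NULL for users 32 and 3, and count() ignores NULLs, so the second condition holds only when no other participant is present.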
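Since you only want conversations with exactly 2 users, you can use an outer self-join against the other users and filter out the hits.
To find all 2-user conversations and who they are between:
SELECT
    a.conversation_id cid,
    a.user_id user_id_1,
    b.user_id user_id_2
FROM conversation_user a
JOIN conversation_user b ON b.conversation_id = a.conversation_id
    AND b.user_id > a.user_id
LEFT JOIN conversation_user c ON c.conversation_id = a.conversation_id
    AND c.user_id NOT IN (a.user_id, b.user_id)
WHERE c.conversation_id IS NULL -- only return misses on join to others
To find all 2-user conversations for a particular user, add a filter that checks both user columns. Because of the b.user_id > a.user_id condition, the user can end up on either side of the pair:
AND 32 IN (a.user_id, b.user_id)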
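Because the join keeps only pairs with b.user_id > a.user_id, the user of interest can land in either user_id_1 or user_id_2, depending on the other participant's id. A sketch of the complete query for user 32, assuming the column is named conversation_id as in the question's table:

```sql
-- All conversations that have exactly two participants, one of whom is user 32.
SELECT a.conversation_id AS cid,
       a.user_id         AS user_id_1,
       b.user_id         AS user_id_2
FROM   conversation_user a
JOIN   conversation_user b ON  b.conversation_id = a.conversation_id
                           AND b.user_id > a.user_id        -- each pair once
LEFT   JOIN conversation_user c ON  c.conversation_id = a.conversation_id
                                AND c.user_id NOT IN (a.user_id, b.user_id)
WHERE  c.conversation_id IS NULL          -- no third participant found
AND    32 IN (a.user_id, b.user_id);      -- user 32 on either side of the pair
```

With the sample rows from the question, this should return the single row (1, 3, 32). Filtering on a.user_id = 32 alone would miss it, since user 3 has the smaller id and therefore ends up in column a.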
If you just want confirmation:
select conversation_id
from conversation_user
group by conversation_id
having bool_and(user_id in (3,32))
   and count(*) = 2;
If you want the full details, you can use window functions and a CTE, like this:
with a as (
    select *
        ,bool_and(user_id in (3,32))
            over (partition by conversation_id)
         and 2 = count(user_id)
            over (partition by conversation_id)
         as conv_candidates
    from conversation_user
)
select * from a where conv_candidates;