Get the latest child messages and also parent messages that are childless
Here is the Message model:
class Message < ApplicationRecord
  belongs_to :parent_message, class_name: 'Message', optional: true
  has_many :child_messages, foreign_key: :parent_message_id, class_name: "Message"
  has_many :message_participants

  scope :latest_messages_by_participant, -> (user_id) do
    select("DISTINCT ON (parent_message_id) messages.*").
      joins(:message_participants).
      where(message_participants: { user_id: user_id }).
      order("parent_message_id, created_at DESC")
  end
end
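message_participants has a record for each message and each of the people who sent or received it. It has a user_id on it.
The problem with the latest_messages_by_participant scope above is that it gets all the child messages, BUT only the last parent message. That is because we call DISTINCT ON on parent_message_id, and for childless parent messages this value is NULL, so DISTINCT ON treats all of them as one NULL group and returns a single row (the last childless parent message).
How can I fetch all the latest messages in a single query, including the latest child messages and the latest childless parent messages?
I am using Rails 6 and Postgres 11.
P.S: I should also point out a secondary problem, which is that the messages are returned in created_at ASC order. The created_at DESC is able to get the latest child message but does not sort the overall collection. I can solve this by calling .reverse, but wondering if there was a way to fix that as well.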
I think you need to add a COALESCE to the DISTINCT ON and the ORDER BY, so that the message's own id is used when parent_message_id is NULL.
select("DISTINCT ON (parent_message_id) messages.*")
...
order("parent_message_id, created_at DESC")
needs to be changed to
select("DISTINCT ON (COALESCE(parent_message_id, messages.id)) messages.*")
...
order("COALESCE(parent_message_id, messages.id), created_at DESC")
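To see why the COALESCE key matters, here is a minimal plain-Ruby sketch that imitates DISTINCT ON in memory. It is not ActiveRecord code: Msg, distinct_on, MESSAGES and the integer created_at values are all made up for illustration, and nil stands in for NULL.

```ruby
# Hypothetical stand-in for a messages row; created_at is a plain integer here.
Msg = Struct.new(:id, :parent_message_id, :created_at)

# Keep the newest row per key, roughly what
# DISTINCT ON (key) ... ORDER BY key, created_at DESC does.
def distinct_on(rows, &key)
  rows.group_by(&key).map { |_key, group| group.max_by(&:created_at) }
end

MESSAGES = [
  Msg.new(1, nil, 10), # parent message with children
  Msg.new(2, 1, 20),
  Msg.new(3, 1, 30),
  Msg.new(4, nil, 40), # parent message without children
  Msg.new(5, nil, 50), # another parent message without children
]

# Original key: every parent shares the nil key, so only one parent survives.
by_parent = distinct_on(MESSAGES) { |m| m.parent_message_id }

# COALESCE(parent_message_id, id): each thread gets its own key.
by_thread = distinct_on(MESSAGES) { |m| m.parent_message_id || m.id }

p by_parent.map(&:id).sort # => [3, 5]
p by_thread.map(&:id).sort # => [3, 4, 5]
```

With the original key, message 4 is lost because it shares the nil group with messages 1 and 5. With the coalesced key, each childless parent forms its own group and is kept.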
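Now, you didn't provide sample database tables & expected results or the complete model definitions, so I am inferring a lot of things. Here are the minimal table definitions (as I understand them), the raw SQL query that AR would generate after my suggested modification [this is the query we want, given the schema below] & the results.
Setup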
CREATE TABLE messages (
id int primary key
, parent_message_id int references messages(id)
, created_at timestamp default current_timestamp
);
INSERT INTO messages (id, parent_message_id) values
(1, NULL) -- parent message with children
, (2, 1)
, (3, 1)
, (4, NULL) -- parent message without children
, (5, NULL) -- another parent message without children
;
CREATE TABLE message_participants (
user_id int
, message_id int references messages(id)
);
INSERT INTO message_participants values (1, 1), (2, 2), (3, 3), (1, 4), (2, 5);
Raw SQL query that gives us the last parent or child message:
SELECT DISTINCT ON (COALESCE(parent_message_id, messages.id)) messages.*
FROM messages
JOIN message_participants ON message_participants.message_id = messages.id
WHERE message_participants.user_id = ? -- replace by user_id
ORDER BY COALESCE(parent_message_id, messages.id), created_at DESC
Results
Given user_id = 1, the above query returns:
id | parent_message_id | created_at
----+-------------------+----------------------------
1 | | 2020-05-11 13:50:00.857589
4 | | 2020-05-11 13:50:00.857589
(2 rows)
Given user_id = 2, the above query returns:
id | parent_message_id | created_at
----+-------------------+----------------------------
2 | 1 | 2020-05-11 13:50:00.857589
5 | | 2020-05-11 13:52:01.261975
(2 rows)
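The two result sets can be cross-checked with a small plain-Ruby replay of the sample data. This is only a sketch: SAMPLE_MESSAGES, SAMPLE_PARTICIPANTS and latest_ids_for are hypothetical names, and the highest id in a group stands in for the newest created_at, which is an assumption (in this data no user has two messages in the same thread, so it never decides anything).

```ruby
# Hypothetical in-memory copies of the two sample tables.
SAMPLE_MESSAGES = { 1 => nil, 2 => 1, 3 => 1, 4 => nil, 5 => nil } # id => parent_message_id
SAMPLE_PARTICIPANTS = [[1, 1], [2, 2], [3, 3], [1, 4], [2, 5]]     # [user_id, message_id]

# Message ids the COALESCE query would return for one user, in thread-key order.
def latest_ids_for(user_id)
  # JOIN message_participants ... WHERE user_id = ?
  ids = SAMPLE_PARTICIPANTS.select { |uid, _mid| uid == user_id }.map { |_uid, mid| mid }

  ids.group_by { |id| SAMPLE_MESSAGES[id] || id } # COALESCE(parent_message_id, id)
     .sort_by { |key, _group| key }               # ORDER BY the same expression
     .map { |_key, group| group.max }             # assumption: highest id is the newest
end

p latest_ids_for(1) # => [1, 4]
p latest_ids_for(2) # => [2, 5]
```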
Sorting the overall results:
The created_at DESC is able to get the latest child message but does not sort the overall collection. I can solve this by calling .reverse, but wondering if there was a way to fix that as well.
To do the sorting in the database, you can wrap the above query in a CTE.
Example:
WITH last_messages AS (
SELECT DISTINCT ON (COALESCE(parent_message_id, messages.id)) messages.*
FROM messages
JOIN message_participants ON message_participants.message_id = messages.id
WHERE message_participants.user_id = 2
ORDER BY COALESCE(parent_message_id, messages.id), created_at DESC
)
SELECT * FROM last_messages ORDER BY created_at;
However, I'm not 100% sure how to express this in AR.
Use a COALESCE expression in DISTINCT ON and ORDER BY, and sort the result in an outer query to get the desired sort order:
SELECT *
FROM (
SELECT DISTINCT ON (COALESCE(m.parent_message_id, m.id))
m.*
FROM messages m
JOIN message_participants mp ON ...
WHERE mp.user_id = ...
ORDER BY COALESCE(m.parent_message_id, m.id), m.created_at DESC
) sub  -- Postgres 11 requires an alias for a subquery in FROM
ORDER BY created_at;
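The reason for the two levels is that the inner ORDER BY has to lead with the DISTINCT ON expression, so it cannot also define the final order. A plain-Ruby sketch of the same two steps follows. Row, newest_per_thread, sorted_latest, ROWS and the integer timestamps are hypothetical illustration names and values, not part of the original schema.

```ruby
# Hypothetical stand-in for a messages row; created_at is a plain integer here.
Row = Struct.new(:id, :parent_message_id, :created_at)

# Inner query: newest row per thread, returned in thread-key order.
def newest_per_thread(rows)
  rows.group_by { |r| r.parent_message_id || r.id } # COALESCE(parent_message_id, id)
      .sort_by { |key, _group| key }
      .map { |_key, group| group.max_by(&:created_at) }
end

# Outer query: re-sort the surviving rows, like ORDER BY created_at (ascending).
def sorted_latest(rows)
  newest_per_thread(rows).sort_by(&:created_at)
end

ROWS = [
  Row.new(1, nil, 10),
  Row.new(2, 1, 20),
  Row.new(3, 1, 50), # newest row overall, but it belongs to the first thread
  Row.new(4, nil, 40),
  Row.new(5, nil, 45),
]

p newest_per_thread(ROWS).map(&:id) # => [3, 4, 5]
p sorted_latest(ROWS).map(&:id)     # => [4, 5, 3]
```

Calling .reverse on the second result, or using ORDER BY created_at DESC in the outer query, would give newest-first order instead.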
See (with detailed explanation):
- SELECT DISTINCT ON, ordered by another column
- Select first row in each GROUP BY group?
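Performance?
For few rows per user and message ID, DISTINCT ON is typically among the fastest possible solutions. For many rows, there are (much) faster ways. It depends on more information, as noted in the comments.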