Aggregating Posts with all their Hashtags using Postgres

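I have something similar to Posts in FB/Instagram. When a user searches for something, the input query is used to find Posts whose content contains the query, or Posts where one or more hashtags match the query.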

CREATE TABLE posts (
    id int UNIQUE NOT NULL generated always as identity,
    content text,
    content_tokens tsvector,
    
    PRIMARY KEY (id)
);

CREATE TABLE tags (
    id int UNIQUE NOT NULL generated always as identity,
    title text,
    
    PRIMARY KEY (id)
);

CREATE TABLE post_tags(
    post_id int,
    tag_id   int,
    
    PRIMARY KEY (post_id, tag_id),
    FOREIGN KEY (post_id) references posts (id) ON DELETE CASCADE,
    FOREIGN KEY (tag_id) references tags (id) ON DELETE CASCADE
);


-- query to find all "relevant" posts :

SELECT posts.id, posts.content, tags.id as tag_id, tags.title as tag_title

FROM posts
JOIN post_tags
  ON posts.id = post_tags.post_id
JOIN tags 
  ON tags.id = post_tags.tag_id
  
WHERE tags.title ilike 'user_query'
   -- note: plainto_tsquery ignores the ':*' prefix marker; prefix matching needs to_tsquery
   OR posts.content_tokens @@ plainto_tsquery('user_query:*');

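Problem:

I don't know how to get Posts returned with a nested array of all their related tags, e.g. Post -> {post_id: 123 , tags : [{id: 1, title: a}, {id: 2, title : b}, ...]}

In addition, the query produces duplicate records (one row per post/tag pair), because I cannot GROUP BY a single column (post_id).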

Example:

The posts table has:

id    content 
------------------------
1     this is a test
2     this is a test 2
3     this is user_query

The tags table has:

id   title
---------------
1    user_query
2    something
3    hey

The post_tags table has:

post_id    tag_id 
------------------
1          1
1          2
2          2
3          2
3          3
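
For anyone who wants to reproduce this, the sample rows could be loaded roughly as follows. This is only a sketch: it assumes the tag name column is called `title` (as in the table above and in the query), and that `content_tokens` is filled with `to_tsvector` under the default text search configuration, which the question does not state.

```sql
-- Sample rows from the tables above.
-- Assumption: the tag name column is tags.title.
-- Assumption: content_tokens is built with to_tsvector() and the default config.
-- The id columns are GENERATED ALWAYS AS IDENTITY, so inserting explicit ids
-- requires OVERRIDING SYSTEM VALUE.
INSERT INTO posts (id, content, content_tokens)
OVERRIDING SYSTEM VALUE
VALUES (1, 'this is a test',     to_tsvector('this is a test')),
       (2, 'this is a test 2',   to_tsvector('this is a test 2')),
       (3, 'this is user_query', to_tsvector('this is user_query'));

INSERT INTO tags (id, title)
OVERRIDING SYSTEM VALUE
VALUES (1, 'user_query'),
       (2, 'something'),
       (3, 'hey');

INSERT INTO post_tags (post_id, tag_id)
VALUES (1, 1), (1, 2), (2, 2), (3, 2), (3, 3);
```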

So in this example, post_id = 1 has "user_query" in one of its tags, and post_id = 3 has "user_query" in its content. Therefore both should be returned.

[
  {"post_id" : 1, "tags" : [{"id" : 1, "title": "user_query"}, {"id" : 2, "title" : "something"}]}, 
  {"post_id" : 3, "tags" : [{"id" : 2, "title": "something"}, {"id" : 3, "title": "hey"}]}
]

Are you looking for JSON aggregation?

SELECT p.id, p.content, 
    JSONB_AGG(JSONB_BUILD_OBJECT('id', t.id, 'title', t.title)) as tags
FROM posts p
INNER JOIN post_tags pt ON p.id = pt.post_id
INNER JOIN tags t ON t.id = pt.tag_id
GROUP BY p.id
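
Grouping by `p.id` alone is allowed because it is the primary key, so the other `posts` columns are functionally dependent on it. The query above aggregates every post, though; to keep only the matching posts while still listing all of their tags, one option is to filter after grouping rather than in `WHERE`. A minimal sketch, assuming the tag name column is `tags.title` and with `'user_query'` standing in for the bound search term:

```sql
-- Sketch only. Assumptions: the tag name is stored in tags.title, and
-- 'user_query' is a placeholder for the search term.
SELECT p.id AS post_id,
       jsonb_agg(
           jsonb_build_object('id', t.id, 'title', t.title)
           ORDER BY t.id
       ) AS tags
FROM posts p
JOIN post_tags pt ON pt.post_id = p.id
JOIN tags t       ON t.id = pt.tag_id
GROUP BY p.id                                    -- primary key, one group per post
HAVING bool_or(t.title ILIKE 'user_query')       -- at least one tag of the post matches
    OR p.content_tokens @@ plainto_tsquery('user_query');  -- or the content matches
```

A `WHERE t.title ILIKE ...` filter would drop the non-matching tag rows before aggregation, so the `tags` array would contain only the matching tags. `HAVING` with `bool_or` tests the whole group and keeps every tag of a matching post. Because of the inner joins, a post with no tags at all is never returned, even if its content matches; a `LEFT JOIN` would be needed for that case.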