How to fix a materialized view that counts across multiple joins?
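I am using a materialized view in PostgreSQL to optimize a query, but the query logic does not work in the mat view.
I want to optimize a query that involves multiple joins and takes a long time to run, so I tried the same query as a materialized view in PostgreSQL. In the mat view, however, the query logic goes wrong.
I created this mat view in PostgreSQL 11.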
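The code below uses three key tables:
1. posts
2. topics
3. post_topics
The post_topics table holds post_id and topic_id. The topics table is the list of topics (each topic has several associated values; for example, if the topic is egg, the values associated with egg are 'breakfast', 'dinner', 'cheese', etc.), and the posts table holds the posts related to the topics. Each topic may have many posts.
I want the count of the associated values from the topics table: when the topic id is egg, how many times do breakfast, dinner, and cheese occur? The topics table holds 8000 rows, and breakfast, dinner, and cheese are themselves in that list. If I enter cheese, the counts for egg, breakfast, cheese should come back. I have this working in a normal query, but I am struggling to get the same logic in a mat view.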
Original query:
SELECT t.id as topic_id, t.value as value, t.topic as topic, COUNT(c.topic_id) as count
FROM post_topics a
JOIN posts b ON a.post_id = b.id
JOIN post_topics c on a.post_id = c.post_id
--JOIN post_locations pl ON pl.post_id = c.post_id
JOIN topics t on c.topic_id = t.id --AND t.topic = 'cuisine'
WHERE a.topic_id = '1234547hnunsdrinfs'
AND t.id != '1234547hnunsdrinfs'
AND b.date_posted BETWEEN ('2019-06-17'::date - interval '6 month') AND '2019-06-17'::date
GROUP BY t.id, c.topic_id
ORDER BY count DESC
LIMIT 20
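To make the intended counts concrete, here is a minimal sketch that runs the same self-join on toy data. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, so the date arithmetic is written with SQLite's date() function instead of interval; the sample rows, the readable topic ids, and the helper name co_occurring_topics are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id TEXT PRIMARY KEY, value TEXT, topic TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, date_posted TEXT);
CREATE TABLE post_topics (post_id INTEGER, topic_id TEXT);

INSERT INTO topics VALUES
  ('egg', 'egg', 'ingredient'),
  ('breakfast', 'breakfast', 'meal'),
  ('cheese', 'cheese', 'ingredient'),
  ('dinner', 'dinner', 'meal');

-- Post 5 falls outside the six-month window and must not be counted.
INSERT INTO posts VALUES
  (1, '2019-05-01'), (2, '2019-05-02'), (3, '2019-05-03'),
  (4, '2019-05-04'), (5, '2018-01-01');

INSERT INTO post_topics VALUES
  (1, 'egg'), (1, 'breakfast'), (1, 'cheese'),
  (2, 'egg'), (2, 'breakfast'),
  (3, 'egg'), (3, 'dinner'),
  (4, 'cheese'), (4, 'dinner'),
  (5, 'egg'), (5, 'cheese');
""")


def co_occurring_topics(anchor_topic, as_of):
    """Count topics sharing a post with anchor_topic in the six months up to as_of."""
    sql = """
        SELECT t.id, COUNT(*) AS cnt
        FROM post_topics a                            -- rows tagged with the anchor topic
        JOIN posts b ON a.post_id = b.id
        JOIN post_topics c ON a.post_id = c.post_id   -- every topic on the same post
        JOIN topics t ON c.topic_id = t.id
        WHERE a.topic_id = :anchor
          AND t.id <> :anchor                         -- drop the anchor itself
          AND b.date_posted BETWEEN date(:as_of, '-6 months') AND :as_of
        GROUP BY t.id
        ORDER BY cnt DESC, t.id
        LIMIT 20
    """
    return conn.execute(sql, {"anchor": anchor_topic, "as_of": as_of}).fetchall()


egg_counts = co_occurring_topics("egg", "2019-06-17")
cheese_counts = co_occurring_topics("cheese", "2019-06-17")
print(egg_counts)     # [('breakfast', 2), ('cheese', 1), ('dinner', 1)]
print(cheese_counts)  # [('breakfast', 1), ('dinner', 1), ('egg', 1)]
```

Each call gives the per-topic answer that a materialized view would need to hold for every topic at once.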
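I edited the mat view query to use a list of topic IDs (the topics table has 8000 values). The original query names only one topic ID, but I need results for all 8000. With the mat view below I get the top topics for each ID, but the counts do not match.
Edited mat view: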
Create materialized view top_associations_mv AS
SELECT a.topic_id, t.id as topic_id, t.value as value, t.topic as topic, COUNT(c.topic_id) as count
FROM post_topics a
JOIN posts b ON a.post_id = b.id
JOIN post_topics c on a.post_id = c.post_id
JOIN topics t on c.topic_id = t.id
WHERE a.topic_id in ('c108200f-e4dc-415e-9150-3f6c74b879e2', '107f8cad-75b3-43fb-9b2f-f7914bf45155') -- here all 8000 topic id should be placed
--AND t.id <> ( 'c108200f-e4dc-415e-9150-3f6c74b879e2', '107f8cad-75b3-43fb-9b2f-f7914bf45155')--,'c348af9d-dd98-49f1-b6c2-8ea36b404ffa')
and (b.date_posted > (('now'::text)::date - '6 mons'::interval))
GROUP BY t.id, c.topic_id,a.topic_id
ORDER BY count DESC limit 10 ;
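One thing worth checking in a query of this shape is that ORDER BY ... LIMIT applies to the whole result, not to each main topic. The sketch below contrasts a global limit with a per-topic top-N using ROW_NUMBER(), which PostgreSQL 11 also supports. It runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer); the toy rows and the column names main_topic and other_topic are assumptions, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post_topics (post_id INTEGER, topic_id TEXT);
INSERT INTO post_topics VALUES
  (1, 'egg'), (1, 'breakfast'), (1, 'cheese'),
  (2, 'egg'), (2, 'breakfast'),
  (3, 'egg'), (3, 'dinner'),
  (4, 'cheese'), (4, 'dinner');
""")

# All (main topic, co-occurring topic) pairs with their counts.
PAIRS = """
    SELECT a.topic_id AS main_topic, c.topic_id AS other_topic, COUNT(*) AS cnt
    FROM post_topics a
    JOIN post_topics c ON a.post_id = c.post_id AND a.topic_id <> c.topic_id
    GROUP BY a.topic_id, c.topic_id
"""

# A global LIMIT keeps only the largest rows overall.
global_top = conn.execute(
    PAIRS + " ORDER BY cnt DESC, main_topic, other_topic LIMIT 2"
).fetchall()

# ROW_NUMBER() restarts for every main topic, so each one keeps its own top rows.
per_topic_top = conn.execute(
    "WITH pairs AS (" + PAIRS + """),
    ranked AS (
        SELECT main_topic, other_topic, cnt,
               ROW_NUMBER() OVER (PARTITION BY main_topic
                                  ORDER BY cnt DESC, other_topic) AS rn
        FROM pairs
    )
    SELECT main_topic, other_topic, cnt
    FROM ranked
    WHERE rn <= 1
    ORDER BY main_topic
""").fetchall()

print(global_top)     # only 2 rows in total; 'cheese' and 'dinner' never appear as main topic
print(per_topic_top)  # one row for each of the 4 main topics
```

With 8000 topics, a global limit 10 would return ten rows for the entire view, whereas the ranked form returns up to N rows per topic.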
Materialized view:
Create MATERIALIZED VIEW top_associations_mv as
SELECT t.id as topic_id, t.value as value, t.topic as topic, COUNT(c.topic_id) as count
FROM post_topics a
JOIN posts b ON a.post_id = b.id
JOIN post_topics c on a.post_id = c.post_id
JOIN topics t on c.topic_id = t.id
WHERE t.id != c.post_id and (b.date_posted > (('now'::text)::date - '6mons'::interval))
GROUP BY t.id, c.topic_id
ORDER BY count DESC
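A likely reason the counts in this version look wrong is the predicate t.id != c.post_id: it compares a topic id with a post id, so it removes nothing, and grouping by the co-occurring topic alone merges all main topics into one total. The sketch below illustrates this on toy data with Python's sqlite3 module standing in for PostgreSQL; the string ids ('p1', 'egg', and so on) are assumptions chosen so that both columns have the same type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post_topics (post_id TEXT, topic_id TEXT);
INSERT INTO post_topics VALUES
  ('p1', 'egg'), ('p1', 'breakfast'), ('p1', 'cheese'),
  ('p2', 'egg'), ('p2', 'breakfast'),
  ('p3', 'egg'), ('p3', 'dinner'),
  ('p4', 'cheese'), ('p4', 'dinner');
""")

# Same self-join as the view, grouped by the co-occurring topic only.
BASE = """
    SELECT c.topic_id, COUNT(*) AS cnt
    FROM post_topics a
    JOIN post_topics c ON a.post_id = c.post_id
    {where}
    GROUP BY c.topic_id
    ORDER BY c.topic_id
"""

unfiltered = conn.execute(BASE.format(where="")).fetchall()

# Topic id vs. post id: the two id spaces never collide, so no row is dropped.
topic_vs_post = conn.execute(
    BASE.format(where="WHERE c.topic_id <> c.post_id")).fetchall()

# Topic id vs. topic id: this is what actually removes the self-pairs.
topic_vs_topic = conn.execute(
    BASE.format(where="WHERE c.topic_id <> a.topic_id")).fetchall()

print(unfiltered)      # [('breakfast', 5), ('cheese', 5), ('dinner', 4), ('egg', 7)]
print(topic_vs_post)   # identical to unfiltered
print(topic_vs_topic)  # [('breakfast', 3), ('cheese', 3), ('dinner', 2), ('egg', 4)]
```

Even the last result is still a total over every main topic; getting "egg with breakfast" as its own number also requires the main topic (a.topic_id) in the SELECT list and the GROUP BY.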
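My expected result:
I want the counts from the topics table for the associated values: when the topic id is egg, how many times do breakfast, dinner, and cheese occur?
But the actual counts are wrong.
I really need help!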
You should use subqueries in the joins, and each subquery should also include a DISTINCT ON clause.
You can use this:
create materialized view top_associations_mv as
select
    t.id as topic_id,
    t.value as value,
    t.topic as topic,
    count(c.topic_id) as count
from post_topics a
join (
    select distinct on (id) id, date_posted
    from posts
) as b
    on a.post_id = b.id
    and b.date_posted > (('now'::text)::date - '6mons'::interval)
join (
    -- post_id must be selected so the join condition below can reference it;
    -- note that distinct on (post_id) keeps only one topic_id per post
    select distinct on (post_id) post_id, topic_id
    from post_topics
) as c
    on a.post_id = c.post_id
join (
    -- value and topic must be selected because the outer query reads them
    select distinct on (id) id, value, topic
    from topics
) as t
    on c.topic_id = t.id
    and t.id != c.post_id
group by t.id, t.value, t.topic, c.topic_id
order by count desc
I fixed the query. It just needed one more join to the topics table.
SELECT a.topic_id as main_topic, t.id as topicid, t.value as value, t.topic as topic, COUNT(c.topic_id) as count
FROM post_topics a
JOIN posts b ON a.post_id = b.id
JOIN post_topics c on a.post_id = c.post_id
JOIN topics t on c.topic_id = t.id
JOIN topics t2 on t2.id = a.topic_id and (t.id) <> (t2.id) -- added join: excludes the main topic itself
WHERE (b.date_posted > (('now'::text)::date - '6 mons'::interval))
AND LOWER(b.source) = 'instagram'
GROUP BY t.id, c.topic_id,a.topic_id
ORDER BY c.topic_id,count DESC;
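As a rough check of this shape, the sketch below builds the all-pairs result once and then looks up a single main topic from it. It uses Python's sqlite3 module, with CREATE TABLE ... AS standing in for CREATE MATERIALIZED VIEW and date('now', '-6 months') standing in for the interval expression. The table name top_associations, the helper top_for, and the sample rows are assumptions for illustration.

```python
import sqlite3
from datetime import date, timedelta

# Dates are relative to today so the six-month filter behaves the same on any run.
recent = (date.today() - timedelta(days=30)).isoformat()
old = (date.today() - timedelta(days=400)).isoformat()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id TEXT PRIMARY KEY, value TEXT, topic TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, date_posted TEXT, source TEXT);
CREATE TABLE post_topics (post_id INTEGER, topic_id TEXT);
INSERT INTO topics VALUES
  ('egg', 'egg', 'ingredient'), ('breakfast', 'breakfast', 'meal'),
  ('cheese', 'cheese', 'ingredient'), ('dinner', 'dinner', 'meal');
INSERT INTO post_topics VALUES
  (1, 'egg'), (1, 'breakfast'), (1, 'cheese'),
  (2, 'egg'), (2, 'breakfast'),
  (3, 'egg'), (3, 'dinner'),
  (4, 'egg'), (4, 'cheese');
""")
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", [
    (1, recent, "instagram"),
    (2, recent, "Instagram"),  # matched through LOWER(b.source)
    (3, recent, "twitter"),    # filtered out by source
    (4, old, "instagram"),     # filtered out by date
])

# Stand-in for the materialized view: one row per (main topic, co-occurring topic).
conn.execute("""
CREATE TABLE top_associations AS
SELECT a.topic_id AS main_topic, t.id AS topic_id, t.value AS value,
       t.topic AS topic, COUNT(*) AS cnt
FROM post_topics a
JOIN posts b ON a.post_id = b.id
JOIN post_topics c ON a.post_id = c.post_id
JOIN topics t ON c.topic_id = t.id
JOIN topics t2 ON t2.id = a.topic_id AND t.id <> t2.id
WHERE b.date_posted > date('now', '-6 months')
  AND LOWER(b.source) = 'instagram'
GROUP BY a.topic_id, t.id
""")


def top_for(main_topic, limit=20):
    """Read the precomputed counts for one main topic."""
    return conn.execute(
        "SELECT topic_id, cnt FROM top_associations "
        "WHERE main_topic = ? ORDER BY cnt DESC, topic_id LIMIT ?",
        (main_topic, limit),
    ).fetchall()


egg_top = top_for("egg")
dinner_top = top_for("dinner")
print(egg_top)     # [('breakfast', 2), ('cheese', 1)]
print(dinner_top)  # []
```

In PostgreSQL the stored rows stay fixed until REFRESH MATERIALIZED VIEW is run, so the six-month window only moves forward on each refresh.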