Speed up a date search based on the most recent subrecord date (PostgreSQL)
I want to find old jobs that have no recent hr or po activity.
The tables look like this:
CREATE TABLE job
(jobid int4, jobname text, jobdate date);
INSERT INTO job
(jobid, jobname, jobdate)
VALUES
(1,'X','2016-12-31'),
(2,'Y','2016-12-31'),
(3,'Z','2016-12-31');
CREATE TABLE hr
(hrid int4, hrjob int4, hrdate date);
INSERT INTO hr
(hrid, hrjob, hrdate)
VALUES
(1,1,'2017-05-30'),
(2,1,'2016-12-31'),
(3,2,'2016-12-31'),
(4,3,'2016-12-31'),
(5,4,'2017-12-31');
CREATE TABLE po
(poid int, pojob int4, podate date);
INSERT INTO po
(poid, pojob, podate)
VALUES
(1,1,'2016-05-30'),
(2,1,'2016-12-31'),
(3,2,'2016-12-31'),
(4,3,'2016-12-31'),
(5,4,'2017-12-31');
I found a solution that works with a few records, but it takes a very long time with thousands of records:
SELECT jobid
FROM job
LEFT JOIN hr ON hrjob=jobid
LEFT JOIN po ON poid=jobid
WHERE jobdate <'2017-12-31'
GROUP BY jobid
HAVING greatest(max(hrdate),max(podate))<'2017-12-31'
ORDER BY jobid
Is there any way to simplify and speed up this query?
In this case, all jobs except job 4 can be closed, since they have no recent job activity.
SQLFiddle: http://sqlfiddle.com/#!15/098c3/1
Execution plan:
GroupAggregate (cost=311.82..1199.60 rows=67 width=12)
  Filter: (GREATEST(max(hr.hrdate), max(po.podate)) < '2017-12-31'::date)
  -> Merge Left Join (cost=311.82..925.66 rows=36414 width=12)
       Merge Cond: (job.jobid = po.poid)
       -> Merge Left Join (cost=176.48..234.72 rows=3754 width=8)
            Merge Cond: (job.jobid = hr.hrjob)
            -> Sort (cost=41.13..42.10 rows=387 width=4)
                 Sort Key: job.jobid
                 -> Seq Scan on job (cost=0.00..24.50 rows=387 width=4)
                      Filter: (jobdate < '2017-12-31'::date)
            -> Sort (cost=135.34..140.19 rows=1940 width=8)
                 Sort Key: hr.hrjob
                 -> Seq Scan on hr (cost=0.00..29.40 rows=1940 width=8)
       -> Sort (cost=135.34..140.19 rows=1940 width=8)
            Sort Key: po.poid
            -> Seq Scan on po (cost=0.00..29.40 rows=1940 width=8)
EXPLAIN ANALYZE output:
Output: job.jobid
Filter: (GREATEST(max(hr.hrdate), max(po.podate)) < '2017-12-31'::date)
-> Merge Left Join (cost=311.82..925.66 rows=36414 width=12) (actual time=0.032..0.039 rows=4 loops=1)
     Output: job.jobid, hr.hrdate, po.podate
     Merge Cond: (job.jobid = po.poid)
     -> Merge Left Join (cost=176.48..234.72 rows=3754 width=8) (actual time=0.024..0.028 rows=4 loops=1)
          Output: job.jobid, hr.hrdate
          Merge Cond: (job.jobid = hr.hrjob)
          -> Sort (cost=41.13..42.10 rows=387 width=4) (actual time=0.014..0.015 rows=3 loops=1)
               Output: job.jobid
               Sort Key: job.jobid
               Sort Method: quicksort Memory: 25kB
               -> Seq Scan on public.job (cost=0.00..24.50 rows=387 width=4) (actual time=0.006..0.007 rows=3 loops=1)
                    Output: job.jobid
                    Filter: (job.jobdate < '2017-12-31'::date)
          -> Sort (cost=135.34..140.19 rows=1940 width=8) (actual time=0.008..0.009 rows=5 loops=1)
               Output: hr.hrdate, hr.hrjob
               Sort Key: hr.hrjob
               Sort Method: quicksort Memory: 25kB
               -> Seq Scan on public.hr (cost=0.00..29.40 rows=1940 width=8) (actual time=0.001..0.002 rows=5 loops=1)
                    Output: hr.hrdate, hr.hrjob
     -> Sort (cost=135.34..140.19 rows=1940 width=8) (actual time=0.007..0.007 rows=5 loops=1)
          Output: po.podate, po.poid
          Sort Key: po.poid
          Sort Method: quicksort Memory: 25kB
          -> Seq Scan on public.po (cost=0.00..29.40 rows=1940 width=8) (actual time=0.001..0.003 rows=5 loops=1)
               Output: po.podate, po.poid
Total runtime: 0.148 ms
Thanks in advance.
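Something that could save you a lot of processing is adding a field to job that indicates the job is closed. Closed jobs can then be skipped without looking at hr or po at all, which saves a lot of query work!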
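A minimal sketch of what that could look like, assuming a hypothetical boolean column named closed (it is not part of the original schema) and assuming po rows point at their job through pojob, the same way hr rows do through hrjob:

```sql
-- Hypothetical column "closed"; existing rows start out as open.
ALTER TABLE job ADD COLUMN closed boolean NOT NULL DEFAULT false;

-- Run occasionally: mark old jobs that have no hr or po activity
-- on or after the cutoff date as closed.
UPDATE job
SET closed = true
WHERE NOT closed
  AND jobdate < '2017-12-31'
  AND NOT EXISTS (SELECT 1
                  FROM hr
                  WHERE hr.hrjob = job.jobid
                    AND hr.hrdate >= '2017-12-31')
  AND NOT EXISTS (SELECT 1
                  FROM po
                  WHERE po.pojob = job.jobid   -- assumption: pojob is the job reference
                    AND po.podate >= '2017-12-31');

-- Later searches only need to consider jobs that are still open.
SELECT jobid
FROM job
WHERE NOT closed
ORDER BY jobid;
```

The flag only pays off if it is kept up to date, for example by re-running the UPDATE on a schedule; how to do that depends on your setup.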
Without JOIN and GROUP BY, you can find the old jobs like this:
SELECT jobid
FROM job
WHERE jobdate < '2017-12-31'
AND NOT EXISTS (SELECT 1
FROM hr
WHERE hr.hrjob = job.jobid
AND hrdate >= '2017-12-31')
AND NOT EXISTS (SELECT 1
FROM po
-- match on pojob (the job reference, like hr.hrjob), not on po's own key poid
WHERE po.pojob = job.jobid
AND podate >= '2017-12-31')
ORDER BY jobid
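I think it can speed up your query, because each NOT EXISTS check only has to find one recent row per job instead of joining and aggregating every hr and po row. Note that, unlike the GROUP BY version, it also returns jobs that have no hr or po rows at all.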
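Both plans in the question show sequential scans on hr and po, so the NOT EXISTS lookups may benefit from indexes on the job reference and the date. The following is only a sketch under assumptions: the index names are made up, and po is assumed to reference its job through pojob. Whether the planner actually uses the indexes depends on your data, so compare the plans before and after.

```sql
-- Hypothetical index names; each index covers the columns
-- used inside one NOT EXISTS subquery.
CREATE INDEX hr_hrjob_hrdate_idx ON hr (hrjob, hrdate);
CREATE INDEX po_pojob_podate_idx ON po (pojob, podate);

-- Refresh planner statistics after creating the indexes.
ANALYZE hr;
ANALYZE po;

-- Check the resulting plan and timing on your real data.
EXPLAIN ANALYZE
SELECT jobid
FROM job
WHERE jobdate < '2017-12-31'
  AND NOT EXISTS (SELECT 1
                  FROM hr
                  WHERE hr.hrjob = job.jobid
                    AND hr.hrdate >= '2017-12-31')
  AND NOT EXISTS (SELECT 1
                  FROM po
                  WHERE po.pojob = job.jobid
                    AND po.podate >= '2017-12-31')
ORDER BY jobid;
```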