Speed up a date search using the most recent date of subrecords (PostgreSQL)

I want to find old jobs that have no recent activity.

The tables are as follows:

CREATE TABLE job
    (jobid int4, jobname text, jobdate date);    
INSERT INTO job
    (jobid, jobname, jobdate)
VALUES
    (1,'X','2016-12-31'),
    (2,'Y','2016-12-31'),
    (3,'Z','2016-12-31');

CREATE TABLE hr
   (hrid int4, hrjob int4, hrdate date);
INSERT INTO hr
   (hrid, hrjob, hrdate)
VALUES
    (1,1,'2017-05-30'),
    (2,1,'2016-12-31'),
    (3,2,'2016-12-31'),
    (4,3,'2016-12-31'),
    (5,4,'2017-12-31');

CREATE TABLE po
    (poid int, pojob int4, podate date);
INSERT INTO po
    (poid, pojob, podate)
VALUES
    (1,1,'2016-05-30'),
    (2,1,'2016-12-31'),
    (3,2,'2016-12-31'),
    (4,3,'2016-12-31'),
    (5,4,'2017-12-31');

I found a solution that works for a small number of records, but it takes a long time once there are thousands of records:

SELECT    jobid 
FROM      job 
LEFT JOIN hr ON hrjob=jobid 
LEFT JOIN po ON poid=jobid
WHERE     jobdate <'2017-12-31'
GROUP BY  jobid
HAVING    greatest(max(hrdate),max(podate))<'2017-12-31' 
ORDER BY  jobid

Is there any way to simplify and speed up this query?

In this case, all jobs except job 4 can be closed, i.e. they have no recent activity.

SQLFiddle: http://sqlfiddle.com/#!15/098c3/1

Execution plan:

GroupAggregate (cost=311.82..1199.60 rows=67 width=12)
  Filter: (GREATEST(max(hr.hrdate), max(po.podate)) < '2017-12-31'::date)
  -> Merge Left Join (cost=311.82..925.66 rows=36414 width=12)
       Merge Cond: (job.jobid = po.poid)
       -> Merge Left Join (cost=176.48..234.72 rows=3754 width=8)
            Merge Cond: (job.jobid = hr.hrjob)
            -> Sort (cost=41.13..42.10 rows=387 width=4)
                 Sort Key: job.jobid
                 -> Seq Scan on job (cost=0.00..24.50 rows=387 width=4)
                      Filter: (jobdate < '2017-12-31'::date)
            -> Sort (cost=135.34..140.19 rows=1940 width=8)
                 Sort Key: hr.hrjob
                 -> Seq Scan on hr (cost=0.00..29.40 rows=1940 width=8)
       -> Sort (cost=135.34..140.19 rows=1940 width=8)
            Sort Key: po.poid
            -> Seq Scan on po (cost=0.00..29.40 rows=1940 width=8)

EXPLAIN ANALYZE output:

  Output: job.jobid
  Filter: (GREATEST(max(hr.hrdate), max(po.podate)) < '2017-12-31'::date)
  -> Merge Left Join (cost=311.82..925.66 rows=36414 width=12) (actual time=0.032..0.039 rows=4 loops=1)
       Output: job.jobid, hr.hrdate, po.podate
       Merge Cond: (job.jobid = po.poid)
       -> Merge Left Join (cost=176.48..234.72 rows=3754 width=8) (actual time=0.024..0.028 rows=4 loops=1)
            Output: job.jobid, hr.hrdate
            Merge Cond: (job.jobid = hr.hrjob)
            -> Sort (cost=41.13..42.10 rows=387 width=4) (actual time=0.014..0.015 rows=3 loops=1)
                 Output: job.jobid
                 Sort Key: job.jobid
                 Sort Method: quicksort Memory: 25kB
                 -> Seq Scan on public.job (cost=0.00..24.50 rows=387 width=4) (actual time=0.006..0.007 rows=3 loops=1)
                      Output: job.jobid
                      Filter: (job.jobdate < '2017-12-31'::date)
            -> Sort (cost=135.34..140.19 rows=1940 width=8) (actual time=0.008..0.009 rows=5 loops=1)
                 Output: hr.hrdate, hr.hrjob
                 Sort Key: hr.hrjob
                 Sort Method: quicksort Memory: 25kB
                 -> Seq Scan on public.hr (cost=0.00..29.40 rows=1940 width=8) (actual time=0.001..0.002 rows=5 loops=1)
                      Output: hr.hrdate, hr.hrjob
       -> Sort (cost=135.34..140.19 rows=1940 width=8) (actual time=0.007..0.007 rows=5 loops=1)
            Output: po.podate, po.poid
            Sort Key: po.poid
            Sort Method: quicksort Memory: 25kB
            -> Seq Scan on public.po (cost=0.00..29.40 rows=1940 width=8) (actual time=0.001..0.003 rows=5 loops=1)
                 Output: po.podate, po.poid
Total runtime: 0.148 ms

Thanks in advance.

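One thing that could save you a lot of processing is to add a field to job that marks the job as closed. Jobs already flagged as closed can then be skipped in later searches, which saves a lot of query work.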
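A minimal sketch of that idea, assuming a hypothetical boolean column named closed (it is not part of the schema in the question) and assuming that pojob is the column linking po to job. The index name is made up as well, and whether the partial index pays off depends on how many jobs stay open.

```sql
-- Hypothetical column: "closed" does not exist in the original schema.
ALTER TABLE job ADD COLUMN closed boolean NOT NULL DEFAULT false;

-- Mark jobs that have no hr or po activity on or after the cutoff date.
UPDATE job
SET    closed = true
WHERE  NOT closed
       AND jobdate < '2017-12-31'
       AND NOT EXISTS (SELECT 1
                       FROM   hr
                       WHERE  hr.hrjob = job.jobid
                              AND hr.hrdate >= '2017-12-31')
       AND NOT EXISTS (SELECT 1
                       FROM   po
                       WHERE  po.pojob = job.jobid  -- assumption: pojob references job.jobid
                              AND po.podate >= '2017-12-31');

-- Optional partial index (hypothetical name) covering only the open jobs.
CREATE INDEX job_open_idx ON job (jobid) WHERE NOT closed;

-- Later searches only need to consider jobs that are still open.
SELECT jobid
FROM   job
WHERE  NOT closed
ORDER  BY jobid;
```

One caveat with this design: if a new hr or po row can arrive for a job that is already flagged, the flag has to be reset by the application or by a trigger, otherwise it goes stale.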

Instead of JOIN and GROUP BY, you can find the old jobs like this:

SELECT jobid 
FROM   job 
WHERE  jobdate < '2017-12-31' 
       AND NOT EXISTS (SELECT 1 
                       FROM   hr 
                       WHERE  hr.hrjob = job.jobid 
                              AND hrdate >= '2017-12-31') 
       AND NOT EXISTS (SELECT 1 
                       FROM   po 
                       WHERE  po.pojob = job.jobid  -- pojob is the job reference; poid is po's own id
                              AND podate >= '2017-12-31') 
ORDER  BY jobid 

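I think this can speed up your query, because each NOT EXISTS check can stop at the first recent row it finds instead of joining and aggregating every hr and po row.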
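The plans in the question show only sequential scans and sorts on hr and po, so the NOT EXISTS lookups would probably benefit from indexes on the lookup columns. A rough sketch, assuming no such indexes exist yet; the index names are made up, and pojob is assumed to be the column that references job.jobid:

```sql
-- Hypothetical index names. Each index leads with the job reference and
-- includes the date, so a NOT EXISTS probe can be an index lookup.
CREATE INDEX hr_hrjob_hrdate_idx ON hr (hrjob, hrdate);
CREATE INDEX po_pojob_podate_idx ON po (pojob, podate);

-- Refresh planner statistics after creating the indexes.
ANALYZE hr;
ANALYZE po;

-- Compare this plan and its timing with the plan of the original query.
EXPLAIN ANALYZE
SELECT jobid
FROM   job
WHERE  jobdate < '2017-12-31'
       AND NOT EXISTS (SELECT 1
                       FROM   hr
                       WHERE  hr.hrjob = job.jobid
                              AND hr.hrdate >= '2017-12-31')
       AND NOT EXISTS (SELECT 1
                       FROM   po
                       WHERE  po.pojob = job.jobid  -- assumption: pojob references job.jobid
                              AND po.podate >= '2017-12-31')
ORDER  BY jobid;
```

Whether the planner actually uses these indexes depends on table sizes and data distribution, so it is worth checking the EXPLAIN ANALYZE output on realistic data rather than on the five-row sample.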