如何 select 一个日期范围 * 和 * 该范围之前和之后的条目?
How to select a date range *and* the entries immediately before and after that range?
我正在使用 table,其中每一行都有一个时间戳,并且该时间戳对于一组给定的其他列值是唯一的:
CREATE TEMPORARY TABLE time_series (
id SERIAL PRIMARY KEY,
created TIMESTAMP WITH TIME ZONE NOT NULL,
category TEXT,
value INT
);
CREATE UNIQUE INDEX ON time_series (created, category);
INSERT INTO time_series (created, category, value)
VALUES ('2000-01-01 00:00:00Z', 'foo', 1),
('2000-01-01 06:00:00Z', 'bar', 5),
('2000-01-01 12:00:00Z', 'bar', 5),
('2000-01-02 00:00:00Z', 'bar', 5),
('2000-01-02 12:34:45Z', 'bar', 2),
('2000-01-03 00:00:00Z', 'bar', 3),
('2000-01-04 00:00:00Z', 'bar', 3),
('2000-01-04 11:11:11Z', 'foo', 4),
('2000-01-04 22:22:22Z', 'bar', 5),
('2000-01-04 23:23:23Z', 'bar', 4),
('2000-01-05 00:00:00Z', 'foo', 1),
('2000-01-05 23:23:23Z', 'bar', 4);
时间戳不是均匀间隔的。给定一个任意的开始和结束日期时间,我的任务是获取这些日期时间 和该范围之前和之后的条目之间的条目。 基本上,我如何简化此查询:
(SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created < '2000-01-02 06:00:00Z'
ORDER BY created DESC
LIMIT 1)
UNION
(SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-02 06:00:00Z'
AND created < '2000-01-04 12:00:00Z')
UNION
(SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-04 12:00:00Z'
ORDER BY created
LIMIT 1)
ORDER BY created;
created value
2000-01-02 00:00:00+00 5
2000-01-02 12:34:45+00 2
2000-01-03 00:00:00+00 3
2000-01-04 00:00:00+00 3
2000-01-04 22:22:22+00 5
用例是获取数据点以显示图表:我知道图表左右边缘的日期时间,但它们通常不会与 created
日期时间完全对齐,所以在为了将图形一直显示到边缘,我需要一个数据点到范围的任一侧。
非解决方案:
- 我可以不能简单地select整个范围,因为它可能很大。
- 我不能 select 超出给定范围的任意长周期,因为该数据集可能再次很大,或者我 select 的任何周期都可能不足以获得下一个读数。
已编辑:
您可以将 UNION ALL
与 ORDER BY
和 LIMIT
以及一些子句边界结合起来。
像这样:
方法 1:
SELECT created,
value
FROM (SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created < '2000-01-02 06:00:00Z'
ORDER BY created ASC LIMIT 1
) AS ub
UNION ALL
SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-02 06:00:00Z'
AND created < '2000-01-04 12:00:00Z'
UNION ALL
SELECT created,
value
FROM (SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-04 12:00:00Z'
ORDER BY created DESC LIMIT 1
) AS lb
ORDER BY 1;
EXPLAIN ANALYZE
来自方法 1:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=3.60..3.61 rows=3 width=12) (actual time=0.228..0.237 rows=5 loops=1)
Sort Key: time_series.created
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=3.55..3.58 rows=3 width=12) (actual time=0.182..0.195 rows=5 loops=1)
Group Key: time_series.created, time_series.value
-> Append (cost=1.16..3.53 rows=3 width=12) (actual time=0.073..0.163 rows=5 loops=1)
-> Limit (cost=1.16..1.16 rows=1 width=12) (actual time=0.070..0.073 rows=1 loops=1)
-> Sort (cost=1.16..1.16 rows=1 width=12) (actual time=0.065..0.067 rows=1 loops=1)
Sort Key: time_series.created DESC
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series (cost=0.00..1.15 rows=1 width=12) (actual time=0.026..0.035 rows=2 loops=1)
Filter: ((created < '2000-01-02 06:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 8
-> Seq Scan on time_series time_series_1 (cost=0.00..1.18 rows=1 width=12) (actual time=0.007..0.016 rows=3 loops=1)
Filter: ((created >= '2000-01-02 06:00:00+00'::timestamp with time zone) AND (created < '2000-01-04 12:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 7
-> Limit (cost=1.16..1.16 rows=1 width=12) (actual time=0.051..0.054 rows=1 loops=1)
-> Sort (cost=1.16..1.16 rows=1 width=12) (actual time=0.047..0.049 rows=1 loops=1)
Sort Key: time_series_2.created
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series time_series_2 (cost=0.00..1.15 rows=1 width=12) (actual time=0.009..0.016 rows=2 loops=1)
Filter: ((created >= '2000-01-04 12:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 8
Planning time: 0.388 ms
Execution time: 0.438 ms
(25 rows)
可以使用另一种类似的方法。
方法 2:
SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= (SELECT created
FROM time_series
WHERE category = 'bar'
AND created < '2000-01-02 06:00:00Z'
ORDER BY created ASC LIMIT 1)
AND created < (SELECT created
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-04 12:00:00Z'
ORDER BY created DESC LIMIT 1
)
EXPLAIN ANALYZE
来自方法 2:
--------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on time_series (cost=2.33..3.50 rows=1 width=12) (actual time=0.143..0.157 rows=6 loops=1)
Filter: ((created >= [=13=]) AND (created < ) AND (category = 'bar'::text))
Rows Removed by Filter: 4
InitPlan 1 (returns [=13=])
-> Limit (cost=1.16..1.16 rows=1 width=8) (actual time=0.066..0.069 rows=1 loops=1)
-> Sort (cost=1.16..1.16 rows=1 width=8) (actual time=0.061..0.062 rows=1 loops=1)
Sort Key: time_series_1.created
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series time_series_1 (cost=0.00..1.15 rows=1 width=8) (actual time=0.008..0.015 rows=2 loops=1)
Filter: ((created < '2000-01-02 06:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 8
InitPlan 2 (returns )
-> Limit (cost=1.16..1.16 rows=1 width=8) (actual time=0.041..0.044 rows=1 loops=1)
-> Sort (cost=1.16..1.16 rows=1 width=8) (actual time=0.038..0.039 rows=1 loops=1)
Sort Key: time_series_2.created DESC
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series time_series_2 (cost=0.00..1.15 rows=1 width=8) (actual time=0.007..0.013 rows=2 loops=1)
Filter: ((created >= '2000-01-04 12:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 8
Planning time: 0.392 ms
Execution time: 0.288 ms
当您使用限制时,查询将 运行 快速。
方法 3:
WITH a as (
SELECT created,
value,
lag(created, 1) OVER (ORDER BY created desc) AS ub,
lag(created, -1) OVER (ORDER BY created desc) AS lb
FROM time_series
WHERE category = 'bar'
) SELECT created,
value
FROM a
WHERE ub>='2000-01-02 06:00:00Z'
AND lb<'2000-01-04 12:00:00Z'
ORDER BY created
EXPLAIN ANALYZE
来自方法 3:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1.19..1.20 rows=1 width=12) (actual time=0.174..0.181 rows=5 loops=
1)
Sort Key: a.created
Sort Method: quicksort Memory: 25kB
CTE a
-> WindowAgg (cost=1.14..1.16 rows=1 width=28) (actual time=0.075..0.107 rows=7 loops=1)
-> Sort (cost=1.14..1.14 rows=1 width=12) (actual time=0.056..0.067 rows=7 loops=1)
Sort Key: time_series.created DESC
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series (cost=0.00..1.12 rows=1 width=12) (actual time=0.018..0.030 rows=7 loops=1)
Filter: (category = 'bar'::text)
Rows Removed by Filter: 3
-> CTE Scan on a (cost=0.00..0.03 rows=1 width=12) (actual time=0.088..0.131 rows=5 loops=1)
Filter: ((ub >= '2000-01-02 06:00:00+00'::timestamp with time zone) AND (lb < '2000-01-04 12:00:00+00'::timestamp with time zone))
Rows Removed by Filter: 2
Planning time: 0.175 ms
Execution time: 0.247 ms
(16 rows)
我正在使用 table,其中每一行都有一个时间戳,并且该时间戳对于一组给定的其他列值是唯一的:
CREATE TEMPORARY TABLE time_series (
id SERIAL PRIMARY KEY,
created TIMESTAMP WITH TIME ZONE NOT NULL,
category TEXT,
value INT
);
CREATE UNIQUE INDEX ON time_series (created, category);
INSERT INTO time_series (created, category, value)
VALUES ('2000-01-01 00:00:00Z', 'foo', 1),
('2000-01-01 06:00:00Z', 'bar', 5),
('2000-01-01 12:00:00Z', 'bar', 5),
('2000-01-02 00:00:00Z', 'bar', 5),
('2000-01-02 12:34:45Z', 'bar', 2),
('2000-01-03 00:00:00Z', 'bar', 3),
('2000-01-04 00:00:00Z', 'bar', 3),
('2000-01-04 11:11:11Z', 'foo', 4),
('2000-01-04 22:22:22Z', 'bar', 5),
('2000-01-04 23:23:23Z', 'bar', 4),
('2000-01-05 00:00:00Z', 'foo', 1),
('2000-01-05 23:23:23Z', 'bar', 4);
时间戳不是均匀间隔的。给定一个任意的开始和结束日期时间,我的任务是获取这些日期时间 和该范围之前和之后的条目之间的条目。 基本上,我如何简化此查询:
(SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created < '2000-01-02 06:00:00Z'
ORDER BY created DESC
LIMIT 1)
UNION
(SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-02 06:00:00Z'
AND created < '2000-01-04 12:00:00Z')
UNION
(SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-04 12:00:00Z'
ORDER BY created
LIMIT 1)
ORDER BY created;
created value
2000-01-02 00:00:00+00 5
2000-01-02 12:34:45+00 2
2000-01-03 00:00:00+00 3
2000-01-04 00:00:00+00 3
2000-01-04 22:22:22+00 5
用例是获取数据点以显示图表:我知道图表左右边缘的日期时间,但它们通常不会与 created
日期时间完全对齐,所以在为了将图形一直显示到边缘,我需要一个数据点到范围的任一侧。
非解决方案:
- 我可以不能简单地select整个范围,因为它可能很大。
- 我不能 select 超出给定范围的任意长周期,因为该数据集可能再次很大,或者我 select 的任何周期都可能不足以获得下一个读数。
已编辑:
您可以将 UNION ALL
与 ORDER BY
和 LIMIT
以及一些子句边界结合起来。
像这样:
方法 1:
SELECT created,
value
FROM (SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created < '2000-01-02 06:00:00Z'
ORDER BY created ASC LIMIT 1
) AS ub
UNION ALL
SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-02 06:00:00Z'
AND created < '2000-01-04 12:00:00Z'
UNION ALL
SELECT created,
value
FROM (SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-04 12:00:00Z'
ORDER BY created DESC LIMIT 1
) AS lb
ORDER BY 1;
EXPLAIN ANALYZE
来自方法 1:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=3.60..3.61 rows=3 width=12) (actual time=0.228..0.237 rows=5 loops=1)
Sort Key: time_series.created
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=3.55..3.58 rows=3 width=12) (actual time=0.182..0.195 rows=5 loops=1)
Group Key: time_series.created, time_series.value
-> Append (cost=1.16..3.53 rows=3 width=12) (actual time=0.073..0.163 rows=5 loops=1)
-> Limit (cost=1.16..1.16 rows=1 width=12) (actual time=0.070..0.073 rows=1 loops=1)
-> Sort (cost=1.16..1.16 rows=1 width=12) (actual time=0.065..0.067 rows=1 loops=1)
Sort Key: time_series.created DESC
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series (cost=0.00..1.15 rows=1 width=12) (actual time=0.026..0.035 rows=2 loops=1)
Filter: ((created < '2000-01-02 06:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 8
-> Seq Scan on time_series time_series_1 (cost=0.00..1.18 rows=1 width=12) (actual time=0.007..0.016 rows=3 loops=1)
Filter: ((created >= '2000-01-02 06:00:00+00'::timestamp with time zone) AND (created < '2000-01-04 12:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 7
-> Limit (cost=1.16..1.16 rows=1 width=12) (actual time=0.051..0.054 rows=1 loops=1)
-> Sort (cost=1.16..1.16 rows=1 width=12) (actual time=0.047..0.049 rows=1 loops=1)
Sort Key: time_series_2.created
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series time_series_2 (cost=0.00..1.15 rows=1 width=12) (actual time=0.009..0.016 rows=2 loops=1)
Filter: ((created >= '2000-01-04 12:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 8
Planning time: 0.388 ms
Execution time: 0.438 ms
(25 rows)
可以使用另一种类似的方法。
方法 2:
SELECT created, value
FROM time_series
WHERE category = 'bar'
AND created >= (SELECT created
FROM time_series
WHERE category = 'bar'
AND created < '2000-01-02 06:00:00Z'
ORDER BY created ASC LIMIT 1)
AND created < (SELECT created
FROM time_series
WHERE category = 'bar'
AND created >= '2000-01-04 12:00:00Z'
ORDER BY created DESC LIMIT 1
)
EXPLAIN ANALYZE
来自方法 2:
--------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on time_series (cost=2.33..3.50 rows=1 width=12) (actual time=0.143..0.157 rows=6 loops=1)
Filter: ((created >= [=13=]) AND (created < ) AND (category = 'bar'::text))
Rows Removed by Filter: 4
InitPlan 1 (returns [=13=])
-> Limit (cost=1.16..1.16 rows=1 width=8) (actual time=0.066..0.069 rows=1 loops=1)
-> Sort (cost=1.16..1.16 rows=1 width=8) (actual time=0.061..0.062 rows=1 loops=1)
Sort Key: time_series_1.created
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series time_series_1 (cost=0.00..1.15 rows=1 width=8) (actual time=0.008..0.015 rows=2 loops=1)
Filter: ((created < '2000-01-02 06:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 8
InitPlan 2 (returns )
-> Limit (cost=1.16..1.16 rows=1 width=8) (actual time=0.041..0.044 rows=1 loops=1)
-> Sort (cost=1.16..1.16 rows=1 width=8) (actual time=0.038..0.039 rows=1 loops=1)
Sort Key: time_series_2.created DESC
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series time_series_2 (cost=0.00..1.15 rows=1 width=8) (actual time=0.007..0.013 rows=2 loops=1)
Filter: ((created >= '2000-01-04 12:00:00+00'::timestamp with time zone) AND (category = 'bar'::text))
Rows Removed by Filter: 8
Planning time: 0.392 ms
Execution time: 0.288 ms
当您使用限制时,查询将 运行 快速。
方法 3:
WITH a as (
SELECT created,
value,
lag(created, 1) OVER (ORDER BY created desc) AS ub,
lag(created, -1) OVER (ORDER BY created desc) AS lb
FROM time_series
WHERE category = 'bar'
) SELECT created,
value
FROM a
WHERE ub>='2000-01-02 06:00:00Z'
AND lb<'2000-01-04 12:00:00Z'
ORDER BY created
EXPLAIN ANALYZE
来自方法 3:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1.19..1.20 rows=1 width=12) (actual time=0.174..0.181 rows=5 loops=
1)
Sort Key: a.created
Sort Method: quicksort Memory: 25kB
CTE a
-> WindowAgg (cost=1.14..1.16 rows=1 width=28) (actual time=0.075..0.107 rows=7 loops=1)
-> Sort (cost=1.14..1.14 rows=1 width=12) (actual time=0.056..0.067 rows=7 loops=1)
Sort Key: time_series.created DESC
Sort Method: quicksort Memory: 25kB
-> Seq Scan on time_series (cost=0.00..1.12 rows=1 width=12) (actual time=0.018..0.030 rows=7 loops=1)
Filter: (category = 'bar'::text)
Rows Removed by Filter: 3
-> CTE Scan on a (cost=0.00..0.03 rows=1 width=12) (actual time=0.088..0.131 rows=5 loops=1)
Filter: ((ub >= '2000-01-02 06:00:00+00'::timestamp with time zone) AND (lb < '2000-01-04 12:00:00+00'::timestamp with time zone))
Rows Removed by Filter: 2
Planning time: 0.175 ms
Execution time: 0.247 ms
(16 rows)