确定一个范围是否完全被一组范围覆盖
Determine if a range is completely covered by a set of ranges
我如何检查一个范围是否被一组范围 完全覆盖。在以下示例中:
WITH ranges(id, a, b) AS (
SELECT 1, 0, 40 UNION
SELECT 2, 40, 60 UNION
SELECT 3, 80, 100 UNION
SELECT 4, 10, 30
), tests(id, a, b) AS (
SELECT 1, 10, 90 UNION
SELECT 2, 10, 60
)
SELECT *
FROM tests
WHERE -- ?
- 我想 select
10, 60
因为 0, 40
和 40, 60
(和 10, 30
) 涵盖了所有内容
- 我想排除
10, 90
因为它暴露在60, 80
之间
假设 a
是包容性的而 b
是排他性的,即值 40
属于 [40, 60)
而不是 [0, 40)
。范围可以包含间隙和各种重叠。
实际问题涉及日期+时间数据,但日期只是数字。我正在使用 SQL 服务器,但首选通用解决方案。
这是解决方案的一般形式。想法是执行以下操作:
- 获取不能在范围内的所有点的列表。这是所有范围开始和范围结束。
- 检查它们是否不在一个范围内。
这是基于不在范围内的点将是 table:
中包含的数字之一的操作
with tc as (
select t.test, r.candidate
from tests t join
(select r.a as candidate from ranges union all
select r.b from ranges
) r
on r.candiate >= t.a and r.candidate < t.b
union all
select t.test, t.a
from tests t
union all
select t.test, t.b
from tests t
)
select distinct tc.test
from tc
where not exists (select 1
from ranges r
where tc.candidate >= r.a and tc.candidate < r.b
);
因为范围包括第一项,所以您实际上不必检查它。所以候选人名单可以减少:
with tc as (
select t.test, r.candidate
from tests t join
(select r.b as candidate from ranges
) r
on r.candidate >= t.a and r.r < t.b
union all
select t.test, t.a
from tests t
union all
select t.test, t.b
from tests t
)
您想要一个递归查询来查找实际范围(在您的情况下为 0 到 60 和 80 到 100)。我们将从给定的范围开始,并寻找扩展这些范围的范围。最后我们坚持最广泛的范围(例如,范围 10 到 30 可以扩展到 0 到 40,然后扩展到 0 到 60,所以我们保持最宽的范围 0 到 60)。
with wider_ranges(a, b, grp) as
(
select a, b, id from ranges
union all
select
case when r1.a < r2.a then r1.a else r2.a end,
case when r1.b > r2.b then r1.b else r2.b end,
r1.grp
from wider_ranges r1
join ranges r2 on (r2.a < r1.a and r2.b >= r1.a)
or (r2.b > r1.b and r2.a <= r1.b)
)
, real_ranges(a, b) as
(
select distinct min(a), max(b)
from wider_ranges
group by grp
)
select *
from tests
where exists
(
select *
from real_ranges
where tests.a >= real_ranges.a and tests.b <= real_ranges.b
);
Rextester 演示:http://rextester.com/BDJA16583
根据要求,这适用于 SQL 服务器,但它是标准的 SQL,因此它应该适用于几乎所有具有递归查询功能的 DBMS。
这是一个类似于 Thorsten 的递归解决方案。只是提供另一个例子。
WITH ranges(id, a, b) AS (
SELECT 1, 0, 40 UNION
SELECT 2, 40, 60 UNION
SELECT 3, 80, 100 UNION
SELECT 4, 10, 30
), tests(id, a, b) AS
(
SELECT 1 as id, 10 as a, 90 as b
UNION
SELECT 2, 10, 60
), rangeFinder(a, b, ra, rfb) AS
(
SELECT a, b, 0 AS ra, 0 AS rfb
FROM ranges AS r
UNION ALL
SELECT rangeFinder.a, ranges.b, ranges.a, rangeFinder.b
FROM ranges
JOIN rangeFinder
ON ranges.b > rangeFinder.b
AND ranges.a <=rangeFinder.b
), islands(a, b) AS
(
SELECT a, b
FROM rangeFinder
WHERE a NOT IN (SELECT ra FROM rangeFinder)
AND b NOT IN (SELECT rfb FROM rangeFinder)
)
SELECT t.id, t.a, t.b FROM
tests t
JOIN islands i
ON t.a >= i.a
AND t.b <= i.b
如已接受的答案中所述,解决方案是将重叠范围合并在一起,然后确定测试范围是否存在于合并范围之一内。
除了加入and/or递归,你还可以使用sorting approach和window函数来合并重叠范围:
WITH ranges(id, a, b) AS (
SELECT 1, 0, 40 UNION
SELECT 2, 40, 60 UNION
SELECT 3, 80, 100 UNION
SELECT 4, 10, 30
), tests(id, a, b) AS (
SELECT 1, 10, 90 UNION
SELECT 2, 10, 60
), ranges_chg AS (
SELECT *, CASE WHEN MAX(b) OVER (ORDER BY a ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) >= a THEN 0 ELSE 1 END AS chg
FROM ranges
), ranges_grp AS(
SELECT *, SUM(chg) OVER (ORDER BY a) AS grp
FROM ranges_chg
), merged_ranges AS (
SELECT MIN(a) AS a, MAX(b) AS b
FROM ranges_grp
GROUP BY grp
)
SELECT *
FROM tests
WHERE EXISTS (
SELECT 1
FROM merged_ranges
WHERE merged_ranges.a <= tests.a AND tests.b <= merged_ranges.b
)
结果和Fiddle.
| id | a | b |
|----|----|----|
| 2 | 10 | 60 |
range_grp
CTE 中的数据会让您了解它的工作原理:
| id | a | b | max b over... | chg | grp |
|----|----|-----|---------------|-----|-----|
| 1 | 0 | 40 | NULL | 1 | 1 |
| 4 | 10 | 30 | 40 | 0 | 1 |
| 2 | 40 | 60 | 40 | 0 | 1 |
| 3 | 80 | 100 | 60 | 1 | 2 |
我如何检查一个范围是否被一组范围 完全覆盖。在以下示例中:
WITH ranges(id, a, b) AS (
SELECT 1, 0, 40 UNION
SELECT 2, 40, 60 UNION
SELECT 3, 80, 100 UNION
SELECT 4, 10, 30
), tests(id, a, b) AS (
SELECT 1, 10, 90 UNION
SELECT 2, 10, 60
)
SELECT *
FROM tests
WHERE -- ?
- 我想 select
10, 60
因为0, 40
和40, 60
(和10, 30
) 涵盖了所有内容
- 我想排除
10, 90
因为它暴露在60, 80
之间
假设 a
是包容性的而 b
是排他性的,即值 40
属于 [40, 60)
而不是 [0, 40)
。范围可以包含间隙和各种重叠。
实际问题涉及日期+时间数据,但日期只是数字。我正在使用 SQL 服务器,但首选通用解决方案。
这是解决方案的一般形式。想法是执行以下操作:
- 获取不能在范围内的所有点的列表。这是所有范围开始和范围结束。
- 检查它们是否不在一个范围内。
这是基于不在范围内的点将是 table:
中包含的数字之一的操作with tc as (
select t.test, r.candidate
from tests t join
(select r.a as candidate from ranges union all
select r.b from ranges
) r
on r.candiate >= t.a and r.candidate < t.b
union all
select t.test, t.a
from tests t
union all
select t.test, t.b
from tests t
)
select distinct tc.test
from tc
where not exists (select 1
from ranges r
where tc.candidate >= r.a and tc.candidate < r.b
);
因为范围包括第一项,所以您实际上不必检查它。所以候选人名单可以减少:
with tc as (
select t.test, r.candidate
from tests t join
(select r.b as candidate from ranges
) r
on r.candidate >= t.a and r.r < t.b
union all
select t.test, t.a
from tests t
union all
select t.test, t.b
from tests t
)
您想要一个递归查询来查找实际范围(在您的情况下为 0 到 60 和 80 到 100)。我们将从给定的范围开始,并寻找扩展这些范围的范围。最后我们坚持最广泛的范围(例如,范围 10 到 30 可以扩展到 0 到 40,然后扩展到 0 到 60,所以我们保持最宽的范围 0 到 60)。
with wider_ranges(a, b, grp) as
(
select a, b, id from ranges
union all
select
case when r1.a < r2.a then r1.a else r2.a end,
case when r1.b > r2.b then r1.b else r2.b end,
r1.grp
from wider_ranges r1
join ranges r2 on (r2.a < r1.a and r2.b >= r1.a)
or (r2.b > r1.b and r2.a <= r1.b)
)
, real_ranges(a, b) as
(
select distinct min(a), max(b)
from wider_ranges
group by grp
)
select *
from tests
where exists
(
select *
from real_ranges
where tests.a >= real_ranges.a and tests.b <= real_ranges.b
);
Rextester 演示:http://rextester.com/BDJA16583
根据要求,这适用于 SQL 服务器,但它是标准的 SQL,因此它应该适用于几乎所有具有递归查询功能的 DBMS。
这是一个类似于 Thorsten 的递归解决方案。只是提供另一个例子。
WITH ranges(id, a, b) AS (
SELECT 1, 0, 40 UNION
SELECT 2, 40, 60 UNION
SELECT 3, 80, 100 UNION
SELECT 4, 10, 30
), tests(id, a, b) AS
(
SELECT 1 as id, 10 as a, 90 as b
UNION
SELECT 2, 10, 60
), rangeFinder(a, b, ra, rfb) AS
(
SELECT a, b, 0 AS ra, 0 AS rfb
FROM ranges AS r
UNION ALL
SELECT rangeFinder.a, ranges.b, ranges.a, rangeFinder.b
FROM ranges
JOIN rangeFinder
ON ranges.b > rangeFinder.b
AND ranges.a <=rangeFinder.b
), islands(a, b) AS
(
SELECT a, b
FROM rangeFinder
WHERE a NOT IN (SELECT ra FROM rangeFinder)
AND b NOT IN (SELECT rfb FROM rangeFinder)
)
SELECT t.id, t.a, t.b FROM
tests t
JOIN islands i
ON t.a >= i.a
AND t.b <= i.b
如已接受的答案中所述,解决方案是将重叠范围合并在一起,然后确定测试范围是否存在于合并范围之一内。
除了加入and/or递归,你还可以使用sorting approach和window函数来合并重叠范围:
WITH ranges(id, a, b) AS (
SELECT 1, 0, 40 UNION
SELECT 2, 40, 60 UNION
SELECT 3, 80, 100 UNION
SELECT 4, 10, 30
), tests(id, a, b) AS (
SELECT 1, 10, 90 UNION
SELECT 2, 10, 60
), ranges_chg AS (
SELECT *, CASE WHEN MAX(b) OVER (ORDER BY a ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) >= a THEN 0 ELSE 1 END AS chg
FROM ranges
), ranges_grp AS(
SELECT *, SUM(chg) OVER (ORDER BY a) AS grp
FROM ranges_chg
), merged_ranges AS (
SELECT MIN(a) AS a, MAX(b) AS b
FROM ranges_grp
GROUP BY grp
)
SELECT *
FROM tests
WHERE EXISTS (
SELECT 1
FROM merged_ranges
WHERE merged_ranges.a <= tests.a AND tests.b <= merged_ranges.b
)
结果和Fiddle.
| id | a | b |
|----|----|----|
| 2 | 10 | 60 |
range_grp
CTE 中的数据会让您了解它的工作原理:
| id | a | b | max b over... | chg | grp |
|----|----|-----|---------------|-----|-----|
| 1 | 0 | 40 | NULL | 1 | 1 |
| 4 | 10 | 30 | 40 | 0 | 1 |
| 2 | 40 | 60 | 40 | 0 | 1 |
| 3 | 80 | 100 | 60 | 1 | 2 |