Date intersection and space availability

Question

I'm currently trying to check the availability of a "space" within a date range, where the date range can be endlessly long. The tables:

Space:

id  available_spaces    name
1   20                  Space 1
2   40                  Space 2
3   10                  Space 3

Bookings (end_date can be null, meaning an endless booking):

id  space_id    start_date  end_date    spaces
1   1           13/12-2017  null        9
2   1           13/12-2017  13/12-2018  10

I'd then like to be able to run a search, for example:

from:   11/12-2016
to:     null (again meaning endless)
spaces: 2

This query should return the spaces Space 2 and Space 3, since both have enough availability within that time interval.

Changing the number of spaces needed in the search from 2 to 1 should yield the following. Search:

from:   11/12-2016
to:     null (again meaning endless)
spaces: 1

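Result: Space 1, Space 2, Space 3. What I find hard to solve is the varying number of spaces available from month to month, and the possibility of endless bookings.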

Answer 1

Revisit

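As always, SQL offers many ways to accomplish a given task. The originally proposed solution (below) uses a self-join, but another approach is to use window functions. The idea is to increase the used space each time a new booking starts and to decrease it when one ends: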

with bs as (
    select space_id as _sid
         , unnest(array[start_date,
                        coalesce(end_date, date 'infinity')]) as _d
         , unnest(array[spaces, -spaces]) as _sp
    from booking
    where end_date is null or end_date >= '2016-12-11'),
cs as (
    select _sid
        -- The inner sum collapses starting and ending bookings on the same
        -- date to a single spaces value, the outer is the running sum. This
        -- avoids the problem where the order of bookings starting or ending
        -- on the same date is unspecified and would produce possibly falsely
        -- high values for spaces, if all starting bookings just so happen to
        -- come first.
         , sum(sum(_sp)) over (partition by _sid
                               order by _d) as _usp
    from bs
    group by _sid, _d)
select *
from space
where not exists (
    select from cs
    where cs._sid = space.id
      and space.available_spaces - cs._usp < 2)

The same in Python/SQLAlchemy:

from sqlalchemy import func, or_
from sqlalchemy.dialects.postgresql import array

bs = session.query(
        Booking.space_id,
        func.unnest(array([
            Booking.start_date,
            func.coalesce(Booking.end_date, func.date('infinity'))
        ])).label('date'),
        func.unnest(array([Booking.spaces, -Booking.spaces])).label('spaces')).\
    filter(or_(Booking.end_date == None,
               Booking.end_date >= '2016-12-11')).\
    cte()

cs = session.query(bs.c.space_id,
                   func.sum(func.sum(bs.c.spaces)).over(
                       partition_by=bs.c.space_id,
                       order_by=bs.c.date).label('spaces')).\
    group_by(bs.c.space_id, bs.c.date).\
    cte()

query = session.query(Space).\
    filter(~session.query(cs).
           filter(cs.c.space_id == Space.id,
                  Space.available_spaces - cs.c.spaces < 2).
           exists())
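
As a rough cross-check of the running-sum idea outside the database, here is a minimal plain-Python sketch. It assumes bookings are given as (space_id, start, end, spaces) tuples with end=None for an endless booking. The names peak_usage, spaces_with_room and INFINITY are made up for this illustration and are not part of SQLAlchemy or of the queries above.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

INFINITY = date.max  # stands in for PostgreSQL's date 'infinity'


def peak_usage(bookings, search_start):
    """Return {space_id: highest running total of booked spaces}."""
    # deltas[space_id][day] is the net change in used spaces on that day,
    # which mirrors the inner sum(_sp) grouped by (_sid, _d).
    deltas = defaultdict(lambda: defaultdict(int))
    for space_id, start, end, spaces in bookings:
        end = INFINITY if end is None else end
        if end < search_start:
            continue  # the booking ended before the search begins
        deltas[space_id][start] += spaces
        deltas[space_id][end] -= spaces
    # The running sum over the days in order mirrors the outer window sum.
    return {
        space_id: max(accumulate(delta for _, delta in sorted(days.items())))
        for space_id, days in deltas.items()
    }


def spaces_with_room(capacity, bookings, search_start, needed):
    """Return ids of spaces that never drop below `needed` free spaces."""
    peaks = peak_usage(bookings, search_start)
    return [space_id for space_id, total in sorted(capacity.items())
            if total - peaks.get(space_id, 0) >= needed]


# Sample data from the question (hypothetical in-memory representation).
capacity = {1: 20, 2: 40, 3: 10}
bookings = [
    (1, date(2017, 12, 13), None, 9),
    (1, date(2017, 12, 13), date(2018, 12, 13), 10),
]

print(spaces_with_room(capacity, bookings, date(2016, 12, 11), 2))
print(spaces_with_room(capacity, bookings, date(2016, 12, 11), 1))
```

Summing the deltas per day before accumulating plays the same role as the inner sum in the SQL: bookings that start and end on the same day cannot inflate the peak through an arbitrary ordering.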

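It is easier to first explain how the query works in SQL and then build up the SQLAlchemy version. I'll assume that a booking and a search always have a start, or in other words that they can only be endless at the end. Using range types and operators, you should start by finding the bookings that overlap your search.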

select *
from booking
where daterange(start_date, end_date, '[)')
   && daterange('2016-12-11', null, '[)');
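
For intuition, the overlap test on half-open ranges can be sketched in plain Python. This is only an approximation of the && operator: it assumes every bounded range has start < end (PostgreSQL treats a range with equal bounds as empty, which overlaps nothing), and the function name overlaps is chosen for this example.

```python
from datetime import date


def overlaps(start_a, end_a, start_b, end_b):
    """True if [start_a, end_a) and [start_b, end_b) share at least one day.

    An end of None means the range is unbounded above.
    """
    a_starts_before_b_ends = end_b is None or start_a < end_b
    b_starts_before_a_ends = end_a is None or start_b < end_a
    return a_starts_before_b_ends and b_starts_before_a_ends


search_start = date(2016, 12, 11)

# An endless booking starting after the search start overlaps an endless search.
print(overlaps(date(2017, 12, 13), None, search_start, None))
# A booking ending exactly on the search start does not: the end is exclusive.
print(overlaps(date(2015, 1, 1), search_start, search_start, None))
```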

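From the bookings found you need to find the intersections and sum the used space. To find the intersections, take the start of a booking and look up the bookings that contain it. Repeat this for all the bookings at hand. For example: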

|-------| 5
.  .  .
.  |-------------| 2
.  .  .
.  .  |-------------------- 3
.  .  .              .
.  .  .              |---| 1
.  .  .              .
5  7  10             4

And in query form:

with bs as (
    select *
    from booking
    where daterange(start_date, end_date, '[)')
       && daterange('2016-12-11', null, '[)')
)
select distinct
       b1.space_id,
       sum(b2.spaces) as sum
from bs b1
join bs b2
  on b1.start_date <@ daterange(b2.start_date, b2.end_date, '[)')
 and b1.space_id = b2.space_id
group by b1.id, b1.space_id;

Which, given your example data, results in

 space_id | sum 
----------+-----
        1 |  19
(1 row)

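since there are only 2 bookings and they have the same start date. The query is far from optimal: for each range it has to scan through all ranges, so it is at least O(n^2). In a procedural setting you'd use an interval tree or a similar structure for the lookups, and with some suitable indexing and changes the SQL could perhaps be improved as well.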
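
The same self-join logic can be sketched procedurally. The following is a deliberately naive O(n^2) illustration in plain Python, not an interval tree; it assumes bookings are (space_id, start, end, spaces) tuples with end=None for an endless booking, and the helper names contains and intersection_sums are invented for this example.

```python
from datetime import date


def contains(start, end, day):
    """True if day lies in the half-open range [start, end); end=None is endless."""
    return start <= day and (end is None or day < end)


def intersection_sums(bookings):
    """For each booking start, sum the spaces of same-space bookings covering it."""
    sums = set()  # a set mirrors SELECT DISTINCT
    for space_1, start_1, _, _ in bookings:
        total = sum(
            spaces_2
            for space_2, start_2, end_2, spaces_2 in bookings
            if space_1 == space_2 and contains(start_2, end_2, start_1)
        )
        sums.add((space_1, total))
    return sorted(sums)


# The two sample bookings from the question, both for space 1.
bookings = [
    (1, date(2017, 12, 13), None, 9),
    (1, date(2017, 12, 13), date(2018, 12, 13), 10),
]

print(intersection_sums(bookings))
```

Both bookings start on the same day and cover each other's start, so each yields the same (space_id, sum) pair and the set collapses them into one row, as the distinct does in the query.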

With the sums of the intersecting bookings you can then check that there is no sum that leaves less space than the search requires:

with bs as (
        select *
        from booking
        where daterange(start_date, end_date, '[)')
           && daterange('2016-12-11', null, '[)')
), cs as (
        select distinct
               b1.space_id,
               sum(b2.spaces) as sum
        from bs b1
        join bs b2
          on b1.start_date <@ daterange(b2.start_date, b2.end_date, '[)')
         and b1.space_id = b2.space_id
        -- Could also use distinct & sum() over (partition by b1.id) instead
        group by b1.id, b1.space_id
)
select *
from space
where not exists(
        select 1
        from cs
        where cs.space_id = space.id
              -- Check if there is not enough space
          and space.available_spaces - cs.sum < 2
);

From this it is straightforward to form the SQLAlchemy version:

from functools import partial
from sqlalchemy import func
from sqlalchemy.dialects.postgresql import DATERANGE

# Hack. Proper type for passing daterange values is
# psycopg2.extras.DateRange, but that does not have
# the comparator methods.
daterange = partial(func.daterange, type_=DATERANGE)

bs = session.query(Booking).\
    filter(daterange(Booking.start_date, Booking.end_date, '[)').
           overlaps(daterange('2016-12-11', None, '[)'))).\
    cte()

bs1 = bs.alias()
bs2 = bs.alias()

cs = session.query(bs1.c.space_id,
                   func.sum(bs2.c.spaces).label('sum')).\
    distinct().\
    join(bs2, (bs2.c.space_id == bs1.c.space_id) &
              daterange(bs2.c.start_date,
                        bs2.c.end_date).contains(bs1.c.start_date)).\
    group_by(bs1.c.id, bs1.c.space_id).\
    cte()

query = session.query(Space).\
    filter(~session.query(cs).
           filter(cs.c.space_id == Space.id,
                  Space.available_spaces - cs.c.sum < 2).
           exists())
