How to join tables on cases where none of function(a) in b

Question

In MonetDB (specifically, the embedded version from the "MonetDBLite" R package) I have a table "events" containing entity ID codes and event start and end dates, of the format:

| id  | start_date  | end_date   |
| 1   | 2010-01-01  | 2010-03-30 |
| 1   | 2010-04-01  | 2010-06-30 |
| 2   | 2018-04-01  | 2018-06-30 |
| ... | ...         | ...        |

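The table has approximately 80 million rows of events, attributable to approximately 2.5 million unique entities (ID values). The dates appear to align nicely with calendar quarters, but I haven't checked them thoroughly, so assume they can be arbitrary. I have, however, at least sense-checked that end_date > start_date.

I want to produce a table "nonevent_qtrs" listing the calendar quarters in which an ID has no event recorded, e.g.: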

| id  | last_doq   |
| 1   | 2010-09-30 |
| 1   | 2010-12-31 |
| ... | ...        |
| 1   | 2018-06-30 |
| 2   | 2010-03-30 |
| ... | ...        |

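(doq = day of quarter)

If the extent of an event spans any days of a quarter (including its first and last dates), then I want it to count as having occurred in that quarter.

To help with this, I have produced a "calendar table": a table of quarters, "qtrs", covering the entire span of dates present in "events", of the format: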

| first_doq  | last_doq   |
| 2010-01-01 | 2010-03-30 |
| 2010-04-01 | 2010-06-30 |
| ...        | ...        |

and tried using a non-equi join like so:

create table nonevents
as select
    id,
    last_doq
from
    events
    full outer join
    qtrs
on
    start_date > last_doq or
    end_date < first_doq
group by
    id,
    last_doq

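but this is a) terribly inefficient and b) certainly wrong, since most IDs are listed as having no events in every quarter.

How can I produce the table "nonevent_qtrs" I described, containing the list of quarters in which each ID had no events?

If it's relevant, the ultimate use case is to calculate runs of non-events for time-to-event analysis and prediction. It feels like run-length encoding will be required. If there is a more direct approach than the one I described above, then I'm all ears. The only reason I'm focusing on runs of non-events to begin with is to try to limit the size of the cross product. I have also considered producing something like: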

| id  | last_doq   | event |
| 1   | 2010-01-31 | 1     |
| ... | ...        | ...   |
| 1   | 2018-06-30 | 0     |
| ... | ...        | ...   |

but, although more useful, this may not be feasible given the volume of data involved. A wide format:

| id  | 2010-01-31 | ... | 2018-06-30 |
| 1   | 1          | ... | 0          |
| 2   | 0          | ... | 1          |
| ... | ...        | ... | ...        |

would also be handy, but since MonetDB is a column store, I'm not sure whether this would be any more efficient.

Answer 1

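Let me assume that you have a table of quarters, with the start date and end date of each quarter. You really need this if you want the quarters that don't exist. After all, how far back or forward in time do you want to go?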
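
If such a quarters table does not exist yet, its rows can be generated outside the database and then loaded. The following is only a minimal sketch in Python: the function name `quarter_bounds` and the 2010 to 2018 range are illustrative assumptions, and the pairs are ordered as (quarter_start, quarter_end) to match the column names assumed in this answer.

```python
from datetime import date, timedelta


def quarter_bounds(first_year, last_year):
    """Return (quarter_start, quarter_end) pairs for every calendar quarter
    from first_year through last_year, inclusive.

    Hypothetical helper for illustration; not part of MonetDB or MonetDBLite.
    """
    rows = []
    for year in range(first_year, last_year + 1):
        for start_month in (1, 4, 7, 10):
            start = date(year, start_month, 1)
            # The day before the next quarter starts is this quarter's last day.
            if start_month == 10:
                next_start = date(year + 1, 1, 1)
            else:
                next_start = date(year, start_month + 3, 1)
            rows.append((start, next_start - timedelta(days=1)))
    return rows


# Assumed range, chosen to cover the sample dates in the question.
quarters = quarter_bounds(2010, 2018)
```

Computing each end date from the start of the following quarter avoids hard-coding month lengths.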

Then, you can generate all id/quarter combinations and filter out the ones that exist:

select i.id, q.*
from (select distinct id from events) i cross join
     quarters q left join
     events e
     on e.id = i.id and
        e.start_date <= q.quarter_end and
        e.end_date >= q.quarter_start
where e.id is null;
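
To see the query's behaviour on a small case, here is a sketch that runs the same anti-join against an in-memory SQLite database from Python. SQLite is only a stand-in for MonetDB here, the sample rows are made up for illustration, and dates are stored as ISO-8601 text so that string comparison matches date order.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE events (id INTEGER, start_date TEXT, end_date TEXT);
    CREATE TABLE quarters (quarter_start TEXT, quarter_end TEXT);

    -- Made-up sample data: id 1 has events in Q1 and Q2,
    -- id 2 has a single event straddling Q3 and Q4.
    INSERT INTO events VALUES
        (1, '2010-01-01', '2010-03-30'),
        (1, '2010-04-01', '2010-06-30'),
        (2, '2010-08-15', '2010-10-05');

    INSERT INTO quarters VALUES
        ('2010-01-01', '2010-03-31'),
        ('2010-04-01', '2010-06-30'),
        ('2010-07-01', '2010-09-30'),
        ('2010-10-01', '2010-12-31');
""")

# Every id/quarter pair, left-joined to any overlapping event;
# pairs with no overlapping event survive the IS NULL filter.
nonevent_qtrs = con.execute("""
    SELECT i.id, q.quarter_end
    FROM (SELECT DISTINCT id FROM events) i
    CROSS JOIN quarters q
    LEFT JOIN events e
        ON e.id = i.id
       AND e.start_date <= q.quarter_end
       AND e.end_date >= q.quarter_start
    WHERE e.id IS NULL
    ORDER BY i.id, q.quarter_end
""").fetchall()

for row in nonevent_qtrs:
    print(row)
```

The two inequalities are the usual interval-overlap test: an event touches a quarter when it starts on or before the quarter's last day and ends on or after its first day. Because the ID equality and the overlap test all sit in the `on` clause of the left join, a quarter is reported for an ID only when none of that ID's events overlap it.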

sql

monetdb

monetdblite