Determine whether the time range of one table entry overlaps with another in KDB+/Q

I have a table that looks like this:

table:([] RIC:`A.N`A.N`A.N`GOOG.O`GOOG.O; 
startRange:2022.01.03D09:31:54.000000000 2022.01.03D09:32:04.000000000 2022.01.03D09:31:54.100000000 2022.01.03D09:31:54.000000000 2022.01.03D09:31:54.100000000;
endRange:2022.01.03D09:31:59.000000000 2022.01.03D09:32:09.000000000 2022.01.03D09:31:59.100000000 2022.01.03D09:31:59.000000000 2022.01.03D09:31:59.100000000)

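I would like to add an "overlap" column: a boolean flag that equals 1 whenever an entry has another entry (with the same RIC) whose time range overlaps its own. For the table above, the first and third entries should be flagged, because both are for `A.N and have overlapping datetime ranges. The fourth and fifth entries should be flagged as well, because they also share a RIC and have overlapping datetime ranges.

Honestly, I am not even sure how to approach this problem. Any suggestions would be greatly appreciated!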

Here is a somewhat long-winded solution, but I think it covers your use case:

q)raze{update overlap:{any(x within'z)|y within'z}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
RIC    startRange                    endRange                      overlap
--------------------------------------------------------------------------
A.N    2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000 1
A.N    2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000 0
A.N    2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000 1
GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000 1
GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000 1

To break this answer down: first we need the time ranges to check for overlap against. We start with all the time ranges for a given RIC:

q)`overlap xcols raze{update overlap:count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap                                                                                                                                                                          ..
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000)                                                        ..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000)                                                        ..

We then exclude the time range of the entry we are currently processing:

q)`overlap xcols raze{update overlap:{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap                                                                                                                   RIC    startRange                    endRange          ..
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------..
(2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000) A.N    2022.01.03D09:31:54.000000000 2022.01.03D09:31:5..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000) A.N    2022.01.03D09:32:04.000000000 2022.01.03D09:32:0..
(2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000;2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000) A.N    2022.01.03D09:31:54.100000000 2022.01.03D09:31:5..
,2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000                                                              GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:5..
,2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000                                                              GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:5..

Finally, we check whether startRange or endRange falls within any of those time ranges:

q)`overlap xcols raze{update overlap:{any(x within'z)|y within'z}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
overlap RIC    startRange                    endRange
--------------------------------------------------------------------------
1       A.N    2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000
0       A.N    2022.01.03D09:32:04.000000000 2022.01.03D09:32:09.000000000
1       A.N    2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000
1       GOOG.O 2022.01.03D09:31:54.000000000 2022.01.03D09:31:59.000000000
1       GOOG.O 2022.01.03D09:31:54.100000000 2022.01.03D09:31:59.100000000
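
One caveat with checking only the endpoints: a range that fully contains another has neither its start nor its end inside the smaller range, so the containing entry would not be flagged. A possible alternative, sketched here under the assumption that every startRange is no later than its endRange, compares each pair of ranges within a RIC directly: two closed ranges overlap when each one starts no later than the other ends. The function name ov and the integer toy table t are made up for illustration; timestamps compare the same way.

```q
/ toy table: row 0 fully contains row 1, row 2 stands alone
t:([] RIC:`A.N`A.N`A.N; startRange:1 3 20; endRange:10 5 25)

/ s, e: start and end vectors for one RIC
/ cell (j;i) of the boolean matrix is 1b when range i overlaps range j
/ every range overlaps itself, so more than one hit in a row means another range overlaps it
ov:{[s;e] 1<sum each (s<=/:e)&e>=/:s}

/ by RIC applies ov per group, so no symbol list is hard-coded
exec overlap from update overlap:ov[startRange;endRange] by RIC from t   / 110b
```

On the table from the question, this should flag the same rows as the output above. Note that the pairwise comparison builds an n-by-n boolean matrix per RIC, so it may not suit very large groups.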

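Edit:

A faster solution, which skips the endRange check whenever the startRange check has already found an overlap: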

q)raze{update overlap:{$[any x within'z;1b;any y within'z]}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}each{select from table where RIC=x}each`A.N`GOOG.O
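
Both versions hard-code the symbol list. As a sketch, the same logic can be wrapped in functions that take the symbols from the table itself using distinct. The names flagRic and flagOverlap are hypothetical, and the argument is assumed to have the RIC, startRange and endRange columns shown above. As in the original, rows come back grouped by RIC.

```q
/ flagRic and flagOverlap are made-up names for this sketch
/ flagRic: flag overlaps within a table holding a single RIC (same expression as the faster solution)
flagRic:{update overlap:{$[any x within'z;1b;any y within'z]}'[startRange;endRange]{x where y<>til count x}'[;i]count[i]#enlist flip(startRange;endRange)from x}
/ flagOverlap: split t by each RIC it contains, flag each part, and rejoin
flagOverlap:{[t] raze flagRic each {[t;r]select from t where RIC=r}[t] each distinct t`RIC}
```

With the table from the question, this would be called as flagOverlap table.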