Hourly gaps in data for a month

Hi Stack Overflowers,

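I am really struggling to find the right approach to this (in my opinion) relatively simple gap-finding problem. I have a table with hourly datetimes (hourly log files imported into a database). I need to find the hours that are missing within a period (say, April). So imagine the following data in the DB table [imported_logs]:
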
[2018-04-02 10:00:000]
[2018-04-02 11:00:000]
[2018-04-02 12:00:000] 
[2018-04-02 17:00:000]

The result I need from the gap analysis for April is:

[     GAP_BEGIN      ]  [      GAP_END       ]
[2018-04-01 00:00:000]  [2018-04-02 10:00:000] <-- problem
[2018-04-02 13:00:000]  [2018-04-02 17:00:000] <-- can be found using below code
[2018-04-02 18:00:000]  [2018-05-01 00:00:000] <-- problem

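My problem is mainly finding the gaps at the start and end of the range; the following code already finds the gaps between the available data points.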

    WITH t AS (
      -- Number the April rows of zone 1 in time order
      SELECT  *, rn = ROW_NUMBER() OVER (PARTITION BY zone ORDER BY hourImported)
      FROM  logsImportedTable
      WHERE hourImported > '2018-04-01' AND hourImported < '2018-05-01' AND zone = 1
    )
    -- Pair each row with the next one and keep the pairs more than an hour apart
    SELECT  t1.zone, t1.hourImported AS GapStart, t2.hourImported AS GapEnd
    FROM    t t1
    INNER JOIN t t2 ON t2.zone = t1.zone AND t2.rn = t1.rn + 1
    WHERE   DATEDIFF(MINUTE, t1.hourImported, t2.hourImported) > 60

This only gives me the result:

  [zone] [gap_start              ] [gap_end                ]
  [1   ] [2018-04-02 13:00:00.000] [2018-04-02 17:00:00.000]

So basically, if no logs at all were imported during April, the current implementation would not report any missing data (which is kind of wrong).

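I was thinking that I need to somehow add new data points just before the start and at the end of April, so that the query picks up the beginning and end of the month as missing data. What would you smart guys/girls do?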

/Kind regards

For this case, just add the initial and ending values, where appropriate:

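    <your query here>
    union all
    -- leading gap: from the start of April to the first imported hour
    select 1, '2018-04-01 00:00:00', min(lit.hourImported)
    from logsImportedTable lit
    where lit.zone = 1
      and lit.hourImported >= '2018-04-01 00:00:00'
      and lit.hourImported < '2018-05-01 00:00:00'
    having min(lit.hourImported) > '2018-04-01 00:00:00'
    union all
    -- trailing gap: from the last imported hour to the end of April
    select 1, max(lit.hourImported), '2018-05-01 00:00:00'
    from logsImportedTable lit
    where lit.zone = 1
      and lit.hourImported >= '2018-04-01 00:00:00'
      and lit.hourImported < '2018-05-01 00:00:00'
    having datediff(minute, max(lit.hourImported), '2018-05-01 00:00:00') > 60;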
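
The idea raised in the question, adding extra data points around the period, can also be sketched directly. This is only a rough illustration, not a tested implementation: the table variable `@logs`, the variables `@periodStart` and `@periodEnd`, and the CTE names `points` and `ordered` are made up for the example, and `LEAD` assumes SQL Server 2012 or later. One sentinel is placed an hour before the period and one at its (exclusive) end, so the edges are treated like any other pair of neighbouring rows.

```sql
-- Hypothetical sample data mirroring the question; @logs stands in for logsImportedTable.
DECLARE @logs TABLE (zone INT, hourImported DATETIME);
INSERT INTO @logs (zone, hourImported) VALUES
    (1, '2018-04-02T10:00:00'),
    (1, '2018-04-02T11:00:00'),
    (1, '2018-04-02T12:00:00'),
    (1, '2018-04-02T17:00:00');

DECLARE @periodStart DATETIME = '2018-04-01T00:00:00',
        @periodEnd   DATETIME = '2018-05-01T00:00:00';

WITH points AS (
    -- Real rows inside the period (start inclusive, end exclusive).
    SELECT hourImported
    FROM @logs
    WHERE zone = 1
      AND hourImported >= @periodStart
      AND hourImported < @periodEnd
    UNION ALL
    -- Sentinel one hour before the period, so a missing first hour shows up as a gap.
    SELECT DATEADD(HOUR, -1, @periodStart)
    UNION ALL
    -- Sentinel at the exclusive period end.
    SELECT @periodEnd
),
ordered AS (
    -- Pair every point with the next point in time order.
    SELECT hourImported AS prevHour,
           LEAD(hourImported) OVER (ORDER BY hourImported) AS nextHour
    FROM points
)
-- Report the first missing hour and the next present hour (or the period end).
SELECT DATEADD(HOUR, 1, prevHour) AS GapStart, nextHour AS GapEnd
FROM ordered
WHERE DATEDIFF(MINUTE, prevHour, nextHour) > 60
ORDER BY GapStart;
```

With the sample rows this should report the three gaps listed in the question, and a month with no rows at all should come back as a single gap covering the whole period.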

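OK, thanks to the great help from @Gordon, here is my final solution to the problem. It reports the gaps even when an entire month of data is missing, as well as all the smaller gaps within the month.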

    DECLARE @zone INT = 1,
            @currentPeriodStart DATETIME = '2018-01-01',
            @currentPeriodEnd DATETIME = '2018-02-01';

    WITH t AS (
        -- Number the rows of the zone inside the period (start inclusive, end exclusive)
        SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY zone_id ORDER BY time_of_file_present)
        FROM test
        WHERE time_of_file_present >= @currentPeriodStart
          AND time_of_file_present < @currentPeriodEnd
          AND zone_id = @zone
    )
    -- Gaps between consecutive rows that are more than an hour apart
    SELECT t1.zone_id, t1.time_of_file_present AS gap_start, t2.time_of_file_present AS gap_end
    FROM t t1
        INNER JOIN t t2 ON t2.zone_id = t1.zone_id AND t2.rn = t1.rn + 1
    WHERE DATEDIFF(MINUTE, t1.time_of_file_present, t2.time_of_file_present) > 60

    UNION ALL
    -- Leading gap: from the period start to the first row in the period
    SELECT @zone, @currentPeriodStart, MIN(lit.time_of_file_present)
    FROM test lit
    WHERE lit.zone_id = @zone
      AND lit.time_of_file_present >= @currentPeriodStart
      AND lit.time_of_file_present < @currentPeriodEnd
    HAVING MIN(lit.time_of_file_present) > @currentPeriodStart

    UNION ALL
    -- Trailing gap: from the last row in the period to the period end
    SELECT @zone, MAX(lit.time_of_file_present), @currentPeriodEnd
    FROM test lit
    WHERE lit.zone_id = @zone
      AND lit.time_of_file_present >= @currentPeriodStart
      AND lit.time_of_file_present < @currentPeriodEnd
    HAVING DATEDIFF(MINUTE, MAX(lit.time_of_file_present), @currentPeriodEnd) > 60

    UNION ALL
    -- Whole period missing: the zone has no row at all inside the period
    SELECT @zone, @currentPeriodStart, @currentPeriodEnd
    FROM test lit
    WHERE lit.zone_id = @zone
      AND lit.time_of_file_present >= @currentPeriodStart
      AND lit.time_of_file_present < @currentPeriodEnd
    HAVING COUNT(*) = 0
    ORDER BY gap_start;
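
To try the query above, a throwaway table along the following lines could be used. This is only a sketch: the column types and the sample rows are assumptions, and the real `test` table presumably has more columns than the two the query reads.

```sql
-- Hypothetical minimal schema: only the two columns the gap query reads.
CREATE TABLE test (
    zone_id INT NOT NULL,
    time_of_file_present DATETIME NOT NULL
);

-- Made-up sample rows inside the January 2018 period used by the query.
INSERT INTO test (zone_id, time_of_file_present) VALUES
    (1, '2018-01-02T10:00:00'),
    (1, '2018-01-02T11:00:00'),
    (1, '2018-01-02T12:00:00'),
    (1, '2018-01-02T17:00:00');
```

With these rows the query should return three gaps for zone 1: from 2018-01-01 00:00 to 2018-01-02 10:00, from 2018-01-02 12:00 to 2018-01-02 17:00, and from 2018-01-02 17:00 to 2018-02-01 00:00. Note that gap_start here is the last hour that is present, not the first hour that is missing.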