Hourly gaps in data for a month
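Hi Stack Overflowers,
I'm really struggling to find the right approach to this (in my opinion) relatively simple gap-finding problem.
I have a table with hourly datetimes (hourly log files imported into a database).
I need to find the hours that are missing within a period (say, April).
So imagine the DB table [imported_logs] contains the following data: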
[2018-04-02 10:00:000]
[2018-04-02 11:00:000]
[2018-04-02 12:00:000]
[2018-04-02 17:00:000]
The result I need from the gap analysis for April is:
[ GAP-BEGIN ] [ GAP-END ]
[2018-04-01 00:00:000] [2018-04-02 10:00:000] <-- problem
[2018-04-02 13:00:000] [2018-04-02 17:00:000] <-- can be found using below code
[2018-04-02 18:00:000] [2018-05-01 00:00:000] <-- problem
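My problem is mainly finding the ranges at the beginning and end of the period; the following code does find the gaps between the available data points.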
WITH t AS (
    SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY zone ORDER BY hourImported)
    FROM logsImportedTable
    WHERE hourImported > '2018-04-01' AND hourImported < '2018-05-01' AND zone = 1
)
SELECT t1.zone, t1.hourImported AS GapStart, t2.hourImported AS GapEnd
FROM t t1
INNER JOIN t t2 ON t2.zone = t1.zone AND t2.rn = t1.rn + 1
WHERE DATEDIFF(MINUTE, t1.hourImported, t2.hourImported) > 60
This only gives me the result:
[zone] [gap_start ] [gap_end ]
[1 ] [2018-04-02 13:00:00.000] [2018-04-02 17:00:00.000]
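So basically, if no logs at all were imported during April, the current implementation would not report any missing data (which is kind of wrong).
I'm thinking I need to somehow add new data points just before the start and at the end of April, so that the query catches the beginning and end of the month as missing data?
What would you smart guys/girls do?
Kind regards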
For this case, just add the initial and ending values, if appropriate:
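<your query here>
union all
-- Leading gap: from the start of April to the first imported hour.
-- The zone column is included so the column count matches the query above.
select 1, '2018-04-01 00:00:00', min(lit.hourImported)
from logsImportedTable lit
where lit.zone = 1
  and lit.hourImported >= '2018-04-01 00:00:00'
  and lit.hourImported < '2018-05-01 00:00:00'
having min(lit.hourImported) > '2018-04-01 00:00:00'
union all
-- Trailing gap: from the last imported hour to the start of May.
select 1, max(lit.hourImported), '2018-05-01 00:00:00'
from logsImportedTable lit
where lit.zone = 1
  and lit.hourImported >= '2018-04-01 00:00:00'
  and lit.hourImported < '2018-05-01 00:00:00'
having datediff(minute, max(lit.hourImported), '2018-05-01 00:00:00') > 60;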
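OK, thanks to @Gordon's great help, here is my final solution to the problem. It reports the gaps even when a whole month of data is missing, as well as all the smaller gaps within the month.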
DECLARE @zone INT = 1,
        @currentPeriodStart DATETIME = '2018-01-01',
        @currentPeriodEnd DATETIME = '2018-02-01';

WITH t AS (
    -- Number this zone's files inside the period in time order.
    SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY zone_id ORDER BY time_of_file_present)
    FROM test
    WHERE time_of_file_present >= @currentPeriodStart
      AND time_of_file_present < @currentPeriodEnd
      AND zone_id = @zone
)
-- Gaps between consecutive files that are more than an hour apart.
SELECT t1.zone_id, t1.time_of_file_present AS gap_start,
       t2.time_of_file_present AS gap_end
FROM t t1
INNER JOIN t t2 ON t2.zone_id = t1.zone_id AND t2.rn = t1.rn + 1
WHERE DATEDIFF(MINUTE, t1.time_of_file_present, t2.time_of_file_present) > 60
UNION ALL
-- Leading gap: period start up to the first file in the period.
SELECT @zone, @currentPeriodStart, MIN(t.time_of_file_present)
FROM t
HAVING MIN(t.time_of_file_present) > @currentPeriodStart
UNION ALL
-- Trailing gap: last file in the period up to the period end.
SELECT @zone, MAX(t.time_of_file_present), @currentPeriodEnd
FROM t
HAVING DATEDIFF(MINUTE, MAX(t.time_of_file_present), @currentPeriodEnd) > 60
UNION ALL
-- Whole period missing: no file at all for this zone in the period.
SELECT @zone, @currentPeriodStart, @currentPeriodEnd
FROM t
HAVING COUNT(*) = 0
ORDER BY gap_start;
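The idea raised in the question, adding artificial data points at both ends of the period, can also be sketched directly. The following is only a sketch under a few assumptions: it uses the table and column names from the question (logsImportedTable, zone, hourImported), it needs LEAD, so SQL Server 2012 or later, and the variable names @periodStart and @periodEnd and the CTE names points and ordered are made up for illustration. Unlike the queries above, it reports gap_start as the first missing hour, which is how the desired result in the question is written.

```sql
DECLARE @zone INT = 1,
        @periodStart DATETIME = '2018-04-01',
        @periodEnd   DATETIME = '2018-05-01';

WITH points AS (
    -- Real data points for this zone inside the period.
    SELECT hourImported
    FROM logsImportedTable
    WHERE zone = @zone
      AND hourImported >= @periodStart
      AND hourImported < @periodEnd
    UNION ALL
    -- Sentinel: the hour just before the period starts.
    SELECT DATEADD(HOUR, -1, @periodStart)
    UNION ALL
    -- Sentinel: the first hour after the period ends.
    SELECT @periodEnd
),
ordered AS (
    -- Pair every point with the next point in time order.
    SELECT hourImported,
           LEAD(hourImported) OVER (ORDER BY hourImported) AS nextHour
    FROM points
)
SELECT @zone AS zone,
       DATEADD(HOUR, 1, hourImported) AS gap_start,  -- first missing hour
       nextHour AS gap_end                           -- next present hour, or the period end
FROM ordered
WHERE DATEDIFF(MINUTE, hourImported, nextHour) > 60
ORDER BY gap_start;
```

With the four sample rows from the question, this should return the three ranges listed as the desired result. If the period contains no rows at all, only the two sentinels remain and a single gap covering the whole period comes back, so no separate branch is needed for the empty month.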