How to group consecutive rows?
So, I have a table with rows like the following:
Ev_Message       Ev_Comment             EV_Custom1          Ev_Time_Ms
---------------  ---------------------  ------------------  ----------
Machine 1 Alarm  5/23/2016 11:02:00 AM  Alarms Scanned      25
Machine 1 Alarm  5/23/2016 11:00:00 AM  Alarms Scanned      686
Machine 1 Alarm  5/23/2016 11:00:00 AM  Light curtain       537
Machine 1 Alarm  5/23/2016 11:00:00 AM  Guard door open     346
Machine 1 Alarm  5/23/2016 11:00:00 AM  No control voltage  135
Machine 1 Alarm  5/23/2016 10:38:34 AM  Alarms Scanned      269
Machine 1 Alarm  5/23/2016 10:38:29 AM  Alarms Scanned      378
Machine 1 Alarm  5/23/2016 10:38:29 AM  Guard door open     156
Machine 1 Alarm  5/23/2016 10:38:25 AM  Alarms Scanned      654
Not an Alarm     5/23/2016 10:38:25 AM  Not an Alarm        467
Machine 1 Alarm  5/23/2016 10:38:25 AM  Guard door open     234
Machine 1 Alarm  5/23/2016 10:38:25 AM  No control voltage  67
Machine 1 Alarm  5/23/2016 10:38:23 AM  Alarms Scanned      124
Machine 1 Alarm  5/23/2016 10:38:23 AM  No control voltage  100
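An "Alarms Scanned" row is added every time the alarms are scanned, which happens whenever an alarm is triggered or cleared. Any alarm adds a row with its specific Ev_Custom1 value. The first column, Ev_Message, contains a machine ID that lets me separate alarms from different machines. (Don't you just love arbitrary column names?) There are over nine hundred unique alarm messages.
I would like my query to return something like this: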
Alarm Message       Alarm Start Time       Alarm Stop Time
------------------  ---------------------  ---------------------
No control voltage  5/23/2016 10:38:23 AM  5/23/2016 10:38:29 AM
Guard door open     5/23/2016 10:38:25 AM  5/23/2016 10:38:34 AM
No control voltage  5/23/2016 11:00:00 AM  5/23/2016 11:02:00 AM
Guard door open     5/23/2016 11:00:00 AM  5/23/2016 11:02:00 AM
Light curtain       5/23/2016 11:00:00 AM  5/23/2016 11:02:00 AM
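This would be a query filtered between two dates. I have some ability to change the data going into the table, but with 900 alarms my freedom is limited.
With some help, my current query is this: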
WITH T AS (
    SELECT s.Ev_Comment AS start_time,
           MIN(COALESCE(e.Ev_Comment, s.Ev_Comment)) AS end_time
    FROM A AS s
    INNER JOIN A AS e
        ON s.Ev_Comment < e.Ev_Comment
       AND s.Ev_Custom1 = 'Alarms Scanned'
       AND e.Ev_Custom1 = 'Alarms Scanned'
    GROUP BY s.Ev_Comment)
SELECT T_1.start_time,
       T_1.end_time,
       A.Ev_Custom1
FROM A
INNER JOIN T AS T_1
    ON A.Ev_Comment LIKE T_1.start_time
WHERE (A.Ev_Custom1 <> 'Alarms Scanned')
I still have one problem. If an alarm lasts longer than one period, for example 'Guard door open' from 10:38:25 to 10:38:34, it shows up in two separate rows, like this:
start_time             end_time               EV_Custom1
---------------------  ---------------------  ---------------
5/23/2016 10:38:25 AM  5/23/2016 10:38:29 AM  Guard door open
5/23/2016 10:38:29 AM  5/23/2016 10:38:34 AM  Guard door open
Ideally, what I want is:
start_time             end_time               EV_Custom1
---------------------  ---------------------  ---------------
5/23/2016 10:38:25 AM  5/23/2016 10:38:34 AM  Guard door open
I think I need to GROUP BY ((Ev_Custom1) AND (when end_time = start_time)) (please excuse my pseudocode), but I don't know the syntax well enough to write it.
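If I understand the posted question correctly, your CTE effectively determines the time buckets (or intervals) for all alarms. Your final SELECT then joins the actual alarm information to those intervals.
Part of your problem is that if an alarm stays active for a long time (longer than your alarm scan period, I assume), your alarm system keeps logging 'Alarms Scanned' entries, which effectively splits the active alarm into several rows.
If you have SQL Server 2012 or later, it is relatively easy to determine whether an alarm event was split. You only need to check whether the end time of one alarm equals the start time of the next alarm of the same type. In SQL Server 2012 and later you can do this with the LAG window function.
The next step is to generate an ID that you can group the alarms by, so that you can combine the split events. This is done with a SUM ... OVER clause: a running sum of the "is new event" flag gives every row of the same event the same ID.
The following example shows how to do this: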
;WITH AlarmTimeBuckets AS
(
    -- One row per machine and scan: the interval from this 'Alarms Scanned'
    -- entry to the next one for the same machine.
    SELECT EventStart.Ev_Comment AS StartDateTime
          ,MIN(COALESCE(EventEnd.Ev_Comment, EventStart.Ev_Comment)) AS EndDateTime
          ,EventStart.Ev_Message AS Machine
    FROM A EventStart
    INNER JOIN A EventEnd
        ON  EventStart.Ev_Comment < EventEnd.Ev_Comment
        AND EventStart.Ev_Custom1 = 'Alarms Scanned'
        AND EventEnd.Ev_Custom1 = 'Alarms Scanned'
        AND EventStart.Ev_Message = EventEnd.Ev_Message
    GROUP BY EventStart.Ev_Message, EventStart.Ev_Comment
),
AlarmsByTimeBucket AS
(
    -- Attach each alarm row to its interval. IsNewEvent is 0 when the row
    -- starts exactly where the previous interval of the same alarm on the
    -- same machine ended, and 1 otherwise.
    SELECT AlarmTimeBuckets.Machine
          ,AlarmTimeBuckets.StartDateTime
          ,AlarmTimeBuckets.EndDateTime
          ,Alarm.Ev_Custom1 AS Alarm
          ,(
            CASE
                WHEN LAG(AlarmTimeBuckets.EndDateTime, 1, NULL)
                         OVER (PARTITION BY Alarm.Ev_Custom1, Alarm.Ev_Message
                               ORDER BY AlarmTimeBuckets.StartDateTime) = AlarmTimeBuckets.StartDateTime THEN 0
                ELSE 1
            END
           ) AS IsNewEvent
    FROM A Alarm
    INNER JOIN AlarmTimeBuckets
        ON  Alarm.Ev_Message = AlarmTimeBuckets.Machine
        AND Alarm.Ev_Comment = AlarmTimeBuckets.StartDateTime
    WHERE (Alarm.Ev_Custom1 <> 'Alarms Scanned')
),
AlarmsByGroupingID AS
(
    -- The running sum of IsNewEvent only increases when a new event starts,
    -- so all pieces of one split event share the same GroupingID.
    SELECT Machine
          ,StartDateTime
          ,EndDateTime
          ,Alarm
          ,SUM(IsNewEvent) OVER (ORDER BY Machine, Alarm, StartDateTime) AS GroupingID
    FROM AlarmsByTimeBucket
)
-- Collapse each group into one row spanning its earliest start and latest end.
SELECT MAX(Machine) AS Machine
      ,MIN(StartDateTime) AS StartDateTime
      ,MAX(EndDateTime) AS EndDateTime
      ,MAX(Alarm) AS Alarm
FROM AlarmsByGroupingID
GROUP BY GroupingID
ORDER BY StartDateTime
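To make the LAG and SUM OVER steps easier to follow outside of SQL, here is a minimal Python sketch of the same flag-and-running-sum idea. It is only an illustration, not the query above: the `rows` list, the `ts` helper and the `merge_split_alarms` function are hypothetical names, the input mimics the per-interval rows the asker's current query would produce for the sample data, and a single machine is assumed.

```python
from datetime import datetime


def ts(text):
    """Parse a time of day on 2016-05-23, the date used in the sample data."""
    return datetime.strptime("2016-05-23 " + text, "%Y-%m-%d %H:%M:%S")


# Hypothetical input: (alarm, start, end), one row per alarm per scan interval.
rows = [
    ("No control voltage", ts("10:38:23"), ts("10:38:25")),
    ("No control voltage", ts("10:38:25"), ts("10:38:29")),
    ("Guard door open", ts("10:38:25"), ts("10:38:29")),
    ("Guard door open", ts("10:38:29"), ts("10:38:34")),
    ("No control voltage", ts("11:00:00"), ts("11:02:00")),
    ("Guard door open", ts("11:00:00"), ts("11:02:00")),
    ("Light curtain", ts("11:00:00"), ts("11:02:00")),
]


def merge_split_alarms(rows):
    """Collapse back-to-back intervals of the same alarm into one event."""
    # Equivalent of PARTITION BY alarm ORDER BY start.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    groups = {}      # group id -> [alarm, earliest start, latest end]
    group_id = 0
    previous = None
    for alarm, start, end in ordered:
        # LAG step: does this row start exactly where the previous row
        # of the same alarm ended?
        continues = (
            previous is not None
            and previous[0] == alarm
            and previous[2] == start
        )
        if continues:
            # Same group id: widen the event (MIN(start), MAX(end)).
            groups[group_id][1] = min(groups[group_id][1], start)
            groups[group_id][2] = max(groups[group_id][2], end)
        else:
            # Running sum of the "is new event" flag: start a new group.
            group_id += 1
            groups[group_id] = [alarm, start, end]
        previous = (alarm, start, end)
    # Order the merged events by start time, then alarm name.
    return sorted((tuple(g) for g in groups.values()), key=lambda e: (e[1], e[0]))


for alarm, start, end in merge_split_alarms(rows):
    print(alarm, start.time(), end.time())
```

For these sample rows the sketch yields five events, with 'Guard door open' reported once from 10:38:25 to 10:38:34 instead of as two rows.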
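I updated your sqlfiddle link with the changes below. In your final result set, you need to add a ROW_NUMBER and join the set back to itself on EV_CUSTOM1 and START_TIME = END_TIME (as you suspected), plus row number = previous row number + 1. That is how you determine whether two rows belong to the same event. If you are using SQL Server 2012+, you can use the LAG/LEAD functions, as @EdmondQuinton points out in his answer, which is a bit simpler.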
WITH T AS (
    SELECT s.Ev_Comment AS start_time,
           MIN(COALESCE(e.Ev_Comment, s.Ev_Comment)) AS end_time
    FROM A AS s
    INNER JOIN A AS e
        ON s.Ev_Comment < e.Ev_Comment
       AND s.Ev_Custom1 = 'Alarms Scanned'
       AND e.Ev_Custom1 = 'Alarms Scanned'
    GROUP BY s.Ev_Comment
),
T2 AS (
    SELECT T_1.start_time, T_1.end_time, A.Ev_Custom1,
           ROW_NUMBER() OVER (PARTITION BY EV_CUSTOM1 ORDER BY T_1.START_TIME) RN
    FROM A
    INNER JOIN T AS T_1
        ON A.Ev_Comment LIKE T_1.start_time
    WHERE (A.Ev_Custom1 <> 'Alarms Scanned')
)
SELECT COALESCE(b.START_TIME, a.START_TIME) START_TIME,
       MAX(a.END_TIME) END_TIME,
       a.EV_CUSTOM1
FROM T2 a
LEFT OUTER JOIN T2 b
    ON a.EV_CUSTOM1 = b.EV_CUSTOM1
   AND a.START_TIME = b.END_TIME
   AND a.RN = b.RN + 1
GROUP BY COALESCE(b.START_TIME, a.START_TIME),
         a.EV_CUSTOM1
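The pairing logic of this ROW_NUMBER self-join can be sketched in Python as follows. This is only an illustration under stated assumptions, not a translation the answer provides: `merge_with_previous_row` and `sample` are hypothetical names, and times are plain "HH:MM:SS" strings, which compare correctly only because they share one fixed format.

```python
from collections import defaultdict


def merge_with_previous_row(rows):
    """rows: (alarm, start, end) tuples, one per alarm per scan interval."""
    # ROW_NUMBER() OVER (PARTITION BY alarm ORDER BY start)
    counters = defaultdict(int)
    numbered = []
    for alarm, start, end in sorted(rows):
        counters[alarm] += 1
        numbered.append((alarm, start, end, counters[alarm]))

    merged = {}
    for alarm, start, end, rn in numbered:
        # LEFT JOIN: find the row b of the same alarm that ends where this
        # row starts and carries the previous row number.
        match = [
            b for b in numbered
            if b[0] == alarm and b[2] == start and b[3] + 1 == rn
        ]
        # COALESCE(b.start, a.start): use the predecessor's start if found.
        group_start = match[0][1] if match else start
        key = (group_start, alarm)
        # MAX(a.end) within each (start, alarm) group.
        merged[key] = max(merged.get(key, end), end)
    return sorted((alarm, start, end) for (start, alarm), end in merged.items())


# Hypothetical sample: one alarm split across two scan intervals.
sample = [
    ("Guard door open", "10:38:25", "10:38:29"),
    ("Guard door open", "10:38:29", "10:38:34"),
    ("No control voltage", "10:38:23", "10:38:25"),
]
for event in merge_with_previous_row(sample):
    print(event)
```

Because each row is matched only against its immediate predecessor, a single pass like this joins pairs of rows. An alarm split across three or more scan intervals is not fully collapsed by one pass, which is one reason the running-sum approach may be preferable where LAG is available.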