SQL query to find the correct start/stop pairs of running engines
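I have data in an existing table that contains events such as active (A) and inactive (I). It works like a log that records whenever a component becomes active or inactive. Because of an old interface, the events do not always come in proper pairs for each component.
Here is a short sample of the data: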
"id" "component_number" "timestamp" "status"
"1" "1" "2020-05-10 16:30:00" "A"
"2" "1" "2020-05-18 16:34:05" "A"
"3" "1" "2020-05-19 16:36:01" "I"
"4" "1" "2020-05-19 16:36:52" "A"
"5" "1" "2020-05-19 16:38:57" "I"
"6" "2" "2020-05-11 17:04:50" "A"
"7" "2" "2020-05-15 10:00:00" "A"
"8" "2" "2020-05-16 11:25:16" "I"
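For example, engine 1 started (active) at 2020-05-10 16:30:00 and stopped (inactive) at 2020-05-19 16:36:01. However, I got an additional active entry at 2020-05-18 16:34:05.
I need to find the correct pairs that mark when an engine was running. In the example, that would be 2020-05-10 16:30:00 and 2020-05-19 16:36:01. The list is not limited to one engine; it can contain several.
I am looking for a query that returns the correct pairs (result 1) or one that returns only the required events (result 2). I don't know which is easier.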
Result 1:
"component_number" "start" "end"
"1" "2020-05-10 16:30:00" "2020-05-19 16:36:01"
"1" "2020-05-19 16:36:52" "2020-05-19 16:38:57"
"2" "2020-05-11 17:04:50" "2020-05-16 11:25:16"
Result 2:
"id" "component_number" "timestamp" "status"
"1" "1" "2020-05-10 16:30:00" "A"
"3" "1" "2020-05-19 16:36:01" "I"
"4" "1" "2020-05-19 16:36:52" "A"
"5" "1" "2020-05-19 16:38:57" "I"
"6" "2" "2020-05-11 17:04:50" "A"
"8" "2" "2020-05-16 11:25:16" "I"
I tried using subqueries and joins, but without success. Does anyone have an idea or a hint on how to approach this?
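This is a gaps-and-islands problem. I would suggest using lag() and a window sum() to define the groups. Basically, a new group starts at every 'A' that is directly preceded by an 'I'.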
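To make the grouping rule concrete, here is a minimal Python sketch that mimics what lag() and the running sum() compute for the sample rows. The function name `label_groups` and the tuple layout are illustrative assumptions, not part of the original answer.

```python
from itertools import groupby

# Sample rows copied from the question: (id, component_number, timestamp, status)
rows = [
    (1, 1, "2020-05-10 16:30:00", "A"),
    (2, 1, "2020-05-18 16:34:05", "A"),
    (3, 1, "2020-05-19 16:36:01", "I"),
    (4, 1, "2020-05-19 16:36:52", "A"),
    (5, 1, "2020-05-19 16:38:57", "I"),
    (6, 2, "2020-05-11 17:04:50", "A"),
    (7, 2, "2020-05-15 10:00:00", "A"),
    (8, 2, "2020-05-16 11:25:16", "I"),
]

def label_groups(rows):
    """Return (id, component_number, status, lag_status, grp) per row."""
    labelled = []
    # Same ordering as "partition by component_number order by timestamp";
    # ISO-formatted timestamps sort correctly as plain strings.
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))
    for component, events in groupby(ordered, key=lambda r: r[1]):
        lag_status = None  # no previous row at the start of a partition
        grp = 0            # the running sum restarts for each component
        for row_id, _, _, status in events:
            if status == "A" and lag_status == "I":
                grp += 1   # an 'A' right after an 'I' opens a new island
            labelled.append((row_id, component, status, lag_status, grp))
            lag_status = status
    return labelled

for line in label_groups(rows):
    print(line)
```

For component 1, rows 1 to 3 end up in group 0 and rows 4 and 5 in group 1, so taking the minimum and maximum timestamp per group yields one start/stop pair each.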
This gives you the first result set:
select
    component_number,
    min(timestamp) start_timestamp,
    max(timestamp) end_timestamp
from (
    select
        t.*,
        -- running count of 'A' rows that directly follow an 'I' row: the island number
        sum(case when status = 'A' and lag_status = 'I' then 1 else 0 end)
            over(partition by component_number order by timestamp) grp
    from (
        select
            t.*,
            -- status of the previous event for the same component
            lag(status)
                over(partition by component_number order by timestamp) lag_status
        from mytable t
    ) t
) t
group by component_number, grp
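One case the sample data does not cover is an engine that is still running, that is, a group with no closing 'I' yet. With max(timestamp), such a group would report its last 'A' as the end. The sketch below is a possible variant, not the original answer's query: it takes the end only from 'I' rows, so an open run gets a NULL end. It runs against an in-memory SQLite database through Python's sqlite3 module, assuming the bundled SQLite is 3.25 or newer (window function support); the fourth row is a hypothetical addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (id integer, component_number integer, "
    "timestamp text, status text)"
)
conn.executemany(
    "insert into mytable values (?, ?, ?, ?)",
    [
        (1, 1, "2020-05-10 16:30:00", "A"),
        (2, 1, "2020-05-18 16:34:05", "A"),
        (3, 1, "2020-05-19 16:36:01", "I"),
        (4, 1, "2020-05-19 16:36:52", "A"),  # hypothetical: no closing 'I' yet
    ],
)

query = """
select
    component_number,
    min(timestamp) as start_timestamp,
    -- only 'I' rows can close a run; a run without one yields NULL
    max(case when status = 'I' then timestamp end) as end_timestamp
from (
    select
        t.*,
        sum(case when status = 'A' and lag_status = 'I' then 1 else 0 end)
            over(partition by component_number order by timestamp) as grp
    from (
        select
            t.*,
            lag(status)
                over(partition by component_number order by timestamp) as lag_status
        from mytable t
    ) t
) t
group by component_number, grp
order by component_number, start_timestamp
"""
open_pairs = conn.execute(query).fetchall()
print(open_pairs)
# [(1, '2020-05-10 16:30:00', '2020-05-19 16:36:01'), (1, '2020-05-19 16:36:52', None)]
```

Whether an open run should appear with a NULL end or be filtered out depends on how the result is consumed; adding `having end_timestamp is not null` style filtering would be one way to drop it.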
The second result set requires less nesting:
select id, component_number, timestamp, status
from (
    select
        t.*,
        lag(status)
            over(partition by component_number order by timestamp) lag_status
    from mytable t
) t
where status = 'I' or lag_status is null or lag_status = 'I'
Demo on DB Fiddle (MariaDB 10.3):
component_number | start_timestamp | end_timestamp
---------------: | :------------------ | :------------------
1 | 2020-05-10 16:30:00 | 2020-05-19 16:36:01
1 | 2020-05-19 16:36:52 | 2020-05-19 16:38:57
2 | 2020-05-11 17:04:50 | 2020-05-16 11:25:16

id | component_number | timestamp | status
-: | ---------------: | :------------------ | :-----
1 | 1 | 2020-05-10 16:30:00 | A
3 | 1 | 2020-05-19 16:36:01 | I
4 | 1 | 2020-05-19 16:36:52 | A
5 | 1 | 2020-05-19 16:38:57 | I
6 | 2 | 2020-05-11 17:04:50 | A
8 | 2 | 2020-05-16 11:25:16 | I
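
For anyone who wants to check the second query without MariaDB, the following sketch runs it against the question's sample rows in an in-memory SQLite database via Python's sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (window function support); the `order by` clause and the `kept_ids` name are additions for a deterministic, easy-to-compare result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (id integer, component_number integer, "
    "timestamp text, status text)"
)
# Sample rows copied from the question
conn.executemany(
    "insert into mytable values (?, ?, ?, ?)",
    [
        (1, 1, "2020-05-10 16:30:00", "A"),
        (2, 1, "2020-05-18 16:34:05", "A"),
        (3, 1, "2020-05-19 16:36:01", "I"),
        (4, 1, "2020-05-19 16:36:52", "A"),
        (5, 1, "2020-05-19 16:38:57", "I"),
        (6, 2, "2020-05-11 17:04:50", "A"),
        (7, 2, "2020-05-15 10:00:00", "A"),
        (8, 2, "2020-05-16 11:25:16", "I"),
    ],
)

query = """
select id, component_number, timestamp, status
from (
    select
        t.*,
        lag(status)
            over(partition by component_number order by timestamp) as lag_status
    from mytable t
) t
-- keep every 'I', plus each 'A' that is first in its partition or follows an 'I'
where status = 'I' or lag_status is null or lag_status = 'I'
order by component_number, timestamp
"""
kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)  # [1, 3, 4, 5, 6, 8]
```

Rows 2 and 7 are dropped because each is an 'A' that follows another 'A', which matches the second result table above.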