Islands and Gaps algorithm does not produce a globally unique id for each island and gap
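I am using the standard Islands and Gaps algorithm to find blocks of consecutive values (1 or 0). The ProductionState column indicates periods of production or non-production, based on readings from a sensor attached to the machine. The relevant step is contained in this common table expression: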
-- Production state islands with unique Id
production_state_03( Timestamp, ProductionState, ProductionStateIslandId ) as
(
select
Timestamp,
ProductionState,
row_number() over ( order by Timestamp ) - row_number() over ( partition by ProductionState order by ProductionState )
from production_state_02
)
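The result looks like the following table:
The problem is that the ProductionStateIslandId of each island or gap is not necessarily globally unique, which causes errors in later analysis steps. Is there a different way to compute islands and gaps that always yields globally unique Id values?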
This expression:
row_number() over ( partition by ProductionState order by ProductionState )
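makes no sense. Ordering by the partition column means every row in a partition ties, so all it does is produce numbers that look ordered but are in reality arbitrary.
Your gaps are unusual in that they are not really gaps: the rows with value 0 still exist. Perhaps a conditional sum would help: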
row_number() over ( order by Timestamp ) - sum(ProductionState) over (order by Timestamp)
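To see what the conditional-sum expression yields, here is a small Python sketch that imitates the two window functions on a made-up sequence. The sample values and variable names are hypothetical, and timestamps are assumed to be unique.

```python
from itertools import accumulate

# Hypothetical sample: ProductionState values already ordered by Timestamp.
states = [1, 1, 0, 0, 1, 1]

# Imitates row_number() over (order by Timestamp).
row_numbers = range(1, len(states) + 1)

# Imitates sum(ProductionState) over (order by Timestamp): a running total.
running_sum = accumulate(states)

# The expression from the answer: row number minus running total.
conditional_ids = [rn - total for rn, total in zip(row_numbers, running_sum)]

print(conditional_ids)
```

On this sample the value stays constant within each run of 1s but increases on every 0 row, so it appears to label the islands only; it would probably need to be combined with a filter such as ProductionState = 1.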
The second row_number should also be ordered by the timestamp.
row_number() over (order by [Timestamp])
- row_number() over (partition by ProductionState
order by [Timestamp])
or
row_number() over (order by [Timestamp])
+ row_number() over (partition by ProductionState
order by [Timestamp] DESC)
But that correction will not make it globally unique.
Another way to calculate such a ranking is to sum a change flag.
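A short Python sketch may show why the row-number difference can repeat across islands. It imitates the two row_number() calls on a made-up sequence; the sample values and variable names are hypothetical, and timestamps are assumed to be unique.

```python
from collections import defaultdict

# Hypothetical sample: ProductionState values already ordered by Timestamp.
states = [1, 1, 0, 0, 1, 1]

# Imitates row_number() over (partition by ProductionState order by Timestamp).
per_state_count = defaultdict(int)

ids = []
for rn, state in enumerate(states, start=1):  # rn imitates the global row_number()
    per_state_count[state] += 1
    ids.append(rn - per_state_count[state])

# The 0-block and the second 1-block end up with the same difference.
print(ids)

# Pairing the difference with the state tells the three blocks apart.
distinct_pairs = set(zip(states, ids))
print(len(distinct_pairs))
```

In this sample the difference alone collides, while the pair (ProductionState, difference) does not, which suggests grouping by both columns whenever this form of the calculation is kept.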
production_state_03 ([Timestamp], ProductionState, ProductionStateIslandId) as
(
select [Timestamp], ProductionState
, rnk = SUM(flag) over (order by [Timestamp])
from
(
select [Timestamp], ProductionState
, flag = IIF(ProductionState = LAG(ProductionState) over (order by [Timestamp]), 0, 1)
from production_state_02
) q
)
This gaps-and-islands technique does need an extra subquery, but the ranking is consecutive.
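The change-flag idea can be sketched in Python as follows. This is only an illustration of the logic on a made-up sequence, not a replacement for the query; the sample values and variable names are hypothetical.

```python
from itertools import accumulate

# Hypothetical sample: ProductionState values already ordered by Timestamp.
states = [1, 1, 0, 0, 1, 1]

# Imitates LAG(ProductionState): the previous row's value, None (NULL) for the first row.
previous = [None] + states[:-1]

# Imitates IIF(state = LAG(state), 0, 1): 1 whenever the state changes.
# The first row compares against NULL, which is not true, so it also gets 1.
flags = [0 if state == prev else 1 for state, prev in zip(states, previous)]

# Imitates SUM(flag) over (order by Timestamp): a running total of the flags.
island_ids = list(accumulate(flags))

print(island_ids)
```

Each change of state adds one to the running total, so every block receives its own id, and the ids come out consecutive in timestamp order.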