Can I calculate 4 distinct groups in SQL for my set?
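I have the following problem, which has to be solved in SQL. The database runs on SQL Server 2005.
Within a set (year and number), each group should be unique. For example, in the set below I should have 4 groups (1-4) instead of the groups (1-2).
I cannot update the table, and I cannot fix the application.
Can I calculate 4 distinct groups for my set, or is there a different way to solve this?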
year |number |row |group
2004 |1000 |1 |1
2004 |1000 |2 |1
2004 |1000 |3 |1
2004 |1000 |4 |1
2004 |1000 |5 |2
2004 |1000 |6 |2
2004 |1000 |7 |2
2004 |1000 |8 |2
2004 |1000 |9 |1
2004 |1000 |10 |1
2004 |1000 |11 |1
2004 |1000 |12 |1
2004 |1000 |13 |2
2004 |1000 |14 |2
2004 |1000 |15 |2
2004 |1000 |16 |2
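If I understand correctly, this can be done with window functions. For this particular data you could simply do arithmetic on the row number. More generally, this is a "gaps-and-islands" problem, which SQL Server 2005 can solve with window functions: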
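select t.*,
       -- one rank per distinct (group, difference) pair, i.e. per island
       dense_rank() over (partition by year, number
                          order by [group], seqnum_r - seqnum_rg) as grp
from (select t.*,
             -- position within the whole (year, number) set
             row_number() over (partition by year, number order by row) as seqnum_r,
             -- position among the rows sharing the same group value
             -- (group is a reserved word, so it is bracketed)
             row_number() over (partition by year, number, [group] order by row) as seqnum_rg
      from t
     ) t;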
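Understanding why this works is a bit tricky. The key point is that within each "island" of identical values, the difference between the two sequence numbers is constant. dense_rank() then simply turns each distinct (group, difference) pair into the final value.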
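To make the "constant difference" claim concrete, here is a small trace of the same arithmetic on the sample data. It is only an illustrative sketch in Python, not part of the SQL solution; the helper names `island_keys` and `dense_rank` are made up for this example.

```python
# Sample data from the question (one partition: year 2004, number 1000).
groups = [1] * 4 + [2] * 4 + [1] * 4 + [2] * 4
rows = list(enumerate(groups, start=1))  # (row, group) pairs, rows 1..16


def island_keys(rows):
    """Return (row, group, seqnum_r - seqnum_rg) for each row, ordered by row."""
    per_group_count = {}
    out = []
    for seqnum_r, (row, grp) in enumerate(sorted(rows), start=1):
        # seqnum_rg counts rows seen so far with the same group value.
        per_group_count[grp] = per_group_count.get(grp, 0) + 1
        seqnum_rg = per_group_count[grp]
        out.append((row, grp, seqnum_r - seqnum_rg))
    return out


def dense_rank(values):
    """Map each distinct value to its 1-based rank in sorted order."""
    return {v: i for i, v in enumerate(sorted(set(values)), start=1)}


keyed = island_keys(rows)
ranks = dense_rank((grp, diff) for _, grp, diff in keyed)
first_query = {row: ranks[(grp, diff)] for row, grp, diff in keyed}

# Differences per island: rows 1-4 -> 0, rows 5-8 -> 4, rows 9-12 -> 4, rows 13-16 -> 8
# Resulting grp:          rows 1-4 -> 1, rows 5-8 -> 3, rows 9-12 -> 2, rows 13-16 -> 4
for row, grp, diff in keyed:
    print(row, grp, diff, first_query[row])
```

Every island gets its own label, but because the ranking sorts by group value first, the labels do not follow the row order.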
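The query above produces four groups, but they may not be numbered in the right order. To fix that, use another round of window functions to get the minimum row value of each island and rank by it:
select t.*,
       -- number the islands by where each one starts
       dense_rank() over (partition by year, number order by minid) as grp
from (select t.*,
             -- first row of the island this row belongs to
             min(row) over (partition by year, number, [group], seqnum_r - seqnum_rg) as minid
      from (select t.*,
                   row_number() over (partition by year, number order by row) as seqnum_r,
                   row_number() over (partition by year, number, [group] order by row) as seqnum_rg
            from t
           ) t
     ) t;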
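As a sanity check on the ordering step, the same logic can be traced on the sample data. This is a hedged Python sketch rather than anything SQL Server runs; it assumes the island start is taken from the `row` column, and the `shortcut` formula at the end is my own guess at the "arithmetic on the row number" idea, valid only if every run is exactly four rows long.

```python
# Sample data from the question (one partition: year 2004, number 1000).
groups = [1] * 4 + [2] * 4 + [1] * 4 + [2] * 4
rows = list(enumerate(groups, start=1))  # (row, group) pairs, rows 1..16

# Island key per row: (group, seqnum_r - seqnum_rg).
seen = {}
island_of = {}
for seqnum_r, (row, grp) in enumerate(rows, start=1):
    seen[grp] = seen.get(grp, 0) + 1
    island_of[row] = (grp, seqnum_r - seen[grp])

# Smallest row number in each island, mirroring the minid column.
min_row = {}
for row, key in island_of.items():
    min_row[key] = min(row, min_row.get(key, row))

# dense_rank() ordered by that minimum numbers the islands in row order.
rank_of = {m: i for i, m in enumerate(sorted(set(min_row.values())), start=1)}
final_grp = {row: rank_of[min_row[key]] for row, key in island_of.items()}

# Assumption: runs of exactly four rows, so plain integer arithmetic suffices.
shortcut = {row: (row - 1) // 4 + 1 for row, _ in rows}

print(final_grp)
```

On this data both approaches agree, giving groups 1 to 4 in row order. The shortcut breaks as soon as run lengths vary, which is why the window-function version is the more general one.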