Can I calculate 4 distinct groups in SQL for my set?

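I have the following problem, which I have to solve in SQL. The database runs on SQL Server 2005. Within a set (year and number), each group should be unique: in the set below I should have 4 groups (1-4) instead of groups (1-2). I cannot update the table, and I cannot fix the application.

Can I calculate 4 distinct groups for my set, or is there a different way to solve this?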

year | number | row | group
2004 | 1000   | 1   | 1
2004 | 1000   | 2   | 1
2004 | 1000   | 3   | 1
2004 | 1000   | 4   | 1
2004 | 1000   | 5   | 2
2004 | 1000   | 6   | 2
2004 | 1000   | 7   | 2
2004 | 1000   | 8   | 2
2004 | 1000   | 9   | 1
2004 | 1000   | 10  | 1
2004 | 1000   | 11  | 1
2004 | 1000   | 12  | 1
2004 | 1000   | 13  | 2
2004 | 1000   | 14  | 2
2004 | 1000   | 15  | 2
2004 | 1000   | 16  | 2

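If I understand correctly, you can use window functions. For this particular problem, arithmetic on the row number would be enough. More generally, this is a "gaps-and-islands" problem, and in SQL Server 2005 you can solve it with window functions: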

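-- [group] is bracketed because GROUP is a reserved word in T-SQL
select t.*,
       -- each island is identified by ([group], difference of the two counters)
       dense_rank() over (partition by year, number order by [group], seqnum_r - seqnum_rg) as grp
from (select t.*,
             -- position within the whole (year, number) set
             row_number() over (partition by year, number order by [row]) as seqnum_r,
             -- position among the rows that share the same [group] value
             row_number() over (partition by year, number, [group] order by [row]) as seqnum_rg
      from t
     ) t;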

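Understanding why this works is a bit tricky. The key point is that within each "island" of consecutive rows with the same group value, the difference between the two sequence numbers is constant, because both counters advance by one on every row of the island. The pair (group, difference) therefore identifies an island, and dense_rank() simply turns it into the final value.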
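To make the constant difference visible, the inner query can be run on its own with the difference as an extra column. This is only a sketch: it assumes the table is named t with the columns shown in the question, and the aliases s and diff are introduced here purely for illustration.

```sql
-- Expose both sequence numbers and their difference for every row.
select s.[row], s.[group], s.seqnum_r, s.seqnum_rg,
       s.seqnum_r - s.seqnum_rg as diff
from (select t.*,
             row_number() over (partition by year, number order by [row]) as seqnum_r,
             row_number() over (partition by year, number, [group] order by [row]) as seqnum_rg
      from t
     ) s
order by s.[row];

-- For the sample data in the question:
--   rows  1-4  (group 1): seqnum_r  1-4,  seqnum_rg 1-4, diff 0
--   rows  5-8  (group 2): seqnum_r  5-8,  seqnum_rg 1-4, diff 4
--   rows  9-12 (group 1): seqnum_r  9-12, seqnum_rg 5-8, diff 4
--   rows 13-16 (group 2): seqnum_r 13-16, seqnum_rg 5-8, diff 8
```

Note that diff alone is not enough (rows 5-8 and 9-12 both have 4), which is why the group value has to be part of the island key.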

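The first query produces four groups, but they may not be numbered in row order (for the sample data, rows 9-12 would get 2 and rows 5-8 would get 3). To fix that, use another round of window functions to get the minimum row value of each island, and rank by that:

select t.*,
       -- number the islands by the position of their first row
       dense_rank() over (partition by year, number order by minrow) as grp
from (select t.*,
             -- first row of each island; [group] plus the difference identifies the island
             min([row]) over (partition by year, number, [group], seqnum_r - seqnum_rg) as minrow
      from (select t.*,
                   row_number() over (partition by year, number order by [row]) as seqnum_r,
                   row_number() over (partition by year, number, [group] order by [row]) as seqnum_rg
            from t
           ) t
     ) t;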
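For the sample data specifically, the row-number arithmetic mentioned at the start of this answer could look like the following. It is a sketch under assumptions that the question does not confirm: that [row] is an integer column, that it starts at 1 within each (year, number), and that every island is exactly 4 rows long. If any of that does not hold, the window-function queries are the safer choice.

```sql
-- Assumes islands of exactly 4 consecutive integer [row] values, starting at 1.
-- Integer division maps rows 1-4 to 0, 5-8 to 1, and so on; adding 1 makes it 1-based.
select t.*,
       ([row] - 1) / 4 + 1 as grp
from t;

-- For the sample data: rows 1-4 -> 1, 5-8 -> 2, 9-12 -> 3, 13-16 -> 4
```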