How can one create an iterator in SQL that counts through rows as if they are in a set?
I have been looking for a way to do this in a single UPDATE statement, but without success.
Here is a sample of the data set I am working with:
+-------------------------+----------+--------------+----+--------+
| TIMESTAMP | USERNAME | VALUE | ID | IsDupe |
+-------------------------+----------+--------------+----+--------+
| 2020-02-12 07:00:03.000 | LINA | ORDER1 | 1 | 0 |
| 2020-02-12 07:00:03.000 | LINA | ITEM1 | 2 | 0 |
| 2020-02-12 07:09:09.000 | LINA | FINISH BUILD | 3 | 0 |
| 2020-02-12 07:09:10.000 | LINA | ORDER1 | 4 | 0 |
| 2020-02-12 07:09:11.000 | LINA | ITEM2 | 5 | 0 |
| 2020-02-12 07:24:07.000 | LINA | FINISH BUILD | 6 | 0 |
| 2020-02-12 07:24:08.000 | NAGA | ORDER2 | 7 | 0 |
| 2020-02-12 07:24:10.000 | NAGA | ITEM3 | 8 | 0 |
| 2020-02-12 07:45:06.000 | NAGA | FINISH BUILD | 9 | 0 |
| 2020-02-12 07:45:12.000 | NAGA | FINISH BUILD | 10 | 1 |
| 2020-02-12 07:45:13.000 | XELLOS | ORDER3 | 11 | 0 |
| 2020-02-12 07:45:14.000 | XELLOS | ITEM4 | 12 | 0 |
| 2020-02-12 07:56:36.000 | XELLOS | FINISH BUILD | 13 | 0 |
| 2020-02-12 07:56:39.000 | GOURRY | ORDER4 | 14 | 0 |
| 2020-02-12 07:56:40.000 | GOURRY | ITEM5 | 15 | 0 |
| 2020-02-12 08:30:11.000 | GOURRY | FINISH BUILD | 17 | 0 |
+-------------------------+----------+--------------+----+--------+
What I want to do is create an additional column that acts as an iterator, dividing these rows into sets of three, like this:
+-------------------------+----------+--------------+-------+--------+-------+
| TIMESTAMP | USERNAME | VALUE | IDCol | IsDupe | SetID |
+-------------------------+----------+--------------+-------+--------+-------+
| 2020-02-12 07:00:03.000 | LINA | ORDER1 | 1 | 0 | 1 |
| 2020-02-12 07:00:03.000 | LINA | ITEM1 | 2 | 0 | 1 |
| 2020-02-12 07:09:09.000 | LINA | FINISH BUILD | 3 | 0 | 1 |
| 2020-02-12 07:09:10.000 | LINA | ORDER1 | 4 | 0 | 2 |
| 2020-02-12 07:09:11.000 | LINA | ITEM2 | 5 | 0 | 2 |
| 2020-02-12 07:24:07.000 | LINA | FINISH BUILD | 6 | 0 | 2 |
| 2020-02-12 07:24:08.000 | NAGA | ORDER2 | 7 | 0 | 3 |
| 2020-02-12 07:24:10.000 | NAGA | ITEM3 | 8 | 0 | 3 |
| 2020-02-12 07:45:06.000 | NAGA | FINISH BUILD | 9 | 0 | 3 |
| 2020-02-12 07:45:12.000 | NAGA | FINISH BUILD | 10 | 1 | NULL |
| 2020-02-12 07:45:13.000 | XELLOS | ORDER3 | 11 | 0 | 4 |
| 2020-02-12 07:45:14.000 | XELLOS | ITEM4 | 12 | 0 | 4 |
| 2020-02-12 07:56:36.000 | XELLOS | FINISH BUILD | 13 | 0 | 4 |
| 2020-02-12 07:56:39.000 | GOURRY | ORDER4 | 14 | 0 | 5 |
| 2020-02-12 07:56:40.000 | GOURRY | ITEM5 | 15 | 0 | 5 |
| 2020-02-12 08:30:11.000 | GOURRY | FINISH BUILD | 17 | 0 | 5 |
+-------------------------+----------+--------------+-------+--------+-------+
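I have looked into iterative statements in SQL, but I have serious performance concerns: this will be a relatively large data set, and the statement needs to run during the day, where it affects production.
Note also that the data set may contain duplicates or other errors. The statement must ignore rows where IsDupe is set to 1.
I have been trying to build a cursor to do this, but I have run into a number of syntax issues, along with a general lack of experience writing cursors: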
DECLARE @MyCursor CURSOR;
DECLARE @SetID INT;
DECLARE @OUTPUTNUM TINYINT;
DECLARE @COUNTER TINYINT;
BEGIN
SET @MyCursor = CURSOR LOCAL FAST_FORWARD FOR
SELECT IsDupe from dbo.MyDataTable
WHERE IsDupe != 1
OPEN @MyCursor
FETCH NEXT FROM @MyCursor INTO @SetID
WHILE @@FETCH_STATUS = 0 BEGIN
SET @COUNTER = 0;
SET @OUTPUTNUM = 1;
WHILE @COUNTER < 3
BEGIN
UPDATE dbo.MyDataTable SET dbo.MyDataTable.SetID = @OUTPUTNUM
SET @COUNTER = @COUNTER + 1
END
SET @COUNTER = 0;
SET @OUTPUTNUM = @OUTPUTNUM + 1
FETCH NEXT FROM @MyCursor
INTO @SetID
END;
CLOSE @MyCursor ;
DEALLOCATE @MyCursor;
END;
When I run this, I get the following message:
[2:07:54 PM] Started executing query at Line 1
Commands completed successfully.
Total execution time: 00:00:00.026
But there are no results; the values in the SetID column are still all NULL.
You can do this with window functions, without a cursor:
select [TIMESTAMP], USERNAME, VALUE, ID, IsDupe,
       case
           when IsDupe = 1 then null
           else dense_rank() over (order by GroupID)
       end as SetID
from (
    select *,
           case when VALUE like 'ORDER%' then ID
                when VALUE like 'ITEM%' then lag(ID, 1) over (order by ID)
                when VALUE like 'FINISH BUILD%' then lag(ID, 2) over (order by ID)
           end as GroupID
    from #tmp
    where IsDupe = 0
) a
union
select [TIMESTAMP], USERNAME, VALUE, ID, IsDupe, null as SetID
from #tmp
where IsDupe = 1
order by ID
Here is my complete example:
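-- drop the temp table only if it already exists, so the first run does not fail
if object_id('tempdb..#tmp') is not null drop table #tmp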
select '2020-02-12 07:00:03.000' as TIMESTAMP, 'LINA' as USERNAME , 'ORDER1' as VALUE , 1 as ID , 0 as IsDupe into #tmp
union select '2020-02-12 07:00:03.000' , 'LINA' , 'ITEM1' , 2 , 0
union select '2020-02-12 07:09:09.000' , 'LINA' , 'FINISH BUILD' , 3 , 0
union select '2020-02-12 07:09:10.000' , 'LINA' , 'ORDER1' , 4 , 0
union select '2020-02-12 07:09:11.000' , 'LINA' , 'ITEM2' , 5 , 0
union select '2020-02-12 07:24:07.000' , 'LINA' , 'FINISH BUILD' , 6 , 0
union select '2020-02-12 07:24:08.000' , 'NAGA' , 'ORDER2' , 7 , 0
union select '2020-02-12 07:24:10.000' , 'NAGA' , 'ITEM3' , 8 , 0
union select '2020-02-12 07:45:06.000' , 'NAGA' , 'FINISH BUILD' , 9 , 0
union select '2020-02-12 07:45:12.000' , 'NAGA' , 'FINISH BUILD' , 10 , 1
union select '2020-02-12 07:45:13.000' , 'XELLOS' , 'ORDER3' , 11 , 0
union select '2020-02-12 07:45:14.000' , 'XELLOS' , 'ITEM4' , 12 , 0
union select '2020-02-12 07:56:36.000' , 'XELLOS' , 'FINISH BUILD' , 13 , 0
union select '2020-02-12 07:56:39.000' , 'GOURRY' , 'ORDER4' , 14 , 0
union select '2020-02-12 07:56:40.000' , 'GOURRY' , 'ITEM5' , 15 , 0
union select '2020-02-12 08:30:11.000' , 'GOURRY' , 'FINISH BUILD' , 17 , 0
order by ID
select [TIMESTAMP], USERNAME, VALUE, ID, IsDupe,
       case
           when IsDupe = 1 then null
           else dense_rank() over (order by GroupID)
       end as SetID
from (
    select *,
           case when VALUE like 'ORDER%' then ID
                when VALUE like 'ITEM%' then lag(ID, 1) over (order by ID)
                when VALUE like 'FINISH BUILD%' then lag(ID, 2) over (order by ID)
           end as GroupID
    from #tmp
    where IsDupe = 0
) a
union
select [TIMESTAMP], USERNAME, VALUE, ID, IsDupe, null as SetID
from #tmp
where IsDupe = 1
order by ID
Output:
TIMESTAMP USERNAME VALUE ID IsDupe SetID
2020-02-12 07:00:03.000 LINA ORDER1 1 0 1
2020-02-12 07:00:03.000 LINA ITEM1 2 0 1
2020-02-12 07:09:09.000 LINA FINISH BUILD 3 0 1
2020-02-12 07:09:10.000 LINA ORDER1 4 0 2
2020-02-12 07:09:11.000 LINA ITEM2 5 0 2
2020-02-12 07:24:07.000 LINA FINISH BUILD 6 0 2
2020-02-12 07:24:08.000 NAGA ORDER2 7 0 3
2020-02-12 07:24:10.000 NAGA ITEM3 8 0 3
2020-02-12 07:45:06.000 NAGA FINISH BUILD 9 0 3
2020-02-12 07:45:12.000 NAGA FINISH BUILD 10 1 NULL
2020-02-12 07:45:13.000 XELLOS ORDER3 11 0 4
2020-02-12 07:45:14.000 XELLOS ITEM4 12 0 4
2020-02-12 07:56:36.000 XELLOS FINISH BUILD 13 0 4
2020-02-12 07:56:39.000 GOURRY ORDER4 14 0 5
2020-02-12 07:56:40.000 GOURRY ITEM5 15 0 5
2020-02-12 08:30:11.000 GOURRY FINISH BUILD 17 0 5
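Since the question asks for a single UPDATE statement, the same window-function logic can probably be wrapped in CTEs and joined back to the table. The following is only a sketch under some assumptions that are not confirmed by the question: the real table is dbo.MyDataTable (the name used in the cursor attempt), it already has a nullable SetID column, ID is unique, and every non-duplicate set is exactly ORDER, ITEM, FINISH BUILD in ID order. The CTE names `grouped` and `ranked` and the alias `NewSetID` are made up for illustration.

```sql
-- Sketch only: assumes dbo.MyDataTable has a unique ID and a nullable SetID column.
with grouped as (
    -- GroupID is the ID of the ORDER row that starts each three-row set
    select ID,
           case when VALUE like 'ORDER%' then ID
                when VALUE like 'ITEM%' then lag(ID, 1) over (order by ID)
                when VALUE like 'FINISH BUILD%' then lag(ID, 2) over (order by ID)
           end as GroupID
    from dbo.MyDataTable
    where IsDupe = 0
),
ranked as (
    -- number the sets 1, 2, 3, ... in order of their starting ID
    select ID,
           dense_rank() over (order by GroupID) as NewSetID
    from grouped
)
update t
set t.SetID = r.NewSetID
from dbo.MyDataTable as t
join ranked as r
    on r.ID = t.ID;
```

Rows with IsDupe = 1 never enter the CTEs, so the join does not match them and their SetID stays NULL. The ranking needs its own CTE because a window function cannot be nested inside another window function's ORDER BY at the same query level. It would be worth trying this on a copy of the data first, since a set that is missing its ITEM or FINISH BUILD row would shift the lag offsets.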