SQL

Question

我在SQL Server中有如下数据集：

ROW_NUM  EMP_ID  DATE_KEY  TP_DAYS
1        U12345  20131003   1
2        U12345  20131004   0
3        U12345  20131005   0
4        U12345  20131006   0
5        U12345  20150627   1
6        U12345  20150628   0
1        U54321  20131003   1
2        U54321  20131004   0
3        U54321  20131005   0
4        U54321  20131006   0

我需要将 TP_DAYS 列中的所有零更新为以前的值增加 1。
所需的结果集如下：

ROW_NUM  EMP_ID  DATE_KEY  TP_DAYS
1        U12345  20131003   1
2        U12345  20131004   2
3        U12345  20131005   3
4        U12345  20131006   4
5        U12345  20150627   1
6        U12345  20150628   2
1        U54321  20131003   1
2        U54321  20131004   2
3        U54321  20131005   3
4        U54321  20131006   4

我尝试在 SQL 中使用 LAG 和 LEAD 函数。但是没能达到预期的效果。

谁能帮我实现一下。

Answer 1

让我假设 SQL Server 2012+。您需要识别由 1 分隔的组。计算组的一种简单方法是计算 1 的累加和。然后可以使用 row_number() 来计算新值。您可以使用可更新的 CTE 完成这项工作：

with toupdate as (
      select t.*,
             row_number() over (partition by empid, grp order by row_num) as new_tp_days
      from (select t.*, 
                   sum(tp_days) over (partition by emp_id order by row_num) as grp
            from t
           ) t
     )
update toupdate
    set tp_days = new_tp_days;

在 SQL 服务器的早期版本中，您可以完成同样的事情（效率较低）。一种方法使用 outer apply.

Answer 2

使用窗口函数（SUM/ROW_NUMBER 所以它将与 SQL Server 2008 一起工作）：

WITH cte AS
(
  SELECT *, s =  SUM(TP_DAYS) OVER(PARTITION BY EMP_ID ORDER BY ROW_NUM)
  FROM #tab
), cte2 AS
(
  SELECT *,
    tp_days_recalculated = ROW_NUMBER() OVER (PARTITION BY EMP_ID, s ORDER BY ROW_NUM)
  FROM cte
)
UPDATE cte2
SET TP_DAYS = tp_days_recalculated;

SELECT *
FROM #tab;

LiveDemo

输出：

╔═════════╦════════╦══════════╦═════════╗
║ ROW_NUM ║ EMP_ID ║ DATE_KEY ║ TP_DAYS ║
╠═════════╬════════╬══════════╬═════════╣
║       1 ║ U12345 ║ 20131003 ║       1 ║
║       2 ║ U12345 ║ 20131004 ║       2 ║
║       3 ║ U12345 ║ 20131005 ║       3 ║
║       4 ║ U12345 ║ 20131006 ║       4 ║
║       5 ║ U12345 ║ 20150627 ║       1 ║
║       6 ║ U12345 ║ 20150628 ║       2 ║
║       1 ║ U54321 ║ 20131003 ║       1 ║
║       2 ║ U54321 ║ 20131004 ║       2 ║
║       3 ║ U54321 ║ 20131005 ║       3 ║
║       4 ║ U54321 ║ 20131006 ║       4 ║
╚═════════╩════════╩══════════╩═════════╝

#附录

原始OP问题和示例数据非常清楚tp_days指标是0和1不是任何其他值。

特别是 Atheer Mostafa:

check this example as a proof: https://data.stackexchange.com/Whosebug/query/edit/423186

这应该是个新问题，但我会处理那个案例：

;WITH cte AS
(
  SELECT *
   ,rn = s +  ROW_NUMBER() OVER(PARTITION BY EMP_ID, s ORDER BY ROW_NUM) -1
   ,rnk = DENSE_RANK() OVER(PARTITION BY EMP_ID ORDER BY s)
  FROM (SELECT *, s =  SUM(tp_days) OVER(PARTITION BY EMP_ID ORDER BY ROW_NUM)
        FROM #tab) AS sub
), cte2 AS
(
  SELECT c1.*,
   tp_days_recalculated = c1.rn - (SELECT COALESCE(MAX(c2.s),0)
                                   FROM cte c2
                                   WHERE c1.emp_id = c2.emp_id
                                     AND c2.rnk = c1.rnk-1)
  FROM cte c1
)
UPDATE cte2
SET tp_days = tp_days_recalculated;

LiveDemo2

输出：

╔═════════╦════════╦══════════╦═════════╗
║ row_num ║ emp_id ║ date_key ║ tp_days ║
╠═════════╬════════╬══════════╬═════════╣
║       1 ║ U12345 ║ 20131003 ║       2 ║
║       2 ║ U12345 ║ 20131004 ║       3 ║
║       3 ║ U12345 ║ 20131005 ║       4 ║
║       4 ║ U12345 ║ 20131006 ║       3 ║
║       5 ║ U12345 ║ 20150627 ║       4 ║
║       6 ║ U12345 ║ 20150628 ║       5 ║
║       1 ║ U54321 ║ 20131003 ║       2 ║
║       2 ║ U54321 ║ 20131004 ║       3 ║
║       3 ║ U54321 ║ 20131005 ║       1 ║
║       4 ║ U54321 ║ 20131006 ║       2 ║
╚═════════╩════════╩══════════╩═════════╝

it shouldn't change the values 3,4,2 to 1 .... this is the case. I don't need your solution when I have another generic answer, you don't tell me what to do ... thank you

无非是quirky update。是的，它会起作用，但可能很容易失败：

首先没有订购table本身
查询优化器可以以任何方式读取数据（特别是当数据集很大并且涉及并行执行时）。没有 ORDER BY 你不能保证 stable 结果
该行为未记录，今天可能有效，但将来可能会崩溃

Robyn Page's SQL Server Cursor Workbench
Calculate running total / running balance
No Seatbelt - Expecting Order without ORDER BY

Answer 3

我有一个更简单的方法，使用简单的代码如下：

DECLARE @last int=0
UPDATE #Employees set @last=CASE WHEN TP_DAYS=0 THEN @last+1 ELSE TP_DAYS END,
TP_DAYS=CASE WHEN TP_DAYS=0 THEN @last ELSE TP_DAYS END

这在任何 SQL 服务器引擎中运行在此处查看演示

https://data.stackexchange.com/meta.Whosebug/query/422955/sql-update-rows-between-two-values-in-a-column?opt.withExecutionPlan=true#resultSets

SQL - 更新列中两个值之间的行

SQL - Update rows between two values in a column

tsql

sql-server

sql-update

gaps-and-islands