Row_number based on the ID and date
I need to select the INACTIVE rows for each ID and number them within each run of consecutive dates.
Sample Data:
2020-04-19,SQA0199,ACTIVE
2020-04-20,SQA0199,INACTIVE
2020-04-21,SQA0199,INACTIVE
2020-04-22,SQA0199,INACTIVE
2020-04-23,SQA0199,ACTIVE
2020-04-24,SQA0199,INACTIVE
2020-04-25,SQA0199,INACTIVE
2020-04-26,SQA0199,INACTIVE
Sample Script:
SELECT
ROW_NUMBER() OVER (PARTITION BY SQA_ID ORDER BY timestamp) AS "row number",
timestamp, SQA_ID
FROM SQA_SMS_INACTIVE where status='INACTIVE';
Desired Output:
2020-04-20,SQA0199,1
2020-04-21,SQA0199,2
2020-04-22,SQA0199,3
2020-04-24,SQA0199,1
2020-04-25,SQA0199,2
2020-04-26,SQA0199,3
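The row numbers in my script's output keep counting up across the gap instead of restarting. Please help me solve this.
With the LAG() and SUM() window functions, you can create the groups of rows that the row numbers are based on: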
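WITH
  pre AS (
    -- flag is 1 when the previous INACTIVE row of the same ID is not exactly one day earlier
    -- (it is NULL for the first row of each ID, which has no previous row)
    SELECT *,
      DATEDIFF(
        timestamp,
        LAG(timestamp) OVER (PARTITION BY SQA_ID ORDER BY timestamp)
      ) <> 1 AS flag
    FROM SQA_SMS_INACTIVE
    WHERE status = 'INACTIVE'
  ),
  cte AS (
    -- the running sum of the flags gives every run of consecutive dates its own group number
    SELECT timestamp, SQA_ID,
      SUM(COALESCE(flag, 0) <> 0) OVER (PARTITION BY SQA_ID ORDER BY timestamp) grp
    FROM pre
  )
-- number the rows within each (SQA_ID, grp) group
SELECT timestamp, SQA_ID,
  ROW_NUMBER() OVER (PARTITION BY SQA_ID, grp ORDER BY timestamp) AS `row number`
FROM cte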
See the demo.
Results:
| timestamp | SQA_ID | row number |
| ---------- | ------- | ---------- |
| 2020-04-20 | SQA0199 | 1 |
| 2020-04-21 | SQA0199 | 2 |
| 2020-04-22 | SQA0199 | 3 |
| 2020-04-24 | SQA0199 | 1 |
| 2020-04-25 | SQA0199 | 2 |
| 2020-04-26 | SQA0199 | 3 |
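To make the grouping logic easier to follow outside the database, here is a minimal Python sketch of the same idea: keep only the INACTIVE rows, compare each date with the previous one for the same ID, and restart the counter whenever the gap is not exactly one day. The function name `number_consecutive_runs` and the tuple layout are assumptions made for illustration; they are not part of the original query.

```python
from datetime import date, timedelta

# Sample data from the question: (day, sqa_id, status)
rows = [
    (date(2020, 4, 19), "SQA0199", "ACTIVE"),
    (date(2020, 4, 20), "SQA0199", "INACTIVE"),
    (date(2020, 4, 21), "SQA0199", "INACTIVE"),
    (date(2020, 4, 22), "SQA0199", "INACTIVE"),
    (date(2020, 4, 23), "SQA0199", "ACTIVE"),
    (date(2020, 4, 24), "SQA0199", "INACTIVE"),
    (date(2020, 4, 25), "SQA0199", "INACTIVE"),
    (date(2020, 4, 26), "SQA0199", "INACTIVE"),
]


def number_consecutive_runs(rows):
    """Return (day, sqa_id, row_number) for the INACTIVE rows.

    The row number restarts at 1 whenever the previous INACTIVE row of the
    same ID is not exactly one day earlier (hypothetical helper).
    """
    # Equivalent of WHERE status = 'INACTIVE' plus PARTITION BY id ORDER BY day
    inactive = sorted(
        (r for r in rows if r[2] == "INACTIVE"),
        key=lambda r: (r[1], r[0]),
    )

    result = []
    prev_id, prev_day, counter = None, None, 0
    for day, sqa_id, _status in inactive:
        # Plays the role of the flag column: a new run starts on a new ID
        # or when the gap to the previous row is not exactly one day.
        starts_new_run = sqa_id != prev_id or day - prev_day != timedelta(days=1)
        counter = 1 if starts_new_run else counter + 1
        result.append((day, sqa_id, counter))
        prev_id, prev_day = sqa_id, day
    return result


numbered = number_consecutive_runs(rows)
for day, sqa_id, n in numbered:
    print(f"{day},{sqa_id},{n}")
```

The sketch folds the flag, the running sum and the row numbering into a single pass, which is possible here because the rows are processed in order; the SQL version needs the intermediate grp column because window functions cannot reset a counter directly.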