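Consecutive Occurrences of Values
I need to count consecutive absences per student, ordered by date.
The counts fall into two categories: runs of exactly 2 consecutive absences, and runs of 3 or more consecutive absences.
Sample data: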
Name | Date | Present |
---|---|---|
Student 1 | 2022/01/01 | false |
Student 1 | 2022/01/02 | false |
Student 1 | 2022/01/03 | true |
Student 1 | 2022/01/04 | false |
Student 1 | 2022/01/05 | false |
Student 1 | 2022/01/06 | false |
Student 1 | 2022/01/07 | true |
Student 1 | 2022/01/08 | false |
Student 1 | 2022/01/09 | false |
Student 1 | 2022/01/10 | false |
Student 1 | 2022/01/11 | false |
Student 1 | 2022/01/12 | true |
Student 1 | 2022/01/13 | false |
Student 1 | 2022/01/14 | false |
Student 1 | 2022/01/15 | true |
Expected result:
Student | Count 2 Consecutive Absences | Count 3 or More Consecutive Absences | Total Absences |
---|---|---|---|
Student 1 | 2 | 2 | 11 |
I tried the following code using LAG and OVER, but it did not work.
    CASE WHEN LAG(present) OVER (order by date) is false AND present is false THEN 1 END as test
Consider the approach below (it relies on COUNTIF and IFNULL as found in BigQuery):
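    select name,
      countif(absences = 2) as Count_2_Consecutive_Absences,
      countif(absences > 2) as Count_3_or_more_Consecutive_Absences,
      sum(absences) as Total_Absences
    from (
      -- one row per run of identical present values; only absence runs are kept
      select name, countif(not present) absences
      from (
        -- running count of change flags gives each run its own group number
        select *, countif(new_grp) over(partition by name order by date) grp
        from (
          -- flag rows where present differs from the previous row (the first row counts as a change)
          select *, ifnull(present != lag(present) over(partition by name order by date), true) new_grp
          from your_table
        )
      )
      group by name, grp
      having absences > 0
    )
    group by name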
If applied to the sample data in your question, the output matches the expected result (Student 1: 2, 2, 11).
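
To follow the logic outside the database, here is a minimal Python sketch of the same idea: flag each row whose `Present` value differs from the student's previous row, treat every flag as the start of a new run, and measure only the absence runs. The helper names `absence_runs` and `summarize` are made up for this illustration, and the rows are the sample data from the question.

```python
from collections import defaultdict

# Sample rows from the question as (name, date, present); "F" = absent, "T" = present.
presence = "FFTFFFTFFFFTFFT"
rows = [("Student 1", f"2022/01/{day:02d}", flag == "T")
        for day, flag in enumerate(presence, start=1)]


def absence_runs(rows):
    """Return {name: [length of each run of consecutive absences]}."""
    runs = defaultdict(list)
    previous = {}  # name -> Present value on that student's previous row
    for name, _date, present in sorted(rows):  # order by name, then date
        # Same role as `present != lag(present)`, with the first row treated as a change.
        starts_new_group = name not in previous or previous[name] != present
        previous[name] = present
        if present:
            continue  # only absence runs are measured
        if starts_new_group:
            runs[name].append(1)
        else:
            runs[name][-1] += 1
    return dict(runs)


def summarize(rows):
    """Return {name: (runs of exactly 2, runs of 3 or more, total absences)}."""
    return {name: (sum(n == 2 for n in lengths),
                   sum(n > 2 for n in lengths),
                   sum(lengths))
            for name, lengths in absence_runs(rows).items()}


print(absence_runs(rows))  # {'Student 1': [2, 3, 4, 2]}
print(summarize(rows))     # {'Student 1': (2, 2, 11)}
```

As in the query, a student with no absences at all does not appear in the result.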
This is a gaps-and-islands problem. One approach uses the difference-in-row-numbers method:
    WITH cte AS (
        -- rn1 numbers all rows of a student; rn2 numbers them separately per Present value
        SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) rn1,
                  ROW_NUMBER() OVER (PARTITION BY Name, Present ORDER BY Date) rn2
        FROM yourTable
    ),
    cte2 AS (
        -- rn1 - rn2 stays constant within a run of absences and differs between runs
        SELECT Name,
               COUNT(CASE WHEN Present = false THEN 1 END) AS num_consec_absent
        FROM cte
        GROUP BY Name, rn1 - rn2
    )
    SELECT Name,
           COUNT(CASE WHEN num_consec_absent = 2
                      THEN 1 END) AS Count_2_Consecutive_Absences,
           COUNT(CASE WHEN num_consec_absent > 2
                      THEN 1 END) AS Count_3_or_more_Consecutive_Absences,
           SUM(num_consec_absent) AS Total_Absences
    FROM cte2
    GROUP BY Name;
Here is a running demo for SQL Server.
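
The row-number query can also be tried locally with Python's built-in `sqlite3` module. This is only a sketch under a few assumptions, and it is not the SQL Server demo: it needs SQLite 3.25 or later for window functions, it stores `Present` as 0/1 because SQLite has no boolean type, and it loads the sample rows from the question. It first prints `rn1 - rn2` for the absent rows, which shows that each run of absences gets its own constant value, and then runs the full query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Name TEXT, Date TEXT, Present INTEGER)")

# Sample rows from the question; "F" = absent (0), "T" = present (1).
presence = "FFTFFFTFFFFTFFT"
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [("Student 1", f"2022/01/{day:02d}", int(flag == "T"))
     for day, flag in enumerate(presence, start=1)],
)

CTE = """
WITH cte AS (
    SELECT Name, Date, Present,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY Name, Present ORDER BY Date) AS rn2
    FROM yourTable
)
"""

# rn1 - rn2 for absent rows only: one constant value per run of absences.
absent_diffs = [diff for (diff,) in conn.execute(
    CTE + "SELECT rn1 - rn2 FROM cte WHERE Present = 0 ORDER BY Date")]
print(absent_diffs)  # [0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3]

# Full query: count absences per (Name, rn1 - rn2) group, then classify the runs.
result = conn.execute(CTE + """
, cte2 AS (
    SELECT Name,
           COUNT(CASE WHEN Present = 0 THEN 1 END) AS num_consec_absent
    FROM cte
    GROUP BY Name, rn1 - rn2
)
SELECT Name,
       COUNT(CASE WHEN num_consec_absent = 2 THEN 1 END),
       COUNT(CASE WHEN num_consec_absent > 2 THEN 1 END),
       SUM(num_consec_absent)
FROM cte2
GROUP BY Name
""").fetchall()
print(result)  # [('Student 1', 2, 2, 11)]
```

A present row can share its `rn1 - rn2` value with an absence run, but that does not change the counts, because `num_consec_absent` only counts rows where the student was absent.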