PostgreSQL - optimize the SQL to pick the top 3 subjects by performance for every student

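I have a table containing students and their marks across subjects (there are multiple assessments for the same subject). I want to write a SQL query that produces output meeting the requirements below.

  1. Pick the top 3 subjects for each student, with 2 rows for each subject.
  2. The 3 subjects are the ones with the highest marks; the second row must come from the same subject.
  3. Different subjects can have the same marks, and so can rows within one subject. If marks are tied, pick any one.
  4. A student's subjects may not be stored together; they may be scattered across the table. I have grouped them here only to make them easier to read.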

Table:

student_id  |   subject     | marks
------------|---------------|--------------
1           | sub-1         | 10
1           | sub-1         | 50
1           | sub-1         | 25
1           | sub-1         | 50

1           | sub-10        | 2
1           | sub-10        | 85
1           | sub-10        | 40

1           | sub-3         | 10
1           | sub-3         | 5
1           | sub-3         | 55
1           | sub-3         | 65
1           | sub-3         | 70

1           | sub-4         | 90
1           | sub-4         | 50
1           | sub-4         | 25

1           | sub-6         | 20
1           | sub-6         | 70
1           | sub-6         | 35
...

Required result:

student_id  |   subject     | marks
------------|---------------|--------------
1           | sub-4         | 90
1           | sub-4         | 50
1           | sub-10        | 85
1           | sub-10        | 40
1           | sub-6         | 70
1           | sub-6         | 35
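
For anyone who wants to reproduce the sample locally, a minimal setup sketch follows. The table name `student` and the column name `id` are taken from the queries in this post (the listing above labels that column `student_id`); the column types are assumptions, and only the rows actually shown are inserted.

```sql
-- Assumed schema: names come from the queries in the post, types are a guess.
create table student (
    id      integer not null,  -- displayed as student_id in the listing above
    subject text    not null,
    marks   integer not null
);

-- Only the rows shown in the listing; the elided rows are left out.
insert into student (id, subject, marks) values
    (1, 'sub-1', 10), (1, 'sub-1', 50), (1, 'sub-1', 25), (1, 'sub-1', 50),
    (1, 'sub-10', 2), (1, 'sub-10', 85), (1, 'sub-10', 40),
    (1, 'sub-3', 10), (1, 'sub-3', 5), (1, 'sub-3', 55), (1, 'sub-3', 65), (1, 'sub-3', 70),
    (1, 'sub-4', 90), (1, 'sub-4', 50), (1, 'sub-4', 25),
    (1, 'sub-6', 20), (1, 'sub-6', 70), (1, 'sub-6', 35);
```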

I was able to solve it using the SQL below:

with cte as
(
select * from (
select 
dense_rank() over(partition by s.id order by s.marks desc) dense_rank_number,
row_number() over (partition by s.id, s.subject order by marks desc)  row_num,
 s.*
from
(
    select d.id, d.subject, count(*) 
    from student d
    group by d.id, d.subject
    having count(*) >= 2
) t join student s on t.id = s.id and t.subject = s.subject 
order by 1, 2
) t5
where t5.row_num <= 2
),
cte1 as 
(select e.dense_rank_number, e.row_num, 
e.id, min(e.subject) as subject, e.marks from cte e
where e.row_num = 1 and e.dense_rank_number <= 3
group by e.id, e.row_num, e.marks, e.dense_rank_number

),
cte2 as 
( 
    select cte.* 
    from cte, cte1 
    where 
    cte.id = cte1.id 
    and cte.subject = cte1.subject 
    and cte.row_num != cte1.row_num 
)
select * from cte1
union
select * from cte2
;

Is there a better way to write this SQL? A demo can be found here: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=4d9192b995884d5742977d13e4bbe68d

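If I understand correctly, you want the three subjects with the highest marks for each student, and then the two highest marks within each of those subjects. If so, I would suggest: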

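select s.*
from (select s.*,
             -- rank subjects per student by best mark; the subject name breaks ties
             dense_rank() over (partition by id order by max_marks desc, subject) as seqnum_s
      from (select s.*,
                   -- number each subject's rows from highest to lowest mark
                   row_number() over (partition by id, subject order by marks desc) as seqnum,
                   -- best mark of the subject, repeated on each of its rows
                   max(marks) over (partition by id, subject) as max_marks
            from student s
           ) s
      where seqnum <= 2  -- keep the two best rows per subject
     ) s
where seqnum_s <= 3      -- keep the three best subjects per student
order by s.id, max_marks desc, subject, marks desc;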
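
The same idea can also be spelled out with CTEs, which some find easier to read. This is only a sketch under a few assumptions: the table is `student(id, subject, marks)` as in the queries above, the names `ranked_marks`, `ranked_subjects`, `mark_rank`, `best_mark`, `n_rows` and `subject_rank` are made up for illustration, and the `n_rows >= 2` filter is an optional addition that mirrors the `having count(*) >= 2` in the question.

```sql
with ranked_marks as (
    -- Per (student, subject): number rows by mark, note the best mark and row count.
    select s.id, s.subject, s.marks,
           row_number() over (partition by s.id, s.subject order by s.marks desc) as mark_rank,
           max(s.marks) over (partition by s.id, s.subject) as best_mark,
           count(*)     over (partition by s.id, s.subject) as n_rows
    from student s
),
ranked_subjects as (
    -- Rank subjects per student by best mark. The subject name breaks ties,
    -- so both rows of one subject always share the same rank.
    select r.*,
           dense_rank() over (partition by r.id order by r.best_mark desc, r.subject) as subject_rank
    from ranked_marks r
    where r.mark_rank <= 2   -- two best rows per subject
      and r.n_rows >= 2      -- assumption: skip subjects with a single row
)
select id, subject, marks
from ranked_subjects
where subject_rank <= 3      -- three best subjects per student
order by id, subject_rank, marks desc;
```

On the sample rows shown in the question, this should return sub-4 (90, 50), sub-10 (85, 40) and sub-3 (70, 65). sub-3 and sub-6 tie at 70, and the name tiebreak picks sub-3, which requirement 3 allows. If a different tiebreak is wanted, changing the second key in the `dense_rank()` ordering is the place to do it.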