Select distinct values from each partition over()

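I have a query in which I want to select questions from different categories based on their difficulty level.

Some questions are similar to others (I store the link between them in a field named "bucket").

Now, what I want is to return only 1 question from each bucket.

The query I am trying is: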

select *
from (
    select distinct q.bucket,
        row_number() over (partition by dl.value order by random()) as rn,
        dense_rank() over (partition by dl.value, LOWER(qc.value) = LOWER('general') order by random()) as rnc,
        dl.value, qc.value as question_category,
        q.question_text, q.option_a, q.option_b, q.option_c, q.option_d,
        q.correct_answer, q.image_link, q.question_type
    from
        questions_bank q
        inner join
        question_category qc on qc.id = q.question_category_id
        inner join
        sports_type st on st.id = q.sports_type_id
        inner join
        difficulty_level dl on dl.id = q.difficulty_level_id
    where st.game_type = lower('cricket') and dl.value in ('E','M','H')
) s
where
    (value = 'E' and rnc <= 6 and LOWER(question_category) != LOWER('general')) or
    (value = 'E' and rnc <= 6 and LOWER(question_category) = LOWER('general')) or
    value = 'M' and rn <= 0 or
    value = 'H' and rn <= 0;

This does not return the desired output.

The output of the query is:

 bucket | rn | rnc | value | question_category | question_text | option_a | option_b | option_c | option_d | correct_answer |                 image_link                  | question_type
--------+----+-----+-------+-------------------+---------------+----------+----------+----------+----------+----------------+---------------------------------------------+---------------
      2 |  2 |   2 | E     | General           | abs           | a        | b        | c        | d        | option_a       | https://d1ugevkr3ygvej.cloudfront.net/2.png | i
      3 |  3 |   3 | E     | General           | abcd          | a        | b        | c        | d        | option_a       | https://d1ugevkr3ygvej.cloudfront.net/3.png | i
      3 |  4 |   4 | E     | General           | abs           | a        | b        | c        | d        | option_a       |                                             | t
      4 |  1 |   1 | E     | General           | image         | a        | b        | c        | d        | option_a       |                                             | t

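As you can see, the bucket value 3 appears twice. I do not want DISTINCT to apply to the combination of row_number and bucket. The bucket should take precedence: rows should be deduplicated by bucket first and the row number calculated afterwards, while the partition should still be based on the question_category value.

How can I do this?

The solution is not to use DISTINCT, but rather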

SELECT DISTINCT ON (q.bucket) ...

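See the documentation.

This returns only one row per q.bucket. If you add an ORDER BY clause to the query, the first row in that order is picked for each bucket; without one, which row you get is unpredictable.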
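
One possible way to combine this with the window functions from the question is sketched below. It is not a tested drop-in replacement: the table and column names are copied from the question, the choice of ORDER BY q.bucket, random() to pick a random question per bucket is an assumption, and the alias d is made up for the example. DISTINCT ON is specific to PostgreSQL, and its expressions have to match the leading ORDER BY expressions, which is why q.bucket comes first.

```sql
-- Sketch only. Names come from the question; the alias "d" and the use of
-- random() to choose the row kept per bucket are assumptions.
select d.*,
       -- Numbering happens after deduplication, so a bucket can no longer
       -- show up twice with different row numbers.
       row_number() over (
           partition by d.value, lower(d.question_category) = lower('general')
           order by random()
       ) as rnc
from (
    -- Keep exactly one row per bucket.
    select distinct on (q.bucket)
           q.bucket,
           dl.value,
           qc.value as question_category,
           q.question_text
    from questions_bank q
    inner join question_category qc on qc.id = q.question_category_id
    inner join difficulty_level dl on dl.id = q.difficulty_level_id
    where dl.value in ('E', 'M', 'H')
    order by q.bucket, random()  -- q.bucket must lead; random() picks the survivor
) d;
```

To filter on rnc as in the original query, this statement would have to be wrapped in one more subquery, because a window function result cannot be referenced in the WHERE clause of the same SELECT. Note also that this keeps one question per bucket across all difficulty levels; if a bucket can span several levels and one question per level is wanted, the DISTINCT ON list would presumably need dl.value as well.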