Top N sorted rows with GROUP BY?
I have the following transaction table:
customer_id, category, product_id, score
I group by customer_id and category to build a list of product_id-score map pairs:
SELECT
s.customer_id,
s.category,
collect_list(s.pair)
FROM
(
SELECT
customer_id,
category,
map(product_id, score) AS pair
FROM
transaction
WHERE
score > {score_threshold}
) s
GROUP BY
s.customer_id,
s.category
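Now I want to go one step further. For each group, I want to keep only the top n pairs, sorted by score (descending). I tried OVER (PARTITION BY ... ORDER BY), but I am running into problems.
Note: the transaction table is partitioned by category.
Thanks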
Try this:
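SELECT
s.customer_id,
s.category,
collect_list(s.pair)
FROM
(
SELECT
-- number the rows of each (customer_id, category) group, highest score first
ROW_NUMBER() OVER (PARTITION BY customer_id, category ORDER BY score DESC) AS RowId,
customer_id,
category,
map(product_id, score) AS pair
FROM
transaction
WHERE
score > {score_threshold}
) s
-- keep rows 1..n of each group; replace n with the number of pairs you want
WHERE s.RowId <= n
GROUP BY
s.customer_id,
s.category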
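The ROW_NUMBER() filter can be sanity-checked on a handful of rows before running it on the cluster. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module (which needs SQLite 3.25 or newer for window functions) instead of Hive, the table is named transactions because TRANSACTION is a reserved word in SQLite, the function name top_n_per_group is made up for illustration, and the grouping into lists is done in Python because SQLite has no collect_list or map.

```python
import sqlite3


def top_n_per_group(rows, n, score_threshold):
    """Return {(customer_id, category): [(product_id, score), ...]} with at
    most n pairs per group, highest score first.

    Hypothetical helper for illustration only; not part of Hive or Spark.
    """
    conn = sqlite3.connect(":memory:")
    # Assumed toy schema mirroring the columns named in the question.
    conn.execute(
        "CREATE TABLE transactions ("
        "customer_id INTEGER, category TEXT, product_id TEXT, score REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

    query = """
        SELECT customer_id, category, product_id, score
        FROM (
            SELECT
                ROW_NUMBER() OVER (
                    PARTITION BY customer_id, category
                    ORDER BY score DESC
                ) AS row_id,
                customer_id, category, product_id, score
            FROM transactions
            WHERE score > ?
        ) s
        WHERE s.row_id <= ?          -- rows 1..n of each group
        ORDER BY customer_id, category, score DESC
    """
    result = {}
    for customer_id, category, product_id, score in conn.execute(
        query, (score_threshold, n)
    ):
        result.setdefault((customer_id, category), []).append((product_id, score))
    conn.close()
    return result


if __name__ == "__main__":
    sample = [
        (1, "a", "p1", 0.9),
        (1, "a", "p2", 0.8),
        (1, "a", "p3", 0.7),
        (1, "b", "p4", 0.6),
        (2, "a", "p5", 0.2),
    ]
    print(top_n_per_group(sample, n=2, score_threshold=0.5))
```

Since ROW_NUMBER() starts at 1, the condition row_id <= n keeps exactly n rows per group (fewer if the group is smaller), whereas row_id < n would keep only n - 1.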