How to get a desired number of rows for each group / category in SQL Server
I have this query to retrieve rows from a SQL Server table:
SELECT
aid,
research_area_category_id,
CAST(research_area as VARCHAR(100)) [research_area],
COUNT(*) [Paper_Count]
FROM
sub_aminer_paper
GROUP BY
aid,
research_area_category_id,
CAST(research_area as VARCHAR(100))
HAVING
aid IN (SELECT
aid
FROM
sub_aminer_paper
GROUP BY
aid
HAVING
MIN(p_year) = 1990 AND MAX(p_year) = 2014 AND COUNT(pid) BETWEEN 10 AND 40
)
ORDER BY aid ASC, Paper_Count DESC
which returns this output:
aid    research_area_category_id  research_area                Paper_Count
2937   33                         markov chain                 3
2937   33                         markov decision process      1
2937   1                          optimization problem         1
2937   27                         real time application        1
2937   32                         software product lines       1
11120  29                         aspect oriented programming  4
11120  1                          graph cut                    2
11120  1                          optimization problem         2
11120  32                         uml class diagrams           1
11120  25                         chinese word segmentation    1
11120  29                         dynamic programming          1
11120  19                         face recognition             1
11120  1                          approximation algorithm      1
12403  2                          differential equation        7
12403  1                          data structure               2
12403  34                         design analysis              1
12403  9                          object detection             1
12403  27                         operating system             1
12403  1                          problem solving              1
12403  21                         archiving system             1
12403  2                          calculus                     1
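The output currently includes every row for each aid, whereas I only need the top 3 rows per aid, ordered by Paper_Count DESC. That is, the rows with Paper_Count values 3, 1, 1 for aid 2937; 4, 2, 2 for aid 11120; and 7, 2, 1 for aid 12403.
Please help! Thanks.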
One way is to apply row_number() over(partition by aid order by Paper_Count desc) as rn to your result set, and then select all records with rn<=3:
with cte
as
(
SELECT
aid,
research_area_category_id,
CAST(research_area as VARCHAR(100)) [research_area],
COUNT(*) [Paper_Count]
FROM
sub_aminer_paper
GROUP BY
aid,
research_area_category_id,
CAST(research_area as VARCHAR(100))
HAVING
aid IN (SELECT
aid
FROM
sub_aminer_paper
GROUP BY
aid
HAVING
MIN(p_year) = 1990 AND MAX(p_year) = 2014 AND COUNT(pid) BETWEEN 10 AND 40
)
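-- No ORDER BY here: SQL Server rejects ORDER BY inside a CTE unless TOP, OFFSET or FOR XML is also specified
)
,
cte1
AS
(
-- Number the rows within each aid, highest Paper_Count first
SELECT * ,
ROW_NUMBER() OVER (PARTITION BY aid ORDER BY Paper_Count DESC) AS rn
FROM cte
)
-- Keep the top 3 rows per aid and sort the final result in the outer query
SELECT * FROM cte1 WHERE rn<=3
ORDER BY aid ASC, Paper_Count DESC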
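One thing to watch with this approach: when several rows in the same aid share a Paper_Count (for example the four rows with count 1 for aid 2937), ROW_NUMBER() picks among them in no guaranteed order, so which tied rows land in the top 3 can change between runs. A minimal sketch of one way to make the choice repeatable is to add a secondary sort key. Here research_area is used as the tie-breaker, which is only an assumption about what you would want, and the inline VALUES list (with the hypothetical names counts and ranked) stands in for your already-aggregated result so the example runs on its own:

```sql
-- Sample rows copied from the question's output for aid 2937 (already aggregated).
WITH counts (aid, research_area_category_id, research_area, Paper_Count) AS
(
    SELECT *
    FROM (VALUES
        (2937, 33, 'markov chain',            3),
        (2937, 33, 'markov decision process', 1),
        (2937, 1,  'optimization problem',    1),
        (2937, 27, 'real time application',   1),
        (2937, 32, 'software product lines',  1)
    ) AS v (aid, research_area_category_id, research_area, Paper_Count)
),
ranked AS
(
    SELECT *,
           -- The second ORDER BY key breaks ties between equal Paper_Count values
           ROW_NUMBER() OVER (PARTITION BY aid
                              ORDER BY Paper_Count DESC, research_area ASC) AS rn
    FROM counts
)
SELECT aid, research_area_category_id, research_area, Paper_Count
FROM ranked
WHERE rn <= 3
ORDER BY aid ASC, Paper_Count DESC, research_area ASC;
```

If you would rather keep every row that ties for a place instead of cutting off at exactly 3, RANK() or DENSE_RANK() can be swapped in for ROW_NUMBER() in the same position, at the cost of sometimes returning more than 3 rows per aid.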