SQL Server join on result of aggregate function

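SQL Server 2017. I have 2 tables: one holds hacker names and IDs, and the other holds the coding challenges each hacker submitted (below). I need to output each hacker's id, name, and number of challenges, filtering out hackers who submitted the same number of challenges as another hacker, unless that number is the maximum.
Here is the sample data and the final output I need.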

Hackers:

hacker_id name
1 john
2 tom
3 anna
4 mary
5 steve

Challenges:

challenge_id    hacker_id
1   1
2   1
3   1
4   2
5   2
6   2
7   2
8   3
9   3
10  3
11  4
12  4
13  4
14  4
15  5
16  5

Here is the number of challenges per hacker (from this we can see that the maximum per hacker is 4):

hacker_id   name    count of challenges
1   john    3
2   tom     4
3   anna    3
4   mary    4
5   steve   2

The final output is as follows:

hacker_id   name    count of challenges
2   tom     4
4   mary    4
5   steve   2

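That is, Tom and Mary each submitted 4 challenges. They are included because, although the number 4 is repeated, it is the maximum. John and Anna each submitted 3. They are excluded because 3 is repeated and is not the maximum. Steve submitted 2, and that number is unique, so he is included as well.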

Here is my code:

SELECT h.hacker_id, 
h.name, 
COUNT(c.challenge_id) AS ChalCountPerHead
FROM    hackers h 
        JOIN challenges c ON h.hacker_id = c.hacker_id
        LEFT JOIN ( 
            SELECT d.FreqHacker, COUNT(d.FreqHacker) as FreqOfFreq FROM 
                (SELECT hacker_id, COUNT(challenge_id) AS FreqHacker 
                 FROM Challenges GROUP BY hacker_id) d
            GROUP BY d.FreqHacker
        ) dd
        ON FreqHacker = COUNT(c.challenge_id)
GROUP BY h.hacker_id, h.name
HAVING 
COUNT(c.challenge_id) = (SELECT MAX(d.FreqHacker) from d) 
OR
dd.FreqOfFreq = 1

This does not work. It reports an error on this line:

ON FreqHacker = COUNT(c.challenge_id)

An aggregate cannot appear in an ON clause unless it is in a sub query contained in a HAVING clause or select list, and the column being aggregated is an outer reference.

Here is one way to do it.

Having sample data in the question makes it easier to verify a solution. Please include it next time.

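CTE is a simple aggregation that gets the number of challenges submitted by each hacker.

In CTE2, MAX gives the global maximum frequency (GlobalMaxFreq). HackerCountOfSameFreq is the number of hackers that share the same frequency.

The final WHERE removes groups made up of more than 1 hacker, but keeps the group with the maximum frequency.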

Sample data

DECLARE @Hackers TABLE (hacker_id int, name varchar(50));
INSERT INTO @Hackers VALUES
(1, 'john'),
(2, 'tom'),
(3, 'anna'),
(4, 'mary'),
(5, 'steve');

DECLARE @Challenges TABLE (challenge_id int, hacker_id int);
INSERT INTO @Challenges VALUES
(1 ,  1),
(2 ,  1),
(3 ,  1),
(4 ,  2),
(5 ,  2),
(6 ,  2),
(7 ,  2),
(8 ,  3),
(9 ,  3),
(10,  3),
(11,  4),
(12,  4),
(13,  4),
(14,  4),
(15,  5),
(16,  5);

Query

WITH
CTE
AS
(
    SELECT hacker_id, COUNT(*) AS FreqHacker
    FROM @Challenges
    GROUP BY hacker_id
)
,CTE2
AS
(
    SELECT
        hacker_id
        ,FreqHacker
        ,COUNT(*) OVER (PARTITION BY FreqHacker) AS HackerCountOfSameFreq
        ,MAX(FreqHacker) OVER () AS GlobalMaxFreq
    FROM CTE
)

SELECT
    CTE2.hacker_id
    ,CTE2.FreqHacker
    ,H.Name
FROM
    CTE2
    INNER JOIN @Hackers AS H ON H.hacker_id = CTE2.hacker_id
WHERE
    HackerCountOfSameFreq = 1
    OR FreqHacker = GlobalMaxFreq
ORDER BY
    CTE2.hacker_id
;

Result

+-----------+------------+-------+
| hacker_id | FreqHacker | Name  |
+-----------+------------+-------+
| 2         | 4          | tom   |
+-----------+------------+-------+
| 4         | 4          | mary  |
+-----------+------------+-------+
| 5         | 2          | steve |
+-----------+------------+-------+
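
To see what the two window functions compute before the final WHERE is applied, the CTE2 step can be run on its own. The following is only a sketch: the per-hacker counts are typed in with a VALUES constructor (taken from the counts listed in the question) rather than read from @Challenges, so that the snippet runs without any other setup.

```sql
-- Inline stand-in for the CTE above: per-hacker challenge counts
-- copied from the question (an assumption made to keep this self-contained).
WITH CTE AS
(
    SELECT hacker_id, FreqHacker
    FROM (VALUES (1, 3), (2, 4), (3, 3), (4, 4), (5, 2)) AS v(hacker_id, FreqHacker)
)
SELECT
    hacker_id
    ,FreqHacker
    -- number of hackers that share this exact count
    ,COUNT(*) OVER (PARTITION BY FreqHacker) AS HackerCountOfSameFreq
    -- largest count across all hackers (empty OVER () means one window over all rows)
    ,MAX(FreqHacker) OVER () AS GlobalMaxFreq
FROM CTE
ORDER BY hacker_id;

-- hacker_id  FreqHacker  HackerCountOfSameFreq  GlobalMaxFreq
-- 1          3           2                      4
-- 2          4           2                      4
-- 3          3           2                      4
-- 4          4           2                      4
-- 5          2           1                      4
```

Rows 1 and 3 fail both conditions (their count is shared and is not the maximum), which is why john and anna drop out of the result.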

Once the syntax is fixed, your query also produces the correct result (at least for this sample data).

I split it into CTEs and left most of your logic as is:

WITH
d
AS
(
    SELECT hacker_id, COUNT(challenge_id) AS FreqHacker 
    FROM @Challenges 
    GROUP BY hacker_id
)
,dd
AS
(
    SELECT d.FreqHacker, COUNT(d.FreqHacker) as FreqOfFreq 
    FROM d
    GROUP BY d.FreqHacker
)
,d3
AS
(
    SELECT 
        h.hacker_id, 
        h.name, 
        COUNT(c.challenge_id) AS ChalCountPerHead
    FROM    
        @Hackers h 
        JOIN @Challenges c ON h.hacker_id = c.hacker_id
    GROUP BY h.hacker_id, h.name
)
,d4
AS
(
    SELECT *
    FROM 
        d3
        LEFT JOIN dd ON dd.FreqHacker = ChalCountPerHead
)
SELECT *
FROM d4
WHERE
    ChalCountPerHead = (SELECT MAX(d.FreqHacker) from d)
    OR FreqOfFreq = 1
ORDER BY hacker_id
;
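
As for the error itself: the ON clause is evaluated logically before GROUP BY, so at that point COUNT(c.challenge_id) does not exist yet. Once the aggregate is computed in a derived table or CTE, its result is an ordinary column and can be used in ON. A minimal sketch of that pattern, using hypothetical inline data and made-up names (Counts, Cnt, Freq, HackersWithCnt) purely for illustration:

```sql
-- Hypothetical data: hacker 1 has two challenges, hacker 2 has one.
WITH Counts AS
(
    -- aggregate first, so Cnt becomes a plain column
    SELECT hacker_id, COUNT(*) AS Cnt
    FROM (VALUES (1, 1), (2, 1), (3, 2)) AS c(challenge_id, hacker_id)
    GROUP BY hacker_id
)
,Freq AS
(
    -- how many hackers share each count
    SELECT Cnt, COUNT(*) AS HackersWithCnt
    FROM Counts
    GROUP BY Cnt
)
SELECT Counts.hacker_id, Counts.Cnt, Freq.HackersWithCnt
FROM Counts
     LEFT JOIN Freq ON Freq.Cnt = Counts.Cnt   -- no aggregate in ON
ORDER BY Counts.hacker_id;
```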