Select top 5 score results per group in psql

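I need some help with how to select the top 5 "most critical servers per customer".

Currently I have a select statement that returns the hostname, SystemID, customer, and critical value for all machines. The next step for this query is to select only the top 5 most critical servers (highest critical score) for each customer.

My current select statement looks like this: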

SELECT COALESCE(rhss_py_results.Hostname, sid_py_results.Name) AS Hostname, COALESCE(rhss_py_results.SystemID, security_scores.SystemID) AS SystemID, Customer, Critical
FROM rhss_py_results
INNER JOIN sid_py_results
ON rhss_py_results.hostname = sid_py_results.Name
INNER JOIN Customers
ON sid_py_results.SecurityDomain = Customers.SecurityDomain
INNER JOIN security_scores
ON rhss_py_results.SystemID=security_scores.SystemID
ORDER BY Customer;

It returns roughly the following (data changed for privacy):

     hostname      |  systemid  |         customer         | critical
-------------------+------------+--------------------------+----------
 aaa-aaaa_aaaa     | 1000000024 | Anna                     |       48
 aaa-aaa3-aaa1     | 1000000038 | Anna                     |        5
 aaaaaa001         | 1000000013 | Kalle                    |       10
 aaaaaa002         | 1000000043 | Kalle                    |        1
 aaaaaa005         | 1000000087 | Pelle                    |        5
 bbbbbb0010        | 1000000003 | Pelle                    |        0
 cccccc0001        | 1000000029 | Sara                     |        0
 ddd-dddd-c001     | 1000000063 | Anna                     |       26
 ddd-dddd-c002     | 1000000064 | Anna                     |       24
 ddd-dddd-c003     | 1000000012 | Anna                     |        5
 fff-ffff-f001     | 1000000095 | Anna                     |       13
 gggggg0001        | 1000000077 | Sara                     |        0
 gggggg0002        | 1000000040 | Pelle                    |        0
 gggggg0003        | 1000000039 | Pelle                    |        1
 mmmmmm033         | 1000000047 | Kalle                    |       31
 mmmmmm034         | 1000000045 | Kalle                    |       37
 mmmmmm036         | 1000000046 | Pelle                    |        3
 mmmmmm037         | 1000000082 | Pelle                    |        3
 mmmmmm045         | 1000000091 | Håkan                    |        0

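Some customers have only 1 server, others have 15. If a customer has only 1 server, listing just that server is enough. If a customer has several servers with the same critical value competing for a place in the top 5, it is fine to pick the 5 to return based on hostname.

I have more than 32 different customers, and that number may change in the future.

The following result should be the final product (roughly):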

     hostname      |  systemid  |         customer         | critical
-------------------+------------+--------------------------+----------
 aaa-aaaa_aaaa     | 1000000024 | Anna                     |       48
 ddd-dddd-c001     | 1000000063 | Anna                     |       26
 ddd-dddd-c002     | 1000000064 | Anna                     |       24
 fff-ffff-f001     | 1000000095 | Anna                     |       13
 aaa-aaa3-aaa1     | 1000000038 | Anna                     |        5
 mmmmmm045         | 1000000091 | Håkan                    |        0
 mmmmmm034         | 1000000045 | Kalle                    |       37
 mmmmmm033         | 1000000047 | Kalle                    |       31
 aaaaaa001         | 1000000013 | Kalle                    |       10
 aaaaaa002         | 1000000043 | Kalle                    |        1
 aaaaaa005         | 1000000087 | Pelle                    |        5
 mmmmmm036         | 1000000046 | Pelle                    |        3
 mmmmmm037         | 1000000082 | Pelle                    |        3
 gggggg0003        | 1000000039 | Pelle                    |        1
 bbbbbb0010        | 1000000003 | Pelle                    |        0
 cccccc0001        | 1000000029 | Sara                     |        0
 gggggg0001        | 1000000077 | Sara                     |        0

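I have read the following article, but I do not understand how to apply it to my own query, since I am very new to databases. http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

Is there anyone who can help me with this?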

//Ambrose

Use ORDER BY Critical DESC:

SELECT COALESCE(rhss_py_results.Hostname, sid_py_results.Name) AS Hostname, COALESCE(rhss_py_results.SystemID, security_scores.SystemID) AS SystemID, Customer, Critical
FROM rhss_py_results
INNER JOIN sid_py_results
ON rhss_py_results.hostname = sid_py_results.Name
INNER JOIN Customers
ON sid_py_results.SecurityDomain = Customers.SecurityDomain
INNER JOIN security_scores
ON rhss_py_results.SystemID=security_scores.SystemID
ORDER BY Customer,Critical desc;

Use the row_number function:

SELECT Hostname, SystemID, Customer, Critical
FROM
  (SELECT row_number() OVER (PARTITION BY Customer
                             -- highest Critical first; Hostname breaks ties
                             ORDER BY Critical DESC,
                                      COALESCE(rhss_py_results.Hostname, sid_py_results.Name)) AS sno,
          COALESCE(rhss_py_results.Hostname, sid_py_results.Name) AS Hostname,
          COALESCE(rhss_py_results.SystemID, security_scores.SystemID) AS SystemID,
          Customer,
          Critical
   FROM rhss_py_results
   INNER JOIN sid_py_results ON rhss_py_results.hostname = sid_py_results.Name
   INNER JOIN Customers ON sid_py_results.SecurityDomain = Customers.SecurityDomain
   INNER JOIN security_scores ON rhss_py_results.SystemID = security_scores.SystemID) AS t
-- keep at most 5 rows per customer; sort in the outer query so the order is guaranteed
WHERE sno <= 5
ORDER BY Customer, Critical DESC, Hostname;
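
The row_number approach can be tried without the real tables. The following is a minimal sketch, assuming the ranking should go by Critical descending with Hostname as the tie-breaker, as the question describes. The CTE name `servers` is a hypothetical stand-in for the joined result, and its rows are copied from the sample output in the question.

```sql
-- "servers" is a hypothetical stand-in for the four-table join;
-- the rows are taken from the sample data shown in the question.
WITH servers (hostname, systemid, customer, critical) AS (
    VALUES
        ('aaa-aaaa_aaaa', 1000000024, 'Anna',  48),
        ('aaa-aaa3-aaa1', 1000000038, 'Anna',   5),
        ('ddd-dddd-c001', 1000000063, 'Anna',  26),
        ('ddd-dddd-c002', 1000000064, 'Anna',  24),
        ('ddd-dddd-c003', 1000000012, 'Anna',   5),
        ('fff-ffff-f001', 1000000095, 'Anna',  13),
        ('mmmmmm045',     1000000091, 'Håkan',  0)
)
SELECT hostname, systemid, customer, critical
FROM (
    SELECT s.*,
           -- number the rows inside each customer, most critical first;
           -- hostname decides the order among equal critical values
           row_number() OVER (PARTITION BY customer
                              ORDER BY critical DESC, hostname) AS sno
    FROM servers AS s
) AS t
WHERE sno <= 5                              -- at most 5 rows per customer
ORDER BY customer, critical DESC, hostname; -- final display order
```

With these rows, Anna has six servers, two of them tied at a critical value of 5. The hostname tie-breaker keeps aaa-aaa3-aaa1 and drops ddd-dddd-c003, while Håkan's single server is returned as-is. If tied servers should all be returned instead of cutting off at exactly 5, rank() could be used in place of row_number().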