Get rows with maximum count per one column - while grouping by two columns

I'm trying to get the max count of a field. This is what I have, and what I'm trying to do.

| col1 | col2 |
|  A   |  B   |
|  A   |  B   |
|  A   |  D   |
|  A   |  D   |
|  A   |  D   |
|  C   |  F   |
|  C   |  G   |
|  C   |  F   |

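I'm trying to get the value of col2 that occurs most often, per col1.

With this query I get the occurrences grouped by col1 and col2.

SELECT col1, col2, count(*) AS conta
FROM tab
-- WHERE (filter conditions omitted)
GROUP BY col1, col2
ORDER BY col1, col2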

And I get:

| col1 | col2 | conta |
|  A   |  B   |   2   |
|  A   |  D   |   3   |
|  C   |  F   |   2   |
|  C   |  G   |   1   |

Then I use this query to get the max count:

SELECT max(conta) AS conta2, col1
FROM (
    SELECT col1, col2, count(*) AS conta
    FROM tab
    -- WHERE (filter conditions omitted)
    GROUP BY col1, col2
    ORDER BY col1, col2
) AS derivedTable
GROUP BY col1

And I get:

| col1 | conta |
|  A   |   3   |
|  C   |   2   |

What I'm missing is the value of col2. I'd like something like this:

| col1 | col2 | conta |
|  A   |  D   |   3   |
|  C   |  F   |   2   |

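The problem is that if I try to select the col2 field, I get an error saying I have to use this field in the GROUP BY or in an aggregate function, but grouping by it gives the wrong result.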

I misunderstood the question. Here is your solution:

-- Aliases are cnt and rn rather than Count and row, which are keywords in some engines.
;with tablex as
    (Select col1, col2, Count(col2) as cnt From Your_Table Group by col1, col2),
aaaa as
    (Select ROW_NUMBER() over (partition by col1 order by cnt desc) as rn, * From tablex)

Select * From aaaa Where rn = 1
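
As a rough way to try the ROW_NUMBER() approach without a database server, the sketch below runs the same idea against an in-memory SQLite database from Python. The table name `tab`, the aliases `cnt` and `rn`, and the use of SQLite (3.25 or later, for window functions) are assumptions for illustration, not what the question runs on. The extra `col2` tiebreaker in the window's ORDER BY is also an addition, only there to make the result deterministic.

```python
import sqlite3

# Sample data from the question; the table name "tab" is an assumption.
rows = [("A", "B"), ("A", "B"), ("A", "D"), ("A", "D"), ("A", "D"),
        ("C", "F"), ("C", "G"), ("C", "F")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", rows)

query = """
WITH counts AS (
    -- one row per (col1, col2) pair with its number of occurrences
    SELECT col1, col2, COUNT(*) AS cnt
    FROM tab
    GROUP BY col1, col2
),
ranked AS (
    -- number the pairs inside each col1, highest count first
    SELECT col1, col2, cnt,
           ROW_NUMBER() OVER (PARTITION BY col1
                              ORDER BY cnt DESC, col2) AS rn
    FROM counts
)
SELECT col1, col2, cnt
FROM ranked
WHERE rn = 1
ORDER BY col1
"""
top_per_col1 = conn.execute(query).fetchall()
print(top_per_col1)  # [('A', 'D', 3), ('C', 'F', 2)]
```

Because ROW_NUMBER() always assigns exactly one row the number 1 per partition, this returns a single row per col1 even when two col2 values share the top count.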

It may not be the most elegant solution, but using a common table expression may help.

with cte as (
select col1, col2, count(*) as total
from dtable 
group by col1, col2
)
select  col1, col2, total 
from cte c
where total = (select max(total) 
           from cte cc
           where cc.col1 = c.col1)
order by col1 asc 

Returns

col1|col2|total|
----+----+-----+
 A  | D  |    3|
 C  | F  |    2|
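
One thing worth checking with the correlated-subquery version is how it behaves on ties. The sketch below is an illustration only: it runs the same query shape in an in-memory SQLite database from Python, using the `dtable` name from the answer and an extra ('C', 'G') row that I added so that F and G tie within C.

```python
import sqlite3

# Question data plus one extra ("C", "G") row (an assumption, added to force a tie).
rows = [("A", "B"), ("A", "B"), ("A", "D"), ("A", "D"), ("A", "D"),
        ("C", "F"), ("C", "G"), ("C", "F"), ("C", "G")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dtable (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO dtable VALUES (?, ?)", rows)

query = """
WITH cte AS (
    SELECT col1, col2, COUNT(*) AS total
    FROM dtable
    GROUP BY col1, col2
)
SELECT col1, col2, total
FROM cte c
-- keep every pair whose count equals the maximum for its col1
WHERE total = (SELECT MAX(total) FROM cte cc WHERE cc.col1 = c.col1)
ORDER BY col1, col2
"""
result = conn.execute(query).fetchall()
print(result)  # [('A', 'D', 3), ('C', 'F', 2), ('C', 'G', 2)]
```

So with this form, every col2 that reaches the maximum for its col1 comes back, which may or may not be what you want.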

from the docs

You can combine GROUP BY with window functions; they are evaluated after the GROUP BY:

with cte as (
  SELECT col1, col2, 
         count(*) as conta,
         dense_rank() over (partition by col1 order by count(*) desc) as rnk
  FROM tab 
  WHERE ...
  GROUP by col1, col2 
) 
select col1, col2, conta
from cte
where rnk = 1
order by col1, col2;

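If several col2 values tie for the highest count within a col1, this returns all of them, so that col1 appears more than once. If you don't want that, use row_number() instead of dense_rank().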
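
To make the difference between the two ranking functions concrete, here is a small sketch that runs the query above with each of them on data containing a tie. It uses Python's sqlite3 module as a stand-in engine (an assumption; it needs SQLite 3.25 or later), drops the elided WHERE clause, and adds one ('C', 'G') row so that F and G tie within C. The helper name `top_rows` is hypothetical.

```python
import sqlite3

# Question data plus one extra ("C", "G") row (an assumption, added to force a tie).
rows = [("A", "B"), ("A", "B"), ("A", "D"), ("A", "D"), ("A", "D"),
        ("C", "F"), ("C", "G"), ("C", "F"), ("C", "G")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", rows)


def top_rows(rank_fn):
    """Return the rank-1 rows per col1 using the given ranking function."""
    if rank_fn not in ("dense_rank", "row_number"):
        raise ValueError("unsupported ranking function")
    query = f"""
    WITH cte AS (
        SELECT col1, col2,
               COUNT(*) AS conta,
               {rank_fn}() OVER (PARTITION BY col1
                                 ORDER BY COUNT(*) DESC) AS rnk
        FROM tab
        GROUP BY col1, col2
    )
    SELECT col1, col2, conta
    FROM cte
    WHERE rnk = 1
    ORDER BY col1, col2
    """
    return conn.execute(query).fetchall()


with_ties = top_rows("dense_rank")   # both tied rows for C are kept
one_each = top_rows("row_number")    # exactly one row per col1
print(with_ties)
print(one_each)
```

With row_number() and no further tiebreaker in the window's ORDER BY, which of the tied rows is kept is up to the engine, so add a second sort key if you need a stable answer.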

Online example

Using a window function:

select distinct on (col1) col1, col2, cnt
from 
(
 select col1, col2, count(*) over (partition by col1, col2) cnt 
 from the_table
) t
order by col1, cnt desc;

| col1 | col2 | cnt |
|  A   |  D   |  3  |
|  C   |  F   |  2  |

This solution does not handle ties.

Simpler, faster (and correct):

SELECT DISTINCT ON (col1)
       col1, col2, count(*) AS conta
FROM   tab 
GROUP  BY col1, col2 
ORDER  BY col1, conta DESC;

db<>fiddle here (based on a_horse's fiddle)

DISTINCT ON is applied after aggregation, so we don't need a subquery or CTE. Consider the sequence of events in a SELECT query:

  • Best way to get result count before LIMIT was applied
  • Select first row in each GROUP BY group?
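
The sequence described above (aggregate first, then sort, then keep the first row per col1) can be mimicked in plain Python. This is only a conceptual model of the query's logical steps, not how PostgreSQL executes it, and the variable names are mine.

```python
from collections import Counter

# Sample data from the question.
rows = [("A", "B"), ("A", "B"), ("A", "D"), ("A", "D"), ("A", "D"),
        ("C", "F"), ("C", "G"), ("C", "F")]

# Step 1: GROUP BY col1, col2 with count(*) AS conta
grouped = [(col1, col2, n) for (col1, col2), n in Counter(rows).items()]

# Step 2: ORDER BY col1, conta DESC
grouped.sort(key=lambda r: (r[0], -r[2]))

# Step 3: DISTINCT ON (col1) keeps the first row of each col1 in that order
result = []
seen = set()
for col1, col2, conta in grouped:
    if col1 not in seen:
        seen.add(col1)
        result.append((col1, col2, conta))

print(result)  # [('A', 'D', 3), ('C', 'F', 2)]
```

Since step 3 only ever looks at already aggregated rows, no subquery or CTE is needed to get the count into the same row as col2.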