Limiting GROUP BY based on COUNT() values in MySQL
I am logging events to a MySQL database and would like to fetch the top 3 events for monitoring purposes.
My table eventlog looks like this:
+----+------------------+---------------------+
| id | eventname        | eventdate           |
+----+------------------+---------------------+
|  0 | machine1.started | 2016-09-04 19:22:23 |
|  1 | machine2.reboot  | 2016-09-04 20:23:11 |
|  2 | machine1.stopped | 2016-09-04 20:24:12 |
|  3 | machine1.started | 2016-09-04 20:25:12 |
|  4 | machine1.stopped | 2016-09-04 23:23:16 |
|  5 | machine0.started | 2016-09-04 23:24:00 |
|  6 | machine1.started | 2016-09-04 23:24:16 |
|  7 | machine3.started | 2016-09-04 23:25:00 |
|  8 | machine4.started | 2016-09-04 23:26:00 |
|  9 | cluster.alive    | 2016-09-04 23:30:00 |
| 10 | cluster.alive    | 2016-09-05 11:30:00 |
+----+------------------+---------------------+
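The query should ultimately return the following:
- the top 3 most frequent events (based on the eventcount column generated by MySQL's COUNT() function), grouped by their eventname
- only 2 rows where eventcount = 1, and only if 1 is among the top 3 eventcounts (there are many events that occur only once, and they would otherwise overload my frontend)
Example of the desired result, based on the table above: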
+------------+------------------+
| eventcount | eventname        |
+------------+------------------+
|          3 | machine1.started |
|          2 | machine1.stopped |
|          2 | cluster.alive    |
|          1 | machine0.started |
|          1 | machine2.started |
+------------+------------------+
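Note that I don't just need 3 returned rows, but the rows with the 3 highest eventcounts.
I experimented a lot with the query string below, including multiple selects and questionable CASE ... WHEN conditions, but couldn't get it to work the way I need.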
SELECT COUNT(id) AS 'eventcount', eventname
FROM eventlog
GROUP BY eventname
ORDER BY eventcount DESC;
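What is the best way to get the desired result efficiently?
These kinds of situations are a pain in MySQL. One approach uses variables. Here is a method that does not: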
SELECT el.eventcount, el.eventname
FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
FROM eventlog el
GROUP BY el.eventname
) el JOIN
(SELECT cnt
FROM (SELECT DISTINCT COUNT(el.id) as cnt
FROM eventlog el
GROUP BY el.eventname
) el
ORDER BY cnt DESC
LIMIT 3
) ell
ON ell.cnt = el.eventcount
ORDER BY el.eventcount DESC;
EDIT:
A solution using variables looks like the following; it includes the limit of 2 rows for a count of 1:
SELECT *
FROM (SELECT e.*,
(@rn1 := if(@c1 = eventcount, @rn1 + 1,
if(@c1 := eventcount, 1, 1)
)
) as rn
FROM (SELECT e.*,
(@rn := if(@c = eventcount, @rn,
if(@c := eventcount, @rn + 1, @rn + 1)
)
) as rank
FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
FROM eventlog el
GROUP BY el.eventname
) e CROSS JOIN
(SELECT @c := 0, @rn := 0) params
ORDER BY eventcount DESC
) e CROSS JOIN
(SELECT @c1 := 0, @rn1 := 0) params
ORDER BY eventcount DESC
) e
WHERE rank <= 3 AND
(eventcount > 1 OR rn <= 2);
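The innermost subquery enumerates the distinct counts (rank). The second enumerates the rows within each count (rn). The two can actually be combined into one subquery, but be careful.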
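For readers on MySQL 8.0 or later, the same two enumerations can be expressed with window functions instead of user variables. The following is only a sketch under that version assumption (window functions do not exist in MySQL 5.x); the names cnt_rank, rn and ranked are illustrative and not part of the original answer.

```sql
-- Sketch for MySQL 8.0+ only; cnt_rank, rn and ranked are illustrative names.
SELECT eventcount, eventname
FROM (SELECT e.eventcount, e.eventname,
             -- position of this count among the distinct counts (1 = highest)
             DENSE_RANK() OVER (ORDER BY e.eventcount DESC) AS cnt_rank,
             -- position of this row within the group of rows sharing its count
             ROW_NUMBER() OVER (PARTITION BY e.eventcount
                                ORDER BY e.eventname) AS rn
      FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
            FROM eventlog el
            GROUP BY el.eventname
           ) e
     ) ranked
WHERE cnt_rank <= 3                      -- keep only the 3 highest counts
  AND (eventcount > 1 OR rn <= 2)        -- at most 2 rows with a count of 1
ORDER BY eventcount DESC, eventname;
```

Ordering by eventname inside ROW_NUMBER() is an arbitrary choice made here so that the two surviving count-1 rows are deterministic; with the sample table they would be machine0.started and machine2.reboot. Also note that RANK is a reserved word from MySQL 8.0.2 on, so the rank alias in the variable-based query above would need backticks or a different name there.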
You can try this:
SELECT count(eventname), eventname FROM eventlog
group by eventname
HAVING(count(eventname)) > 1
order by count(eventname) DESC
limit 3
Here is one way using variables.
SQL Fiddle: http://sqlfiddle.com/#!9/b3458b/16
SELECT
t2.eventcount
,t2.eventname
FROM
(
SELECT
t.eventname
,t.eventcount
,@Rank:=IF(@PrevCount=t.eventcount,@Rank,@Rank+1) Rank
,@CountRownum:=IF(@PrevCount=t.eventcount,@CountRowNum + 1,1) CountRowNum
,@PrevCount:= t.eventcount
FROM
(
SELECT
l.eventname
,COUNT(*) as eventcount
FROM
eventlog l
GROUP BY
l.eventname
ORDER BY
COUNT(*) DESC
) t
CROSS JOIN (SELECT @Rank:=0, @CountRowNum:=0, @PrevCount:=-1) var
ORDER BY
t.eventcount DESC
) t2
WHERE
t2.Rank < 4
AND NOT (t2.eventcount = 1 AND t2.CountRowNum > 2)
This could probably be refactored a bit, but for now it returns the correct answer:
SELECT eventcount, eventname
FROM
(SELECT el.eventcount, el.eventname
FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
FROM eventlog el
GROUP BY el.eventname
) el JOIN
(SELECT counts
FROM (SELECT DISTINCT COUNT(el.id) as counts
FROM eventlog el
GROUP BY el.eventname
) el
ORDER BY counts DESC
LIMIT 3
) el2
ON el2.counts = el.eventcount
WHERE el.eventcount != 1
UNION ALL
(SELECT el.eventcount, el.eventname
FROM (SELECT COUNT(el.id) AS eventcount, el.eventname
FROM eventlog el
GROUP BY el.eventname
) el JOIN
(SELECT counts
FROM (SELECT DISTINCT COUNT(el.id) as counts
FROM eventlog el
GROUP BY el.eventname
) el
ORDER BY counts DESC
LIMIT 3
) el2
ON el2.counts = el.eventcount AND el2.counts = 1
LIMIT 2)) tmp
ORDER BY tmp.eventcount DESC;
SQL Fiddle: http://sqlfiddle.com/#!9/10f0d/92
If you can use temporary tables...
Precompute the event counts and store the result in a temporary table:
create temporary table tmp_eventcounts
select eventname, count(1) as eventcount
from eventlog
group by eventname
order by eventcount desc
;
Contents of tmp_eventcounts:
| eventname | eventcount |
|------------------|------------|
| machine1.started | 3 |
| machine1.stopped | 2 |
| cluster.alive | 2 |
| machine3.started | 1 |
| machine2.reboot | 1 |
| machine4.started | 1 |
| machine0.started | 1 |
Select the top 3 event counts and store them in another temporary table:
create temporary table tmp_top3counts
select distinct eventcount
from tmp_eventcounts
order by eventcount desc
limit 3
;
Contents of tmp_top3counts:
| eventcount |
|------------|
| 3 |
| 2 |
| 1 |
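Now select all event names that have one of the top 3 event counts and eventcount > 1.
Additionally, select at most two event names that have one of the top 3 event counts and eventcount = 1.
Combine the two results using UNION ALL: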
select eventcount, eventname
from tmp_top3counts
join tmp_eventcounts using(eventcount)
where eventcount > 1
union all (
select eventcount, eventname
from tmp_top3counts
join tmp_eventcounts using(eventcount)
where eventcount = 1
limit 2
)
order by eventcount desc;
Result:
| eventcount | eventname |
|------------|------------------|
| 3 | machine1.started |
| 2 | machine1.stopped |
| 2 | cluster.alive |
| 1 | machine2.reboot |
| 1 | machine3.started |
http://sqlfiddle.com/#!9/b332df/1
If you cannot use temporary tables, you can replace each occurrence with its definition and end up with a highly unreadable but working query:
select eventcount, eventname
from (
select distinct eventcount
from (
select eventname, count(1) as eventcount
from eventlog
group by eventname
) tmp_eventcounts
order by eventcount desc
limit 3
) tmp_top3counts
join (
select eventname, count(1) as eventcount
from eventlog
group by eventname
) tmp_eventcounts using(eventcount)
where eventcount > 1
union all (
select eventcount, eventname
from (
select distinct eventcount
from (
select eventname, count(1) as eventcount
from eventlog
group by eventname
) tmp_eventcounts
order by eventcount desc
limit 3
) tmp_top3counts
join (
select eventname, count(1) as eventcount
from eventlog
group by eventname
) tmp_eventcounts using(eventcount)
where eventcount = 1
limit 2
)
order by eventcount desc;
http://sqlfiddle.com/#!9/2eea6/4 ;-)
Although this looks crazy, it can be built easily in PHP:
$tmp_eventcounts = "
select eventname, count(1) as eventcount
from eventlog
group by eventname
";
$tmp_top3counts = "
select distinct eventcount
from ( {$tmp_eventcounts} ) tmp_eventcounts
order by eventcount desc
limit 3
";
$sql = "
select eventcount, eventname
from ( {$tmp_top3counts} ) tmp_top3counts
join ( {$tmp_eventcounts} ) tmp_eventcounts using(eventcount)
where eventcount > 1
union all (
select eventcount, eventname
from ( {$tmp_top3counts} ) tmp_top3counts
join ( {$tmp_eventcounts} ) tmp_eventcounts using(eventcount)
where eventcount = 1
limit 2
)
order by eventcount desc
";
Note: It looks as if MySQL has to execute the same subqueries again and again. But it should be able to cache the results and reuse them.
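On MySQL 8.0 or later, the repetition can be avoided inside the SQL itself by naming the two subqueries as common table expressions. This is only a sketch and assumes a server version with CTE support (they are not available in MySQL 5.x); it reuses the names tmp_eventcounts and tmp_top3counts from above purely for readability.

```sql
-- Sketch for MySQL 8.0+ only: the two "temporary tables" become CTEs,
-- so each definition is written once and referenced by name.
WITH tmp_eventcounts AS (
    SELECT eventname, COUNT(1) AS eventcount
    FROM eventlog
    GROUP BY eventname
),
tmp_top3counts AS (
    -- the three highest distinct counts
    SELECT DISTINCT eventcount
    FROM tmp_eventcounts
    ORDER BY eventcount DESC
    LIMIT 3
)
SELECT eventcount, eventname
FROM tmp_top3counts
JOIN tmp_eventcounts USING (eventcount)
WHERE eventcount > 1
UNION ALL (
    -- at most two of the events that occurred exactly once
    SELECT eventcount, eventname
    FROM tmp_top3counts
    JOIN tmp_eventcounts USING (eventcount)
    WHERE eventcount = 1
    LIMIT 2
)
ORDER BY eventcount DESC;
```

As far as I know, when the optimizer materializes a CTE it does so once per query even if the CTE is referenced several times, which is the reuse the note above hopes for. Checking the plan with EXPLAIN on your own server is the safest way to confirm it.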