How can I make this MySQL query faster if I cannot add indexes or partitioning?

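I am building a web application on top of a Zabbix MySQL (MariaDB) database.

I need to display a table listing all hostnames and each host's current problems. To do this I run the following SQL query. During a single HTTP GET request I run 7 such queries, each with a different events.name value, to check for every possible problem: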

SELECT distinct(hosts.hostid), max(CONVERT(CONCAT(events.eventid, events.value, events.severity), UNSIGNED))
FROM hosts
INNER JOIN hosts_groups ON hosts.hostid = hosts_groups.hostid
INNER JOIN hstgrp ON hosts_groups.groupid = hstgrp.groupid

INNER JOIN items ON hosts.hostid = items.hostid
INNER JOIN functions ON items.itemid = functions.itemid
INNER JOIN events ON functions.triggerid = events.objectid
WHERE events.name = %s
AND hstgrp.groupid = %s
AND hosts.status != 3 # 3 - not templates
GROUP BY hosts.hostid;

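The total time for these SQL queries varies from 20 to 120 seconds. I suspect the problem is related to the size of the events table and to the fact that new events are added to it very quickly.

Result of the EXPLAIN command:

I think I could try adding an index on the events.name column, but I am afraid this could have a negative effect on the Zabbix application itself. Another option is partitioning, but Zabbix has its own partitioning how-to, so I am wary of doing that as well.

What other options do I have to make the query faster, and what causes such a large variation in query time (up to 6-7 times)?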

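Edit:

If I limit the events by time, for example to the last 10 days, the query runs faster, but I lose some events, because a problem event may have occurred a month ago and never been resolved since.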

SELECT hosts.hostid, max(CONVERT(CONCAT(events.eventid, events.value, events.severity), UNSIGNED))
FROM hosts
INNER JOIN hosts_groups ON hosts.hostid = hosts_groups.hostid
INNER JOIN hstgrp ON hosts_groups.groupid = hstgrp.groupid

INNER JOIN items ON hosts.hostid = items.hostid
INNER JOIN functions ON items.itemid = functions.itemid
INNER JOIN events ON functions.triggerid = events.objectid
WHERE events.eventid >= (select eventid from events  where events.clock >= 1602773508 limit 1) AND events.name = "Устройство недоступно"
AND hstgrp.groupid = 15
AND hosts.status != 3 # 3 - not templates
GROUP BY hosts.hostid;

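Edit:

The results from the problem table contradict the results from the events table: hosts that the problem table reports as unreachable can be reached by ping and are not marked as unreachable in the Zabbix interface. The query: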

SELECT distinct(hosts.hostid) FROM hosts
INNER JOIN hosts_groups ON hosts.hostid = hosts_groups.hostid
INNER JOIN hstgrp ON hosts_groups.groupid = hstgrp.groupid

INNER JOIN items ON hosts.hostid = items.hostid
INNER JOIN functions ON items.itemid = functions.itemid
INNER JOIN problem ON functions.triggerid = problem.objectid
WHERE problem.name = "Device is unreachable"
AND hstgrp.groupid = 15
AND hosts.status != 3 ;

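I also found that a single host can have several problems with the same name but different times (clock), although I expected at most one problem with a given name per host: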

SELECT hosts.hostid, problem.name, problem.clock FROM hosts
INNER JOIN hosts_groups ON hosts.hostid = hosts_groups.hostid
INNER JOIN hstgrp ON hosts_groups.groupid = hstgrp.groupid

INNER JOIN items ON hosts.hostid = items.hostid
INNER JOIN functions ON items.itemid = functions.itemid
INNER JOIN problem ON functions.triggerid = problem.objectid
WHERE problem.name = "Device is unreachable"
AND hstgrp.groupid = 15
AND hosts.status != 3 ;

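Rows from the problem table for one host (hostid, name, clock):

10398  Device is unreachable  1603625463
10398  Device is unreachable  1603630863
10398  Device is unreachable  1603661463
10398  Device is unreachable  1603679463
10398  Device is unreachable  1603697463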

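If you cannot create an index on events.name or hstgrp.groupid, you are out of luck. If you already have an index on events.name, try forcing its use with an index hint, e.g. JOIN events FORCE INDEX (index_name). You may also have to rearrange the JOIN order so that the events table comes first, and use STRAIGHT_JOIN to prevent the optimizer from reordering the joins. If that does not help (or makes things worse), there is not much else you can do.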
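As a rough sketch of the hint-based approach, the original query could be reordered as below. The index name idx_events_name is hypothetical; use whatever name SHOW INDEX FROM events reports for an index on events.name, if one exists. Whether this is actually faster depends on the data, so compare EXPLAIN output before and after.

```sql
-- Sketch only: idx_events_name is a hypothetical index name, not a stock Zabbix index.
-- SELECT STRAIGHT_JOIN makes the server join the tables in the order they are listed,
-- so events is read first through the forced index.
SELECT STRAIGHT_JOIN
       hosts.hostid,
       MAX(CONVERT(CONCAT(events.eventid, events.value, events.severity), UNSIGNED))
FROM events FORCE INDEX (idx_events_name)
INNER JOIN functions ON functions.triggerid = events.objectid
INNER JOIN items ON items.itemid = functions.itemid
INNER JOIN hosts ON hosts.hostid = items.hostid
INNER JOIN hosts_groups ON hosts_groups.hostid = hosts.hostid
INNER JOIN hstgrp ON hstgrp.groupid = hosts_groups.groupid
WHERE events.name = %s
  AND hstgrp.groupid = %s
  AND hosts.status != 3 -- exclude templates
GROUP BY hosts.hostid;
```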

Edit: try this.

Extend the index on events(objectid) to events(objectid, name) and change the query to:

SELECT distinct(hosts.hostid), 
max(CONVERT(CONCAT(events.eventid, events.value, 
events.severity), UNSIGNED))
FROM hosts
INNER JOIN hosts_groups ON hosts.hostid = hosts_groups.hostid
INNER JOIN hstgrp ON hosts_groups.groupid = hstgrp.groupid 
INNER JOIN items ON hosts.hostid = items.hostid
INNER JOIN functions ON items.itemid = functions.itemid
INNER JOIN events ON functions.triggerid = events.objectid AND events.name = %s
WHERE hstgrp.groupid = %s
AND hosts.status != 3 # 3 - not templates
GROUP BY hosts.hostid;
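One way the composite index might be created is sketched below. The index name is made up for illustration, and the sketch assumes you are allowed to alter the Zabbix schema at all, which the question suggests may not be the case. If events.name is a long VARCHAR in your schema, InnoDB may reject a full-column index, in which case a prefix length (the value 64 here is an arbitrary choice) is one possible workaround.

```sql
-- Hypothetical index name; adding it changes the Zabbix schema and slows inserts slightly.
CREATE INDEX events_objectid_name ON events (objectid, name);

-- Alternative if the full-column index is rejected because the key is too long:
-- CREATE INDEX events_objectid_name ON events (objectid, name(64));

-- Confirm which indexes exist afterwards.
SHOW INDEX FROM events;
```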

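This answer is put together from the comments.

Although the best approach is to use the problem.get API, you can infer from its documentation how the problem table works and use that table in SQL queries: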

This method is for retrieving unresolved problems. It is also possible, if specified, to additionally retrieve recently resolved problems. The period that determines how old is “recently” is defined in Administration → General.

Problems that were resolved prior to that period are not kept in the problem table. To retrieve problems that were resolved further back in the past, use the event.get method.

You should join the problem table instead of the events table, which contains every event that has ever occurred.
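A minimal sketch of such a query follows. It assumes the problem table has an r_eventid column that stays NULL until the problem is resolved, and that source = 0 and object = 0 identify trigger-generated problems; check both against your Zabbix schema version. Filtering on r_eventid might also explain the hosts that looked unreachable in the question, since recently resolved problems stay in the table for a while according to the documentation quoted above.

```sql
-- Sketch under assumptions: r_eventid IS NULL is taken to mean "not yet resolved",
-- and source = 0 / object = 0 are taken to mean trigger problems.
SELECT hosts.hostid,
       MAX(problem.eventid)  AS latest_eventid,
       MAX(problem.severity) AS max_severity
FROM hosts
INNER JOIN hosts_groups ON hosts.hostid = hosts_groups.hostid
INNER JOIN hstgrp ON hosts_groups.groupid = hstgrp.groupid
INNER JOIN items ON hosts.hostid = items.hostid
INNER JOIN functions ON items.itemid = functions.itemid
INNER JOIN problem ON functions.triggerid = problem.objectid
WHERE problem.name = %s
  AND problem.source = 0
  AND problem.object = 0
  AND problem.r_eventid IS NULL
  AND hstgrp.groupid = %s
  AND hosts.status != 3 -- exclude templates
GROUP BY hosts.hostid;
```

Grouping by hosts.hostid collapses the duplicate rows that appear when one trigger references several items on the same host.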