MySQL - need help in improving query performance
The original question was about where best to set the tx isolation to READ UNCOMMITTED, but after some advice it seems my initial thinking was incorrect.
DDL
CREATE TABLE `tblgpslog` (
`GPSLogID` BIGINT(20) NOT NULL AUTO_INCREMENT,
`DTSaved` DATETIME NULL DEFAULT NULL,
`PrimaryAssetID` BIGINT(20) NULL DEFAULT NULL,
`SecondaryAssetID` BIGINT(20) NULL DEFAULT NULL,
`ThirdAssetID` BIGINT(20) NULL DEFAULT NULL,
`JourneyType` CHAR(1) NOT NULL DEFAULT 'B',
`DateStamp` DATETIME NULL DEFAULT NULL,
`Status` VARCHAR(50) NULL DEFAULT NULL,
`Location` VARCHAR(255) NULL DEFAULT '',
`Latitude` DECIMAL(11,8) NULL DEFAULT NULL,
`Longitude` DECIMAL(11,8) NULL DEFAULT NULL,
`GPSFix` CHAR(2) NULL DEFAULT NULL,
`Speed` BIGINT(20) NULL DEFAULT NULL,
`Heading` INT(11) NULL DEFAULT NULL,
`LifeOdometer` BIGINT(20) NULL DEFAULT NULL,
`Extra` VARCHAR(20) NULL DEFAULT NULL,
`BatteryLevel` VARCHAR(5) NULL DEFAULT '--',
`Ignition` TINYINT(4) NOT NULL DEFAULT '1',
`Radius` INT(11) NOT NULL DEFAULT '0',
`GSMLatitude` DECIMAL(11,8) NOT NULL DEFAULT '0.00000000',
`GSMLongitude` DECIMAL(11,8) NOT NULL DEFAULT '0.00000000',
PRIMARY KEY (`GPSLogID`),
UNIQUE INDEX `GPSLogID` (`GPSLogID`),
INDEX `SecondaryUnitID` (`SecondaryAssetID`),
INDEX `ThirdUnitID` (`ThirdAssetID`),
INDEX `DateStamp` (`DateStamp`),
INDEX `PrimaryUnitIDDateStamp` (`PrimaryAssetID`, `DateStamp`, `Status`),
INDEX `Location` (`Location`),
INDEX `DTSaved` (`DTSaved`),
INDEX `PrimaryAssetID` (`PrimaryAssetID`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=153076364
;
The original query is as follows:
SELECT L.GPSLogID, L.DateStamp, L.Status, Location, Latitude, Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel, Ignition, L.Extra
FROM tblGPSLog L
WHERE PrimaryAssetID = 183 AND L.GPSLogID > 147694199
ORDER BY DateStamp ASC
LIMIT 100;
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","SIMPLE","L","index_merge","PRIMARY,GPSLogID,PrimaryUnitIDDateStamp,PrimaryAssetID","PrimaryAssetID,PRIMARY","9,8",\N,"96","Using intersect(PrimaryAssetID,PRIMARY); Using where; Using filesort"
This became a problem a few months ago. After some investigation I changed the query to the following, but it now performs much the same.
EXPLAIN SELECT GPSLogID, DateStamp, tmpA.Status, Location, Latitude, Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel, Ignition, tmpA.Extra,
PrimaryAssetID FROM (SELECT L.GPSLogID, L.DateStamp, L.Status, Location, Latitude, Longitude, GPSFix, Speed, Heading, LifeOdometer,
BatteryLevel, Ignition, L.Extra, PrimaryAssetID
FROM tblGPSLog L
WHERE L.GPSLogID > 147694199) AS tmpA
WHERE PrimaryAssetID = 183
ORDER BY DateStamp ASC;
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","PRIMARY","<derived2>","ALL",\N,\N,\N,\N,"5380842","Using where; Using filesort"
"2","DERIVED","L","range","PRIMARY,GPSLogID","PRIMARY","8",\N,"8579290","Using where"
Thanks for any advice.
Jim
I believe setting tx isolation to READ UNCOMMITTED, will stop the SELECT from locking the table.
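Why do you think READ UNCOMMITTED would achieve that?
SELECT is already non-locking by default in every isolation level except SERIALIZABLE.
That is, unless you use FOR UPDATE or FOR SHARE / LOCK IN SHARE MODE, a SELECT is always non-locking. Under the SERIALIZABLE isolation level, a plain SELECT is implicitly converted to a locking SELECT ... FOR SHARE (when autocommit is disabled). See https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html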
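To make the distinction concrete, here is a minimal sketch against the tblgpslog table from the question. The ID values are simply the ones from the question, reused for illustration. FOR SHARE is MySQL 8.0 syntax; older versions spell it LOCK IN SHARE MODE.

```sql
-- Show the isolation level in effect for this session.
-- (Before MySQL 5.7.20 the variable is named tx_isolation instead.)
SELECT @@transaction_isolation;

-- A plain SELECT is a consistent non-locking read under
-- READ UNCOMMITTED, READ COMMITTED and REPEATABLE READ.
SELECT GPSLogID, DateStamp
FROM tblgpslog
WHERE PrimaryAssetID = 183
LIMIT 10;

-- Locking reads happen only when you ask for them explicitly.
START TRANSACTION;
SELECT GPSLogID FROM tblgpslog WHERE GPSLogID = 147694199 FOR SHARE;   -- shared lock
SELECT GPSLogID FROM tblgpslog WHERE GPSLogID = 147694199 FOR UPDATE;  -- exclusive lock
COMMIT;
```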
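I strongly recommend never using READ UNCOMMITTED. It is a bad idea because your transaction can read other transactions' uncommitted work, which means you can read inconsistent data (partially completed transactions) and phantom data (changes from uncommitted transactions that are eventually rolled back). There is no benefit to it, and queries may return wrong results.
What makes you think locking is the cause of your performance problem? Have you observed increased lock time in the slow query log?
Performance problems are more commonly caused by poorly optimized queries or insufficient system resources.
If your database has become slow after 8+ years, my guess is that it has grown to the point where the active dataset no longer fits in RAM.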
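One rough way to test that guess, sketched under the assumption that the table lives in the current default schema: compare the InnoDB buffer pool size with the table's data plus index size, then look at how often reads miss the buffer pool. The sizes reported by information_schema are estimates for InnoDB tables, so treat the numbers as approximate.

```sql
-- Configured buffer pool size, in MB.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;

-- Approximate on-disk size of the table (data + indexes), in MB.
SELECT TABLE_NAME,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS total_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'tblgpslog';

-- Innodb_buffer_pool_reads counts reads that had to go to disk;
-- Innodb_buffer_pool_read_requests counts all logical reads.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

If the table is several times larger than the buffer pool and the disk-read counter climbs quickly while the slow query runs, that would be consistent with the working set no longer fitting in RAM.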
Re your comment:
Is there a tool or way to investigate this further? I know the query that causing the issue, just can't determine why
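There are many tools and methods for investigating. There are books on the subject, like High Performance MySQL, and whole companies devoted to creating performance monitoring tools, like Percona and VividCortex.
Without knowing more specifics, I can't guess at a suggestion. If you want more help, could you please edit your original question above and add:
- The SQL query that is having the problem.
- The output of EXPLAIN <query> for the problem query.
- The output of SHOW CREATE TABLE <tablename> for each table the query references. You can run this statement in the MySQL client.
That's for starters.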
Your statement
its rare that an SELECT would hit the table while INSERT is happening and even if it does, it wouldn't cause any great issues.
DELETE statements are scheduled once a week only at off peak hours,
amounts to "Changing the isolation mode won't help much."
I suggest setting long_query_time=1 and turning on the slowlog. Later, run pt-query-digest on the slowlog to find the few "worst" queries. Then let's discuss improving them.
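A minimal sketch of turning the slowlog on at runtime, assuming an account with the privilege to set global variables. These settings do not survive a server restart unless they are also placed in the server's configuration file, and the new long_query_time applies only to connections opened after the change.

```sql
-- Enable the slow query log and log anything slower than 1 second.
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;

-- Find out where the server is writing the log,
-- so the file can be handed to pt-query-digest later.
SHOW GLOBAL VARIABLES LIKE 'slow_query_log_file';
```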
More
INDEX `PrimaryUnitIDDateStamp` (`PrimaryAssetID`, `DateStamp`,
INDEX `PrimaryAssetID` (`PrimaryAssetID`)
The first takes care of the second, so the second is unnecessary.
PRIMARY KEY (`GPSLogID`),
UNIQUE INDEX `GPSLogID` (`GPSLogID`),
A PK is a UNIQUE key, so drop the second of those. The extra unique index slows down inserts and wastes disk space.
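The two redundant indexes noted above could be removed in a single ALTER. This is a sketch, not a tested migration: confirm the index names with SHOW INDEX first, and try it on a copy of the table, because removing an index can change which plan the optimizer picks.

```sql
-- List the existing indexes to confirm their names.
SHOW INDEX FROM tblgpslog;

-- Drop the duplicate of the primary key and the single-column index
-- that is already a prefix of PrimaryUnitIDDateStamp.
ALTER TABLE tblgpslog
  DROP INDEX `GPSLogID`,
  DROP INDEX `PrimaryAssetID`;
```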
Here, I see no reason to have a query and a subquery:
SELECT GPSLogID, DateStamp, tmpA.Status, Location, Latitude,
Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel,
Ignition, tmpA.Extra, PrimaryAssetID
FROM
( SELECT L.GPSLogID, L.DateStamp, L.Status, Location, Latitude,
Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel,
Ignition, L.Extra, PrimaryAssetID
FROM tblGPSLog L
WHERE L.GPSLogID > 147694199
) AS tmpA
WHERE PrimaryAssetID = 183
ORDER BY DateStamp ASC;
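Folding the outer filter into the inner query gives a single-level statement, which is essentially the original query from the question without its LIMIT. A sketch of the flattened form:

```sql
-- Same columns and filters as the derived-table version,
-- but with both conditions in one WHERE clause.
SELECT GPSLogID, DateStamp, Status, Location, Latitude,
       Longitude, GPSFix, Speed, Heading, LifeOdometer, BatteryLevel,
       Ignition, Extra, PrimaryAssetID
FROM tblgpslog
WHERE PrimaryAssetID = 183
  AND GPSLogID > 147694199
ORDER BY DateStamp ASC;
```

Written this way the optimizer can consider indexes on both columns, instead of materializing millions of rows into a derived table and scanning it, as the second EXPLAIN in the question shows.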
A pair of DECIMAL(11,8) columns adds up to 12 bytes, which is overkill for lat & lng. See this for smaller alternatives.
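As one illustration of a smaller alternative, and assuming that six decimal places (roughly 0.1 m at the equator) are precise enough for this application, DECIMAL(8,6) for latitude takes 4 bytes and DECIMAL(9,6) for longitude takes 5 bytes, for 9 bytes per pair instead of 12. The precision choice is an assumption, not something stated in the question. Existing values would be rounded to six decimals, and the ALTER rebuilds the whole table, so this is worth trying on a copy first.

```sql
-- Shrink the coordinate columns; values are rounded to 6 decimal places.
ALTER TABLE tblgpslog
  MODIFY `Latitude`  DECIMAL(8,6) NULL DEFAULT NULL,  -- range -90.000000 .. 90.000000
  MODIFY `Longitude` DECIMAL(9,6) NULL DEFAULT NULL;  -- range -180.000000 .. 180.000000
```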
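The table has been growing all along, right? And performance dropped sharply once it got this big? Shrinking datatypes to shrink the table is one approach, though only a temporary fix.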
Using intersect(PrimaryAssetID,PRIMARY) -- Building a composite index is almost always better than relying on "Index merge intersect".
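While
INDEX `PrimaryAssetID` (`PrimaryAssetID`)
should be equivalent to
INDEX `PrimaryAssetID` (`PrimaryAssetID`, GPSLogID)
(an InnoDB secondary index implicitly carries the primary key columns), something is preventing the optimizer from using it that way. I suggest adding this 2-column composite index. Perhaps a large fraction of the rows have PrimaryAssetID = 183? If convenient, please run SELECT COUNT(*) FROM tblgpslog WHERE PrimaryAssetID = 183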
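A sketch of the suggested change, followed by the two checks. The index name PrimaryAssetID_GPSLogID is a placeholder chosen for this example; any unused name works.

```sql
-- Add the explicit 2-column composite index.
ALTER TABLE tblgpslog
  ADD INDEX `PrimaryAssetID_GPSLogID` (`PrimaryAssetID`, `GPSLogID`);

-- Re-check the plan: the aim is a single range scan on the new index
-- rather than "Using intersect(PrimaryAssetID,PRIMARY)".
EXPLAIN
SELECT GPSLogID, DateStamp, Status, Location, Latitude, Longitude,
       GPSFix, Speed, Heading, LifeOdometer, BatteryLevel, Ignition, Extra
FROM tblgpslog
WHERE PrimaryAssetID = 183
  AND GPSLogID > 147694199
ORDER BY DateStamp ASC
LIMIT 100;

-- How many rows belong to this asset?
SELECT COUNT(*) FROM tblgpslog WHERE PrimaryAssetID = 183;
```

The ORDER BY DateStamp would still need a filesort with this index, but only over the rows that pass both filters.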
Are you purging 'old' data from this log? If so, the optimal way involves PARTITIONing; see this.
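To give a feel for what partition-based purging looks like, here is a heavily simplified sketch on a hypothetical table, tblgpslog_part, with only a few of the real columns. The table name, partition names and dates are all invented for illustration. Two constraints shape it: MySQL requires the partitioning column to be part of every unique key, so the primary key becomes (GPSLogID, DateStamp), and primary key columns must be NOT NULL.

```sql
-- Hypothetical, cut-down version of the log table, partitioned by month.
CREATE TABLE tblgpslog_part (
  GPSLogID       BIGINT   NOT NULL AUTO_INCREMENT,
  PrimaryAssetID BIGINT   NULL DEFAULT NULL,
  DateStamp      DATETIME NOT NULL,
  PRIMARY KEY (GPSLogID, DateStamp),
  INDEX PrimaryAssetID_GPSLogID (PrimaryAssetID, GPSLogID)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(DateStamp)) (
  PARTITION p2024_01 VALUES LESS THAN (TO_DAYS('2024-02-01')),
  PARTITION p2024_02 VALUES LESS THAN (TO_DAYS('2024-03-01')),
  PARTITION pfuture  VALUES LESS THAN MAXVALUE
);

-- Purging a month is then a quick metadata operation
-- instead of a large DELETE.
ALTER TABLE tblgpslog_part DROP PARTITION p2024_01;
```

Dropping a partition avoids the row-by-row work of the weekly DELETE. Converting the existing table would need its own plan, since the current unique index on GPSLogID alone is not compatible with this partitioning scheme.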