MySQL trigger causing deadlock resolved with Lock Tables
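I have been battling a MySQL deadlock issue for a while now. We have a lot of tables with logging data, and after-insert triggers on them that extract per-minute statistics/summary data and save it to a separate summary table.
Obviously, this means that several of these inserts will touch the same row in the summary table. But since nothing waits for the outcome of an insert before continuing, I would not expect this to cause a deadlock. The inserts are done in batches, using a bulk insert every few milliseconds. They may come from different applications at the same time.
Since these bulk insert statements are never part of a larger transaction, I don't quite understand why they lead to deadlocks. If someone can explain why this happens, it would be much appreciated! In the error log, I only see multiple lines like: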
RECORD LOCKS space id 118597 page no 67 n bits 80 index PRIMARY of table `logschema`.`table_summary_stats` /* Partition `p_2020_11_02` */ trx id 7600352476 lock_mode X locks rec but not gap
Record lock, heap no 11 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
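Now it seems I have finally managed to get rid of the deadlocks by manually locking the table with a "lock tables" statement before executing the batch insert.
I know that table-level locks on InnoDB tables are very much frowned upon, but since I added this table lock I have not seen a single deadlock.
Is a table-level lock expected to resolve a deadlock issue like this? Is it an acceptable way to solve this kind of problem, or should table locks be avoided at all costs with InnoDB tables?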
Edit: The summary table looks as follows:
CREATE TABLE `table_summary_stats` (
`id` bigint DEFAULT NULL,
`DateAndTime` datetime NOT NULL,
`address` varchar(45) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`group` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`result` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`count` int DEFAULT NULL,
PRIMARY KEY (`DateAndTime`,`group`,`result`,`address`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
/*!50100 PARTITION BY RANGE (to_days(`DateAndTime`))
(PARTITION p_2020_10_26 VALUES LESS THAN (738090) ENGINE = InnoDB,
PARTITION p_2020_11_10 VALUES LESS THAN (738105) ENGINE = InnoDB,
PARTITION overflow VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;
The trigger does this:
INSERT INTO table_summary_stats
SET
DateAndTime = date_format(from_unixtime(NEW.appEpochMilli/1000), '%Y-%m-%d %H:%i:00'),
address = NEW.address,
`group` = NEW.`group`,
result = NEW.result,
count = 1
on duplicate key
update
count = count + 1
And the relevant deadlock information is as follows:
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-11-02 20:00:53 0x7f0cc032a700
*** (1) TRANSACTION:
TRANSACTION 7600352761, ACTIVE 0 sec inserting
mysql tables in use 2, locked 2
LOCK WAIT 4 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 3
MySQL thread id 874850, OS thread handle 139654885635840, query id 3299800570 10.15.0.91 cdrwriter update
INSERT INTO table_summary_stats
SET
DateAndTime = date_format(from_unixtime(NEW.appEpochMilli/1000), '%Y-%m-%d %H:%i:00'),
address = NEW.address,
group = NEW.group,
result = NEW.result,
count = 1
on duplicate key
update
count = count + 1
*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 118597 page no 67 n bits 80 index PRIMARY of table `sms_cdr`.`table_summary_stats` /* Partition `p_2020_11_02` */ trx id 7600352761 lock_mode X locks rec but not gap
Record lock, heap no 10 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
0: len 5; hex 99a7c53ec0; asc > ;;
1: len 4; hex 74657374; asc test;;
2: len 30; hex 7b0a202022737461747573223a20226572726f72222c0a202022636f6465; asc { "status": "error", "code; (total 76 bytes);
3: len 11; hex 3933373931303130353131; asc 93791010511;;
4: len 6; hex 0001c5042df9; asc - ;;
5: len 7; hex 01000053520238; asc SR 8;;
6: SQL NULL;
7: len 4; hex 80057c22; asc |";;
8: len 8; hex 80000000642f4d05; asc d/M ;;
9: len 8; hex 8000000000c03473; asc 4s;;
10: len 8; hex 800000001a7e7aee; asc ~z ;;
11: len 8; hex 8000000000f2b5b1; asc ;;
12: len 8; hex 800000008060b217; asc ` ;;
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 118597 page no 67 n bits 80 index PRIMARY of table `sms_cdr`.`table_summary_stats` /* Partition `p_2020_11_02` */ trx id 7600352761 lock_mode X locks rec but not gap waiting
Record lock, heap no 11 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
0: len 5; hex 99a7c54000; asc @ ;;
1: len 4; hex 74657374; asc test;;
2: len 30; hex 7b0a202022737461747573223a20226572726f72222c0a202022636f6465; asc { "status": "error", "code; (total 76 bytes);
3: len 11; hex 3933373931303130353131; asc 93791010511;;
4: len 6; hex 0001c5042cdc; asc , ;;
5: len 7; hex 02000004ea07ff; asc ;;
6: SQL NULL;
7: len 4; hex 8003095b; asc [;;
8: len 8; hex 8000000036a3a0bb; asc 6 ;;
9: len 8; hex 8000000000785507; asc xU ;;
10: len 8; hex 800000000e23089a; asc # ;;
11: len 8; hex 80000000008c8e08; asc ;;
12: len 8; hex 8000000045cb8c64; asc E d;;
*** (2) TRANSACTION:
TRANSACTION 7600352476, ACTIVE 0 sec inserting
mysql tables in use 2, locked 2
LOCK WAIT 4 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 75
MySQL thread id 874775, OS thread handle 139672774735616, query id 3299800787 10.15.0.90 cdrwriter update
INSERT INTO table_summary_stats
SET
DateAndTime = date_format(from_unixtime(NEW.appEpochMilli/1000), '%Y-%m-%d %H:%i:00'),
address = NEW.address,
group = NEW.group,
result = NEW.result,
count = 1
on duplicate key
update
count = count + 1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 118597 page no 67 n bits 80 index PRIMARY of table `sms_cdr`.`table_summary_stats` /* Partition `p_2020_11_02` */ trx id 7600352476 lock_mode X locks rec but not gap
Record lock, heap no 11 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
0: len 5; hex 99a7c54000; asc @ ;;
1: len 4; hex 74657374; asc test;;
2: len 30; hex 7b0a202022737461747573223a20226572726f72222c0a202022636f6465; asc { "status": "error", "code; (total 76 bytes);
3: len 11; hex 3933373931303130353131; asc 93791010511;;
4: len 6; hex 0001c5042cdc; asc , ;;
5: len 7; hex 02000004ea07ff; asc ;;
6: SQL NULL;
7: len 4; hex 8003095b; asc [;;
8: len 8; hex 8000000036a3a0bb; asc 6 ;;
9: len 8; hex 8000000000785507; asc xU ;;
10: len 8; hex 800000000e23089a; asc # ;;
11: len 8; hex 80000000008c8e08; asc ;;
12: len 8; hex 8000000045cb8c64; asc E d;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 118597 page no 67 n bits 80 index PRIMARY of table `sms_cdr`.`table_summary_stats` /* Partition `p_2020_11_02` */ trx id 7600352476 lock_mode X locks rec but not gap waiting
Record lock, heap no 10 PHYSICAL RECORD: n_fields 13; compact format; info bits 0
0: len 5; hex 99a7c53ec0; asc > ;;
1: len 4; hex 74657374; asc test;;
2: len 30; hex 7b0a202022737461747573223a20226572726f72222c0a202022636f6465; asc { "status": "error", "code; (total 76 bytes);
3: len 11; hex 3933373931303130353131; asc 93791010511;;
4: len 6; hex 0001c5042df9; asc - ;;
5: len 7; hex 01000053520238; asc SR 8;;
6: SQL NULL;
7: len 4; hex 80057c22; asc |";;
8: len 8; hex 80000000642f4d05; asc d/M ;;
9: len 8; hex 8000000000c03473; asc 4s;;
10: len 8; hex 800000001a7e7aee; asc ~z ;;
11: len 8; hex 8000000000f2b5b1; asc ;;
12: len 8; hex 800000008060b217; asc ` ;;
*** WE ROLL BACK TRANSACTION (1)
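"The Inserts are done in batches" -- Sort each batch by the 4-column PK. This should eliminate many of the deadlocks and turn the rest into "lock waits". (That is, where there would have been a deadlock, one connection can simply wait for the other to finish.)
Also, if practical, limit the batches to 100 rows.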
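The sorting and chunking might look roughly like the following on the application side. This is only a sketch in Python: the LogRow type and the sorted_batches helper are made-up names, and casefold() merely approximates the case-insensitive utf8_general_ci ordering of the real PK. What matters most is that every writer uses the same ordering.

```python
from typing import Iterable, Iterator, List, NamedTuple


class LogRow(NamedTuple):
    """Hypothetical shape of one log row; only the fields the trigger reads."""
    app_epoch_milli: int
    address: str
    group: str
    result: str


def summary_key(row: LogRow) -> tuple:
    """Approximate the summary PK order: (DateAndTime, group, result, address).

    casefold() is only an approximation of the utf8_general_ci collation.
    """
    minute = row.app_epoch_milli // 60_000  # same bucket as the '%H:%i:00' truncation
    return (minute, row.group.casefold(), row.result.casefold(), row.address.casefold())


def sorted_batches(rows: Iterable[LogRow], batch_size: int = 100) -> Iterator[List[LogRow]]:
    """Yield batches of at most batch_size rows, ordered by the summary key."""
    ordered = sorted(rows, key=summary_key)
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]
```

Each yielded batch would then go out as one bulk INSERT, so every connection touches the summary rows in the same direction.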
Starting the PRIMARY KEY with the partition key is almost always useless.
(I agree that you should try to avoid LOCK TABLES.)
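Explanation
A classic deadlock is:
I grab row number 1, you grab row 2, then I reach for row 2 (but can't get it) and you reach for row 1 (and can't get it). Neither of us is willing to let go of what we have.
So a referee steps in and forces one of us to give up what he holds (a ROLLBACK), letting the other run to completion.
That is what the log above shows: each transaction holds one summary row (heap no 10 or heap no 11) and waits for the other's. Each batch INSERT is its own transaction even without an explicit BEGIN, and the trigger's upserts run inside it, so the row locks are held until the whole batch statement finishes.
It is impossible (or impractical) for me (or you) to grab all the needed rows up front, so the rows are actually grabbed one at a time. Think of a giant UPDATE that changes millions of rows. It would be unwise to stop everything while I grab all of those rows.
This is called "optimistic" -- the processing assumes it will succeed and plows ahead. And 99.999...% of the time a typical transaction finishes before any other connection conflicts with it.
If we grab the rows in the same "order" (e.g. PRIMARY KEY order), one of us can finish and the other can simply wait. If the wait is only a few milliseconds, the delay is imperceptible. (Limiting the batch size helps here.)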
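To make the ordering argument concrete, here is a small Python sketch that checks whether two transactions take any pair of rows in opposite order. The function name is made up, and the check is a necessary condition for the classic two-party deadlock, not a model of everything InnoDB does. The row labels in the usage note are the heap numbers from the deadlock log.

```python
from itertools import combinations
from typing import Dict, Hashable, Sequence


def has_opposite_lock_order(order_a: Sequence[Hashable], order_b: Sequence[Hashable]) -> bool:
    """Return True if some pair of rows is locked in opposite order.

    Each sequence lists the rows one transaction locks, in acquisition order,
    with every lock held until the transaction ends.
    """
    first_pos_b: Dict[Hashable, int] = {}
    for index, row in enumerate(order_b):
        first_pos_b.setdefault(row, index)  # only the first acquisition matters
    # Rows both transactions lock, in the order transaction A first takes them.
    shared = list(dict.fromkeys(row for row in order_a if row in first_pos_b))
    # A takes x before y; a cycle needs B to take y before x.
    return any(first_pos_b[x] > first_pos_b[y] for x, y in combinations(shared, 2))
```

With the log's rows, has_opposite_lock_order([10, 11], [11, 10]) is True, while has_opposite_lock_order([10, 11], [10, 11]) is False: once both sides lock in the same order, the later one just waits.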
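Better?
Possibly: get rid of the trigger and simply run two batch statements -- one being the original batch INSERT, the other a batch upsert (aka IODKU, INSERT ... ON DUPLICATE KEY UPDATE) into the summary table.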
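A rough Python sketch of building that second statement follows. It pre-aggregates the batch so each summary row appears once, and emits the rows in PK column order. The helper name and the %s placeholder style (used by several Python MySQL drivers) are assumptions. FROM_UNIXTIME on a minute-aligned epoch is meant to mirror the trigger's truncation, assuming the session time zone offset is a whole number of minutes. VALUES() in the UPDATE clause still works in MySQL 8.0 but is deprecated in later 8.0 releases in favor of a row alias.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# (appEpochMilli, address, group, result) taken from each logged row.
LogFields = Tuple[int, str, str, str]


def build_summary_upsert(rows: Iterable[LogFields]) -> Tuple[str, List[object]]:
    """Collapse one batch into a single multi-row IODKU on table_summary_stats."""
    counts: Counter = Counter()
    for epoch_ms, address, group, result in rows:
        minute_start = (epoch_ms // 60_000) * 60  # epoch seconds at second :00
        counts[(minute_start, group, result, address)] += 1
    if not counts:
        raise ValueError("empty batch")

    placeholders = ", ".join(["(FROM_UNIXTIME(%s), %s, %s, %s, %s)"] * len(counts))
    sql = (
        "INSERT INTO table_summary_stats "
        "(DateAndTime, `group`, result, address, `count`) "
        f"VALUES {placeholders} "
        "ON DUPLICATE KEY UPDATE `count` = `count` + VALUES(`count`)"
    )
    params: List[object] = []
    for key in sorted(counts):  # same column order as the PRIMARY KEY
        params.extend(key)
        params.append(counts[key])
    return sql, params
```

The returned pair would be passed to the driver's execute call in the same transaction as the raw batch INSERT. A batch of 100 log rows often collapses to a handful of summary rows, so far fewer row locks are taken.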
In any case, catch errors in the transaction and replay the entire transaction.
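A minimal replay loop could look like this in Python. It is a sketch under assumptions: run_with_retry and is_deadlock are made-up helpers, and where the error number lives depends on the driver (some expose an errno attribute, others put the number in args[0]). Error 1213 is MySQL's deadlock error. InnoDB rolls back the whole victim transaction, so the callable has to redo everything from the start.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

ER_LOCK_DEADLOCK = 1213  # MySQL: "Deadlock found when trying to get lock"


def is_deadlock(exc: BaseException) -> bool:
    """Best-effort check; the location of the error number is driver-specific."""
    errno = getattr(exc, "errno", None)
    if errno is None and exc.args:
        errno = exc.args[0]
    return errno == ER_LOCK_DEADLOCK


def run_with_retry(transaction: Callable[[], T], max_attempts: int = 3,
                   base_delay: float = 0.05) -> T:
    """Call transaction(); on a deadlock error, wait briefly and replay it.

    transaction must do the whole unit of work (begin, statements, commit).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except Exception as exc:
            if not is_deadlock(exc) or attempt == max_attempts:
                raise
            time.sleep(base_delay * attempt)  # short, growing pause before replay
    raise ValueError("max_attempts must be at least 1")
```

Other errors, and a deadlock on the last attempt, are re-raised unchanged so the caller still sees them.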
More discussion of high-speed ingestion: http://mysql.rjweb.org/doc.php/staging_table (While not directly applicable, you may find some relevant tips.)