MySQL: A deadlock occurred when deleting the same row

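I recently hit a deadlock when deleting records (note that the isolation level is REPEATABLE READ, on MySQL 5.7).

Here are the steps to reproduce it.

1 Create a new table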

CREATE TABLE `t` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(32) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `p_name` (`name`)
) ENGINE=InnoDB CHARSET=utf8;

2 Prepare 3 records

insert into t (name) value ('A'), ('C'), ('D');

3

+====================================+============================================================+
|             Session A              |                         Session B                          |
+====================================+============================================================+
| begin;                             |                                                            |
+------------------------------------+------------------------------------------------------------+
|                                    | begin;                                                     |
+------------------------------------+------------------------------------------------------------+
| delete from t where name = 'C';    |                                                            |
+------------------------------------+------------------------------------------------------------+
|                                    | delete from t where name = 'C';  --Blocked!                |
+------------------------------------+------------------------------------------------------------+
| insert into t (name) values ('B'); |                                                            |
+------------------------------------+------------------------------------------------------------+
|                                    | ERROR 1213 (40001): Deadlock found when trying to get lock |
+------------------------------------+------------------------------------------------------------+

The output of show engine innodb status is shown below (LATEST DETECTED DEADLOCK section):

LATEST DETECTED DEADLOCK
------------------------
*** (1) TRANSACTION:
TRANSACTION 3631, ACTIVE 21 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 13, OS thread handle 123145439432704, query id 306 localhost root updating
delete from t where name = 'C'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 69 page no 4 n bits 72 index p_name of table `jacky`.`t` trx id 3631 lock_mode X waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 1; hex 43; asc C;;
 1: len 8; hex 8000000000000018; asc         ;;

*** (2) TRANSACTION:
TRANSACTION 3630, ACTIVE 29 sec inserting
mysql tables in use 1, locked 1
5 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 2
MySQL thread id 14, OS thread handle 123145439711232, query id 307 localhost root update
insert into t (name) values ('B')
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 69 page no 4 n bits 72 index p_name of table `jacky`.`t` trx id 3630 lock_mode X
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 1; hex 43; asc C;;
 1: len 8; hex 8000000000000018; asc         ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 69 page no 4 n bits 72 index p_name of table `jacky`.`t` trx id 3630 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 1; hex 43; asc C;;
 1: len 8; hex 8000000000000018; asc         ;;

As the InnoDB status shows, session B is waiting for the next-key lock on C, while session A holds a record lock on C and is waiting for the gap lock before C.

As we all know:

DELETE FROM ... WHERE ... sets an exclusive next-key lock on every record the search encounters

A next-key lock is a combination of a record lock on the index record and a gap lock on the gap before the index record.

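Q1: My guess is that session B first acquires the gap lock (part of the next-key lock) and then waits for the record lock. The later insert in session A is therefore blocked by session B (because of the gap lock), which finally causes the deadlock. Is that right?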

Q2: Since C is purged from the index, should the gap lock held by session B be ('A', 'D')? If so, why is session A waiting for an insert intention lock on the range (, 'C')?

Q3: Why does session B have 1 row lock(s), while session A has 4 row lock(s)?


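Q4: After changing the index p_name to a unique index, the deadlock still occurs because of the gap lock, which is strange. This differs from the official documentation, which states that only a record lock is required: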

DELETE FROM ... WHERE ... sets an exclusive next-key lock on every record the search encounters. However, only an index record lock is required for statements that lock rows using a unique index to search for a unique row.


However, deleting by the primary key id works fine (steps shown below). Is this a bug in MySQL?

1 Prepare data

delete from t;
insert into t (id, name) value (1, 'A'), (3, 'C'), (5, 'D');

2

+-------------------------------------------+--------------------------------------+
|                 Session A                 |              Session B               |
+-------------------------------------------+--------------------------------------+
| begin;                                    |                                      |
|                                           | begin;                               |
| delete from t where id = 3;               |                                      |
|                                           | delete from t where id = 3; Blocked! |
| insert into t (id, name) values (2, 'B'); |                                      |
|                                           |                                      |
| commit;                                   |                                      |
+-------------------------------------------+--------------------------------------+

From the "WAITING FOR THIS LOCK TO BE GRANTED" section of transaction 3631, we can see:

RECORD LOCKS space id 69 page no 4 n bits 72 index p_name of table `jacky`.`t` trx id 3631 lock_mode X waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
  1. 3631 is waiting for a record lock. The corresponding index entry is {"name":"C", "id": 24}.
  2. The index name is p_name in table t.
  3. The lock mode is "lock_mode X".
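
As an aside on how the id in item 1 can be read off the log: the second field of the secondary index record is the primary key, printed as hex 8000000000000018. InnoDB stores signed integers with the sign bit flipped so that they sort correctly as raw bytes, so flipping the top bit back gives the value. A minimal sketch, assuming a MySQL session is at hand (the alias decoded_id is only illustrative):

```sql
-- Decode the primary key bytes printed in the deadlock log.
-- CAST(... AS UNSIGNED) forces numeric treatment of the hex literals,
-- then XOR with the top bit undoes InnoDB's sign-bit flip.
SELECT CAST(0x8000000000000018 AS UNSIGNED)
       ^ CAST(0x8000000000000000 AS UNSIGNED) AS decoded_id;
-- decoded_id = 24
```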

From the "HOLDS THE LOCK(S)" and "WAITING FOR THIS LOCK TO BE GRANTED" sections of transaction 3630, we can see:

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 69 page no 4 n bits 72 index p_name of table `jacky`.`t` trx id 3630 lock_mode X
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 1; hex 43; asc C;;
 1: len 8; hex 8000000000000018; asc         ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 69 page no 4 n bits 72 index p_name of table `jacky`.`t` trx id 3630 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 1; hex 43; asc C;;
 1: len 8; hex 8000000000000018; asc         ;;
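  1. 3630 is waiting for a lock on the record whose index entry is {"name":"C", "id": 24}. The mode of the awaited lock is "lock_mode X locks gap before rec insert intention".
  2. 3630 holds a record lock. The corresponding index entry is {"name":"C", "id": 24}. The mode of the held lock is "lock_mode X".
  3. The index name is p_name in table t.
  4. The deadlock is triggered by executing "insert into t (name) values ('B')".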

According to your reproduction steps, session A first sends delete from t where name = 'C';, which locks:

  1. ('A', 'C'] and ('C', 'D'): next-key lock 'C' and gap lock before 'D';

DELETE FROM ... WHERE ... sets an exclusive next-key lock on every record the search encounters. However, only an index record lock is required for statements that lock rows using a unique index to search for a unique row.

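  2. A record lock on the primary index entry (id) corresponding to 'C'. The id value here should be 24, matching hex 8000000000000018 in the log.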

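Then session B starts and also executes delete from t where name = 'C';. Because session A has not committed yet, 'C' is still locked by session A. When the delete runs, session B tries to acquire locks in the following order:

  1. Gap lock before 'C': succeeds, because InnoDB allows multiple gap locks on the same gap.
  2. Record lock on 'C': blocked, because session A already holds that lock. Session B must wait for session A to release it.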
  3. Gap lock before 'D': not reached yet, because step 2 is still blocked.

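Finally, session A sends insert into t (name) values ('B');. Table t has 2 indexes, id and name. id is an auto-increment integer primary key, and for name this statement tries to acquire an insert intention lock. However, session B holds a gap lock there, so session A must wait for session B to release it. Now we can see how the deadlock happens: session B waits for session A's record lock, and session A waits for session B's gap lock. InnoDB chooses one session to roll back based on cost. Here session B is rolled back.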
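
The first half of this cycle can be observed before the deadlock fires. The following is a sketch only, assuming MySQL 5.7, where information_schema.innodb_lock_waits and information_schema.innodb_trx are available (these lock tables were removed in MySQL 8.0). It is meant to be run from a third session after session B's delete is blocked and before session A issues its insert:

```sql
-- Show which transaction is waiting and which one blocks it (MySQL 5.7).
SELECT
  r.trx_id    AS waiting_trx_id,
  r.trx_query AS waiting_query,
  b.trx_id    AS blocking_trx_id,
  b.trx_query AS blocking_query   -- NULL while the blocker is idle
FROM information_schema.innodb_lock_waits w
JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;
```

At that point the result should list session B's delete as the waiting query and session A's transaction as the blocker.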

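For Q1, the answer is yes. For Q2, the deleted record is actually not purged from the index before its transaction commits. For Q3, the number of row locks equals trx_rows_locked, which the MySQL documentation describes as: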

TRX_ROWS_LOCKED

The approximate number or rows locked by this transaction. The value might include delete-marked rows that are physically present but not visible to the transaction.

From this article, we know:

  1. For non-clustered unique index filtering, due to the need to return tables, the number of filtered rows is locked as the unique index plus the number of returned rows.

  2. For non-clustered non-unique index filtering, the gap lock is involved, so more records are locked.

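So trx_rows_locked (gap lock + next-key lock + lookup back to the clustered index) is 3 in session A after the delete, and the final trx_rows_locked value should be 3 + 1 (the insert's key lock) after the insert attempt.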
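
To check these counts yourself, a minimal sketch (assuming the two sessions from the question are still open and the account can read information_schema) is to query innodb_trx from a third session after each step:

```sql
-- One row per active InnoDB transaction; trx_rows_locked is approximate.
SELECT trx_id,
       trx_state,
       trx_lock_structs,
       trx_rows_locked,
       trx_query
FROM information_schema.innodb_trx
ORDER BY trx_started;
```

Running it once after session A's delete and again while session A's insert is waiting shows how the counter changes.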


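The following is an update for the newly added question: I had not noticed the part about deleting by primary key and by unique secondary key before.

After some investigation, I found that:

  1. When deleting by primary key a row that has already been deleted but not yet committed, the new delete only needs a record lock instead of a next-key lock.
  2. When deleting by a unique secondary key a row that has already been deleted but not yet committed, the new delete needs a next-key lock.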

You can use set GLOBAL innodb_status_output_locks=ON; show engine innodb status to see the detailed lock status of running transactions.
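
Spelled out as a sequence, this could look like the sketch below. It assumes the table t from the question and an account that is allowed to change global variables. The extra lock details appear in the TRANSACTIONS section of the status output.

```sql
-- Include per-transaction lock details in SHOW ENGINE INNODB STATUS.
SET GLOBAL innodb_status_output_locks = ON;

BEGIN;
DELETE FROM t WHERE name = 'C';

-- Inspect the locks held by the open transaction.
SHOW ENGINE INNODB STATUS;

ROLLBACK;

-- Restore the default once done.
SET GLOBAL innodb_status_output_locks = OFF;
```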

For Q4, I finally found a comment in the MySQL 5.7 source code that explains why a next-key lock is used, for your reference.

In a search where at most one record in the index may match, we can use a LOCK_REC_NOT_GAP type record lock when locking a non-delete-marked matching record.

Note that in a unique secondary index there may be different delete-marked versions of a record where only the primary key values differ: thus in a secondary index we must use next-key locks when locking delete-marked records

Note above that a UNIQUE secondary index can contain many rows with the same key value if one of the columns is the SQL null. A clustered index under MySQL can never contain null columns because we demand that all the columns in primary key are non-null.
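
The last point of that comment is easy to confirm. A small sketch, using a hypothetical table t_null_demo that is not part of the original question:

```sql
-- Hypothetical demo table: a nullable column with a UNIQUE secondary index.
CREATE TABLE t_null_demo (
  id bigint(20) NOT NULL AUTO_INCREMENT,
  name varchar(32) NULL,
  PRIMARY KEY (id),
  UNIQUE KEY u_name (name)
) ENGINE=InnoDB;

-- Both rows are accepted: NULL never collides with NULL in a unique index.
INSERT INTO t_null_demo (name) VALUES (NULL), (NULL);
SELECT COUNT(*) FROM t_null_demo WHERE name IS NULL;  -- 2

-- Non-NULL duplicates are still rejected (ERROR 1062, duplicate entry).
INSERT INTO t_null_demo (name) VALUES ('C'), ('C');
```

This is why "at most one record may match" cannot be assumed for every lookup on a unique secondary index.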