Deadlock involving SELECT FOR UPDATE
I have a transaction with several queries. First, I select rows with a FOR UPDATE lock:
SELECT f.source_id FROM files AS f WHERE
f.component_id = AND
f.archived_at IS NULL
FOR UPDATE
Next comes an update query:
UPDATE files AS f SET archived_at = NOW()
WHERE
hw_component_id = AND
f.source_id = ANY(::text[])
Then there is an insert:
INSERT INTO files AS f (
source_id,
...
)
VALUES (..)
ON CONFLICT (component_id, source_id) DO UPDATE
SET archived_at = null,
is_valid = excluded.is_valid
I have two application instances, and sometimes I see deadlock errors in the PostgreSQL log:
ERROR: deadlock detected
DETAIL: Process 3992939 waits for ShareLock on transaction 230221362; blocked by process 4108096.
Process 4108096 waits for ShareLock on transaction 230221365; blocked by process 3992939.
Process 3992939: SELECT f.source_id FROM files AS f WHERE f.component_id = AND f.archived_at IS NULL FOR UPDATE
Process 4108096: INSERT INTO files AS f (source_id, ...) VALUES (..) ON CONFLICT (component_id, source_id) DO UPDATE SET archived_at = null, is_valid = excluded.is_valid
CONTEXT: while locking tuple (41116,185) in relation "files"
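I assume that it may be caused by ON CONFLICT DO UPDATE statement, which may update rows which are not locked by previous SELECT FOR UPDATE.
But I don't understand how the SELECT ... FOR UPDATE query can cause a deadlock if it is the first query in the transaction. There are no queries before it.
Can SELECT ... FOR UPDATE statement lock several rows and then wait for other rows in condition to be unlocked?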
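SELECT FOR UPDATE is no safeguard against deadlocks. It just locks rows. Locks are acquired along the way, in the order instructed by ORDER BY, or in arbitrary order in the absence of ORDER BY. The best defense against deadlocks is to lock rows in a consistent order throughout the transaction, and to do the same in all concurrent transactions. Or, as the manual puts it: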
The best defense against deadlocks is generally to avoid them by being
certain that all applications using a database acquire locks on
multiple objects in a consistent order.
Otherwise, this can happen (row1, row2, ... are rows numbered according to a virtual consistent order):
T1: SELECT FOR UPDATE ... -- lock row2, row3
T2: SELECT FOR UPDATE ... -- lock row4, wait for T1 to release row2
T1: INSERT ... ON CONFLICT ... -- wait for T2 to release lock on row4
--> deadlock
Adding ORDER BY to your SELECT ... FOR UPDATE may already avoid your deadlocks. (It would avoid the one demonstrated above.) Or this happens, and you have to do more:
T1: SELECT FOR UPDATE ... -- lock row2, row3
T2: SELECT FOR UPDATE ... -- lock row1, wait for T1 to release row2
T1: INSERT ... ON CONFLICT ... -- wait for T2 to release lock on row1
--> deadlock
Everything in the transaction must happen in a consistent order to be absolutely sure.
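The ordering rule can be sketched for the locking SELECT and the UPSERT from the question. This is only an illustration under assumptions: the minimal table definition, the literal component_id 42 and the source_id values are made up, and source_id is picked as the sort key only because it is part of the unique constraint named in ON CONFLICT. Rows from an INSERT ... SELECT ... ORDER BY are processed in the order the subquery delivers them in practice, but treat that as likely behavior rather than a documented guarantee.

```sql
-- Hypothetical minimal table; the real one certainly has more columns.
CREATE TABLE IF NOT EXISTS files (
    component_id int         NOT NULL,
    source_id    text        NOT NULL,
    archived_at  timestamptz,
    is_valid     boolean     NOT NULL DEFAULT true,
    UNIQUE (component_id, source_id)
);

BEGIN;

-- 1. Take row locks in a deterministic order.
SELECT f.source_id
FROM   files AS f
WHERE  f.component_id = 42        -- made-up sample value
AND    f.archived_at IS NULL
ORDER  BY f.source_id
FOR    UPDATE;

-- (the UPDATE from the question would run here)

-- 2. Feed the UPSERT rows sorted by the same key, so that any rows it has to
--    lock on conflict are visited in the same order as in step 1.
INSERT INTO files AS f (component_id, source_id, is_valid)
SELECT v.component_id, v.source_id, v.is_valid
FROM  (VALUES (42, 'a', true)
            , (42, 'c', true)) AS v(component_id, source_id, is_valid)
ORDER  BY v.source_id
ON     CONFLICT (component_id, source_id) DO UPDATE
SET    archived_at = NULL
     , is_valid    = excluded.is_valid;

COMMIT;
```

Both application instances would have to use the same sort key for this to help.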
Also, your UPDATE does not seem to match the SELECT FOR UPDATE: component_id <> hw_component_id. Typo?
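Also, f.archived_at IS NULL in the SELECT does not guarantee that the later SET archived_at = NOW() only affects these rows. You would have to add f.archived_at IS NULL to the WHERE clause of the UPDATE to stay in line. (Seems like a good idea in any case?)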
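A sketch of the UPDATE with both remarks applied. It assumes that component_id is the intended column name (the question does not confirm which of the two names is right) and uses made-up literal values in place of the parameters:

```sql
UPDATE files AS f
SET    archived_at = now()
WHERE  f.component_id = 42                   -- assumed: same column as in the SELECT
AND    f.source_id = ANY ('{a,b}'::text[])   -- made-up sample values
AND    f.archived_at IS NULL;                -- skip rows that are already archived
```

With the extra predicate, the UPDATE can only touch rows that also qualified for the preceding SELECT ... FOR UPDATE, provided the list of source_id values was taken from that SELECT.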
I assume that it may be caused by ON CONFLICT DO UPDATE statement, which may update rows which are not locked by previous SELECT FOR UPDATE.
As long as the UPSERT (ON CONFLICT DO UPDATE) sticks to the consistent order, that wouldn't be a problem. But that may be hard or impossible to enforce.
Can SELECT ... FOR UPDATE statement lock several rows and then wait for other rows in condition to be unlocked?
Yes, as explained above, locks are acquired along the way. The statement may have to stop and wait halfway through.
NOWAIT
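If all of that still can't resolve your deadlocks, the slow and sure method is to use the Serializable Isolation Level. Then you have to be prepared for serialization failures and retry the transaction in that case. Considerably more expensive overall.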
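A minimal sketch of what that looks like on the SQL side, assuming the files table from the question and made-up literal values. The retry itself has to live in the application, which is not shown here:

```sql
BEGIN ISOLATION LEVEL SERIALIZABLE;

UPDATE files AS f
SET    archived_at = now()
WHERE  f.component_id = 42                   -- made-up sample value
AND    f.source_id = ANY ('{a,b}'::text[])   -- made-up sample values
AND    f.archived_at IS NULL;

INSERT INTO files AS f (component_id, source_id, is_valid)
VALUES (42, 'c', true)                       -- made-up sample row
ON     CONFLICT (component_id, source_id) DO UPDATE
SET    archived_at = NULL
     , is_valid    = excluded.is_valid;

COMMIT;

-- If any statement above, including COMMIT, fails with SQLSTATE 40001
-- (serialization_failure), issue ROLLBACK and run the whole transaction
-- again from BEGIN. Retrying on 40P01 (deadlock_detected) in the same way
-- is a reasonable addition.
```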
Or it might be enough to add NOWAIT:
SELECT FROM files
WHERE component_id =
AND archived_at IS NULL
ORDER BY id -- whatever you use for consistent, deterministic order
FOR UPDATE NOWAIT;
With NOWAIT, the statement reports an error, rather than waiting, if a selected row cannot be locked immediately.
You may even skip the ORDER BY clause with NOWAIT if you cannot establish a consistent order with the UPSERT anyway.
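Then you have to catch that error and retry the transaction. Similar to catching serialization failures, but cheaper, and less reliable. For example, multiple transactions can still interlock with their UPSERT alone. But it gets less and less likely.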
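Catching the error is normally done in the application. For illustration only, the same idea can be sketched inside the database with a PL/pgSQL DO block. The retry limit of 5, the back-off delay, the sample value 42 and the choice of source_id as sort key are all made up, and the loop retries a subtransaction inside one outer transaction rather than the whole transaction:

```sql
DO $$
BEGIN
   FOR attempt IN 1..5 LOOP                  -- hypothetical retry limit
      BEGIN
         -- Try to lock all qualifying rows; fail at once if one is taken.
         PERFORM 1
         FROM   files
         WHERE  component_id = 42            -- made-up sample value
         AND    archived_at IS NULL
         ORDER  BY source_id
         FOR    UPDATE NOWAIT;

         UPDATE files
         SET    archived_at = now()
         WHERE  component_id = 42
         AND    archived_at IS NULL;

         RETURN;                             -- success: leave the DO block
      EXCEPTION WHEN lock_not_available THEN
         -- SQLSTATE 55P03. The inner block is rolled back, which also
         -- releases row locks taken during this attempt.
         PERFORM pg_sleep(0.05 * attempt);   -- made-up back-off
      END;
   END LOOP;

   RAISE EXCEPTION 'could not lock rows after 5 attempts';
END
$$;
```

An application-side retry would follow the same shape: on lock_not_available, roll back, wait briefly, and start the transaction again.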