How can I avoid or minimize deadlocks in this situation?
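I have a relatively small table (for now). It works as a fancy queue. Jobs that execute every second ask this table for more work, and whenever work completes they tell the table that it is done.
The table has about 1,000 entries and is expected to reach 100k+ rows in the long term.
Each entry represents a job that needs to run once a minute. The table is hosted in SQL Azure (S2 plan).
The job starter executes a stored procedure that requests work from this table. Basically, the proc looks at the table, finds the tasks that are not in progress and are overdue, marks them as "in progress", and returns them to the job starter.
When a task completes, a simple update marks it as done, which makes it available for another work cycle a minute later (a field called Frequency controls this).
PROBLEM: I frequently get deadlocks when I ask the table for more work or when I try to mark entries as completed. It looks like the ROWLOCK hint is not working. Do I need an index structure on this table?
Here is the stored procedure that retrieves records (usually up to 20 at a time, controlled by the @count parameter):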
CREATE PROCEDURE [dbo].[sp_GetScheduledItems]
    @activity NVARCHAR (50), @count INT, @timeout INT=300, @dataCenter NVARCHAR (50)
AS
BEGIN
    SET NOCOUNT ON;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    DECLARE @batchId uniqueidentifier
    SELECT @batchId = NEWID()
    DECLARE @result int;
    DECLARE @process nvarchar(255);
    BEGIN TRAN
    -- Update rows
    UPDATE Schedule
    WITH (ROWLOCK)
    SET
        LastBatchId = @batchId,
        LastStartedProcessingId = NEWID(),
        LastStartedProcessingTime = GETUTCDATE()
    WHERE
        ActivityType = @activity AND
        IsEnabled = 1 AND
        ItemId IN (
            SELECT TOP (@count) ItemId
            FROM Schedule
            WHERE
                (LastStartedProcessingId = LastCompletedProcessingId OR LastCompletedProcessingId IS NULL OR DATEDIFF(SECOND, LastStartedProcessingTime, GETUTCDATE()) > @timeout) AND
                IsEnabled = 1 AND ActivityType = @activity AND DataCenter = @dataCenter AND
                (LastStartedProcessingTime IS NULL OR DATEDIFF(SECOND, LastStartedProcessingTime, GETUTCDATE()) > Frequency)
            ORDER BY (DATEDIFF(SECOND, ISNULL(LastStartedProcessingTime, '1/1/2000'), GETUTCDATE()) - Frequency) DESC
        )
    COMMIT TRAN
    -- Return the updated rows
    SELECT ItemId, ParentItemId, ItemName, ParentItemName, DataCenter, LastStartedProcessingId, Frequency, LastProcessTime, ActivityType
    FROM Schedule
    WHERE LastBatchId = @batchId
END
GO
Here is the stored procedure that marks entries as completed (one at a time):
CREATE PROCEDURE [dbo].[sp_CompleteScheduledItem]
    @activity NVARCHAR (50), @itemId UNIQUEIDENTIFIER, @processingId UNIQUEIDENTIFIER, @status NVARCHAR (50), @lastProcessTime DATETIME, @dataCenter NVARCHAR (50)
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    UPDATE Schedule WITH (ROWLOCK)
    SET
        LastCompletedProcessingId = LastStartedProcessingId,
        LastCompletedProcessingTime = GETUTCDATE(),
        LastCompletedStatus = @status,
        LastProcessTime = @lastProcessTime
    WHERE
        ItemId = @itemId AND
        LastStartedProcessingId = @processingId AND
        DataCenter = @dataCenter AND
        ActivityType = @activity
END
GO
Here is the table itself:
CREATE TABLE [dbo].[Schedule](
    [ItemId] [uniqueidentifier] NOT NULL,
    [ParentItemId] [uniqueidentifier] NOT NULL,
    [ActivityType] [nvarchar](50) NOT NULL,
    [Frequency] [int] NOT NULL,
    [LastBatchId] [uniqueidentifier] NULL,
    [LastStartedProcessingId] [uniqueidentifier] NULL,
    [LastStartedProcessingTime] [datetime] NULL,
    [LastCompletedProcessingId] [uniqueidentifier] NULL,
    [LastCompletedProcessingTime] [datetime] NULL,
    [LastCompletedStatus] [nvarchar](50) NULL,
    [IsEnabled] [bit] NOT NULL,
    [LastProcessTime] [datetime] NULL,
    [DataCenter] [nvarchar](50) NOT NULL,
    [ItemName] [nvarchar](255) NOT NULL,
    [ParentItemName] [nvarchar](255) NOT NULL,
    CONSTRAINT [PK_Schedule] PRIMARY KEY CLUSTERED
    (
        [DataCenter] ASC,
        [ItemId] ASC,
        [ActivityType] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
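This is a good question :-) As usual, there are many things you could do, but in your case I think we can simplify your query quite a bit. Note that the suggestion below does not use the SERIALIZABLE isolation level, which in your case is very likely causing table-level locks to prevent phantom reads (and also makes all write access to your table, well, serialized). You also don't actually need to specify BEGIN and COMMIT TRAN, since you issue only one statement inside the transaction (although it doesn't hurt in your case either). This example takes advantage of the fact that we can issue the update directly against a subquery (here in the form of a CTE), and we can also drop the final SELECT, since we can return the result set directly from the UPDATE statement.
HTH,
-Tobias
SQL Server team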
CREATE PROCEDURE [dbo].[sp_GetScheduledItems]
    @activity NVARCHAR (50), @count INT, @timeout INT=300, @dataCenter NVARCHAR (50)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @batchId uniqueidentifier
    SELECT @batchId = NEWID()
    DECLARE @result int;
    DECLARE @process nvarchar(255);
    -- Update rows
    WITH a AS (
        SELECT TOP (@count)
            *
        FROM Schedule
        WHERE
            (LastStartedProcessingId = LastCompletedProcessingId OR LastCompletedProcessingId IS NULL OR DATEDIFF(SECOND, LastStartedProcessingTime, GETUTCDATE()) > @timeout) AND
            IsEnabled = 1 AND ActivityType = @activity AND DataCenter = @dataCenter AND
            (LastStartedProcessingTime IS NULL OR DATEDIFF(SECOND, LastStartedProcessingTime, GETUTCDATE()) > Frequency)
        ORDER BY (DATEDIFF(SECOND, ISNULL(LastStartedProcessingTime, '1/1/2000'), GETUTCDATE()) - Frequency) DESC
    )
    UPDATE a SET
        LastBatchId = @batchId,
        LastStartedProcessingId = NEWID(),
        LastStartedProcessingTime = GETUTCDATE()
    OUTPUT INSERTED.ItemId, INSERTED.ParentItemId, INSERTED.ItemName, INSERTED.ParentItemName, INSERTED.DataCenter, INSERTED.LastStartedProcessingId, INSERTED.Frequency, INSERTED.LastProcessTime, INSERTED.ActivityType
END
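The completion procedure in the question also sets SERIALIZABLE, so the same simplification might help there. What follows is only a sketch, under the assumption that nothing else relies on that isolation level: it is the question's sp_CompleteScheduledItem with the SET TRANSACTION ISOLATION LEVEL line removed, so the update runs at the session's default isolation level. Using ALTER PROCEDURE assumes the procedure already exists, and the predicates are reordered to follow the clustered key (DataCenter, ItemId, ActivityType), which is cosmetic only.

```sql
-- Sketch only: the question's completion procedure without SERIALIZABLE.
-- Assumption: no caller depends on the stricter isolation level.
ALTER PROCEDURE [dbo].[sp_CompleteScheduledItem]
    @activity NVARCHAR (50), @itemId UNIQUEIDENTIFIER, @processingId UNIQUEIDENTIFIER,
    @status NVARCHAR (50), @lastProcessTime DATETIME, @dataCenter NVARCHAR (50)
AS
BEGIN
    SET NOCOUNT ON;
    -- No SET TRANSACTION ISOLATION LEVEL here: the single UPDATE below
    -- runs as its own transaction at the session's default level.
    UPDATE Schedule WITH (ROWLOCK)
    SET
        LastCompletedProcessingId = LastStartedProcessingId,
        LastCompletedProcessingTime = GETUTCDATE(),
        LastCompletedStatus = @status,
        LastProcessTime = @lastProcessTime
    WHERE
        -- The first three predicates cover every column of PK_Schedule.
        DataCenter = @dataCenter AND
        ItemId = @itemId AND
        ActivityType = @activity AND
        LastStartedProcessingId = @processingId;
END
GO
```

Whether this removes the deadlocks entirely would need to be checked against the deadlock graphs for the real workload.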