SQL Server Deadlocks

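I keep running into deadlocks on my production server. I have gone through many links but could not find a solution.

Here is the deadlock trace: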

deadlock-list
 deadlock victim=process500ce08
  process-list
   process id=process500ce08 taskpriority=0 logused=0 waitresource=PAGE: 7:1:3509698 waittime=5074 ownerId=53243840 
   transactionname=implicit_transaction lasttranstarted=2015-10-20T04:00:02.037 XDES=0x2af523970 lockMode=IU schedulerid=6 kpid=7072 
   status=suspended spid=81 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2015-10-20T04:00:02.037 lastbatchcompleted=2015-10-20T04:00:02.033 
   hostname=10.9.52.12 hostpid=0 loginname=prod isolationlevel=read committed (2) xactid=53243840 currentdb=7 lockTimeout=4294967295 
   clientoption1=671088928 clientoption2=128058
    executionStack
     frame procname=adhoc line=1 stmtstart=116 sqlhandle=0x020000008d082701122cdc931bffce58ad37dec1a2f23e9d
UPDATE dbo.InterfaceTable SET ERRORMESSAGE = @P1 , ACK_DATE = @P2  WHERE (SSID = @P3 )     
     frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000
unknown     
    inputbuf
(@P1 nvarchar(4000),@P2 datetimeoffset,@P3 nvarchar(4000))UPDATE dbo.InterfaceTable SET ERRORMESSAGE = @P1 , ACK_DATE = @P2  WHERE (SSID = @P3 )    
   process id=process6ca8748 taskpriority=0 logused=112 waitresource=PAGE: 7:1:6987355 waittime=4931 ownerId=53242925 
   transactionname=DELETE lasttranstarted=2015-10-20T04:00:01.320 XDES=0x4408a8080 lockMode=IX schedulerid=13 kpid=5820 
   status=suspended spid=123 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2015-10-20T04:00:01.043 lastbatchcompleted=2015-10-20T04:00:01.043 
   clientapp=SQLAgent - TSQL JobStep (Job 0x12C15DD2EBA4CD4DAFEE3482987A95C6 : Step 1) hostname=APP01 hostpid=2972 
   loginname=APP01\sqlsupport isolationlevel=read uncommitted (1) xactid=53242925 currentdb=7 lockTimeout=4294967295 clientoption1=671088928 
   clientoption2=128056
    executionStack
     frame procname=adhoc line=1 sqlhandle=0x02000000bede8d1322a460d49c8399173b5b500cd397607c
Delete from INTERFACE_TABLE
            where ( ERRORMESSAGE is null  or  ERRORMESSAGE like '%AGREEMENT NOT FOUND.;%'  or  ERRORMESSAGE like '%UNI:REALLOCATED%')
            and ACK_DATE is not null
            and ACKID is not null and SSID in(select SSID from INTERFACE_TABLE_2015 with(Nolock) )     
     frame procname=Data_Purging line=91 stmtstart=8514 stmtend=8574 sqlhandle=0x03000700dcdb321339b16c0034a500000100000000000000
exec(@Interface_Execution1)     
     frame procname=adhoc line=1 sqlhandle=0x01000700910a2006e0a057ab040000000000000000000000
exec Data_Purging     
    inputbuf
exec Data_Purging    
  resource-list
   pagelock fileid=1 pageid=3509698 dbid=7 objectname=chola.dbo.INTERFACE_TABLE id=lockc1c6ec00 mode=U associatedObjectId=72057594778353664
    owner-list
     owner id=process6ca8748 mode=U
    waiter-list
     waiter id=process500ce08 mode=IU requestType=wait
   pagelock fileid=1 pageid=6987355 dbid=7 objectname=chola.dbo.INTERFACE_TABLE id=lock25717c080 mode=U associatedObjectId=72057594781958144
    owner-list
     owner id=process500ce08 mode=U
    waiter-list
     waiter id=process6ca8748 mode=IX requestType=wait

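The InterfaceTable in the UPDATE statement is a view; updating it updates the underlying INTERFACE_TABLE table.

The INTERFACE_TABLE table has a primary key on SSID.

Please help me avoid these deadlocks. Thanks in advance.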

Going by this guide to reading trace flag 1222 output:

https://www.mssqltips.com/sqlservertip/2130/finding-sql-server-deadlocks-using-trace-flag-1222/
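
For reference, a trace like the one in the question is written to the SQL Server error log when trace flag 1222 is on. A minimal sketch of enabling it, assuming you have sysadmin rights on the instance:

```sql
-- Turn on trace flag 1222 for all sessions (-1 = global).
-- This lasts until the next restart of the instance; to make it
-- permanent, add -T1222 as a startup parameter instead.
DBCC TRACEON (1222, -1);

-- Check that the flag is active.
DBCC TRACESTATUS (1222);
```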

You can see there are two spids involved in the trace (81 and 123).

81 is the victim. It is running:

 UPDATE dbo.InterfaceTable SET ERRORMESSAGE = @P1 , ACK_DATE = @P2  WHERE (SSID = @P3 )

It is run from hostname=10.9.52.12, loginname=prod.

123 won, and is running:

Delete from INTERFACE_TABLE
where ( ERRORMESSAGE is null  or  ERRORMESSAGE 
like '%AGREEMENT NOT FOUND.;%'  
or  ERRORMESSAGE like '%UNI:REALLOCATED%')
and ACK_DATE is not null
and ACKID is not null and SSID in(
     select SSID from INTERFACE_TABLE_2015 with(Nolock) 
)     

It comes from hostname=APP01 hostpid=2972, loginname=APP01\sqlsupport, in a SQL Agent job.

There are a few ways to solve this:

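  1. Understand what these two things are doing and change the process. It looks to me like the first is a client application updating a single error-log row, and the second is a SQL Agent job that cleans up the log. You can set the deadlock priority of the SQL Agent cleanup job so that it fails rather than the application's logging (which is what you want).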

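  2. Long-running transactions hold their locks longer, which makes deadlocks more likely. Add indexes or optimize the queries (why compare against INTERFACE_TABLE_2015 every time?) so that they don't take so long.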

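  3. Change the database to use SNAPSHOT ISOLATION, which magically fixes a lot of deadlock problems (mostly reader/writer ones; two writers, as in this trace, can still block each other).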
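
The three options above could be sketched in T-SQL roughly as follows. This is only an illustration under assumptions: the database name chola is taken from the objectname in the trace, the batch size of 1000 is an arbitrary placeholder, and the batched delete is my own suggestion for "don't take so long", not something the trace tells you to do. Test on a non-production copy first.

```sql
-- Option 1: make the cleanup job the preferred deadlock victim.
-- Put this in the SQL Agent job step, before the procedure call, so the
-- setting applies to the whole session that runs the purge.
SET DEADLOCK_PRIORITY LOW;
EXEC Data_Purging;

-- Option 2 (assumption: the purge can be split up): delete in small
-- batches so each statement touches fewer pages and finishes quickly.
-- This only shortens lock hold time if each batch commits on its own,
-- i.e. it is not wrapped in one outer transaction.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (1000) FROM INTERFACE_TABLE
    WHERE (ERRORMESSAGE IS NULL
           OR ERRORMESSAGE LIKE '%AGREEMENT NOT FOUND.;%'
           OR ERRORMESSAGE LIKE '%UNI:REALLOCATED%')
      AND ACK_DATE IS NOT NULL
      AND ACKID IS NOT NULL
      AND SSID IN (SELECT SSID FROM INTERFACE_TABLE_2015 WITH (NOLOCK));
    SET @rows = @@ROWCOUNT;
END;

-- Option 3: enable row versioning.
-- Sessions must then opt in with SET TRANSACTION ISOLATION LEVEL SNAPSHOT.
ALTER DATABASE chola SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Alternatively, make plain READ COMMITTED use row versioning for every
-- session. This statement waits until there are no other active
-- connections in the database, so schedule it for a quiet window.
ALTER DATABASE chola SET READ_COMMITTED_SNAPSHOT ON;
```

Of these, option 1 is the least invasive: it does not prevent the deadlock, it only decides who loses, so the purge job should be safe to rerun after it is chosen as the victim.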

Someone with more experience reading these traces can probably work out exactly what is happening here, but this is a start.