DPC_WATCHDOG_VIOLATION BSOD while waiting on a spin lock
I have an NDIS filter driver running on Windows 10 (BUILD_VERSION_STRING: 18362.1.amd64fre.19h1_release.190318-1202).
With one of the test scripts on one particular test PC, it fails with a DPC_WATCHDOG_VIOLATION BSOD.
The call stack looks like this:
nt!KeBugCheckEx
nt!KeAccumulateTicks+0x1815bd
nt!KeClockInterruptNotify+0xc07
hal!HalpTimerClockIpiRoutine+0x21
nt!KiCallInterruptServiceRoutine+0xa5
nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
nt!KiInterruptDispatchNoLockNoEtw+0x37
nt!KxWaitForSpinLockAndAcquire+0x33
nt!KeAcquireSpinLockAtDpcLevel+0x5b
MYDRV!FilterReceiveNetBufferLists+0x1c8 [c:\MYDRV_SRC\my_filter.cpp @ 1768]
ndis!ndisCallReceiveHandler+0x60
ndis!ndisInvokeNextReceiveHandler+0x206cf
ndis!NdisMIndicateReceiveNetBufferLists+0x104
e1i65x64!RECEIVE::RxIndicateNBLs+0x12f
e1i65x64!RECEIVE::RxProcessInterrupts+0x20a
e1i65x64!INTERRUPT::MsgIntDpcTxRxProcessing+0x124
e1i65x64!INTERRUPT::MsgIntMessageInterruptDPC+0x1ff
e1i65x64!INTERRUPT::MiniportMessageInterruptDPC+0x28
ndis!ndisInterruptDpc+0x19c
nt!KiExecuteAllDpcs+0x30a
nt!KiRetireDpcList+0x1ef
nt!KxRetireDpcList+0x5
nt!KiDispatchInterruptContinue
nt!KiDpcInterrupt+0x2ee
nt!RtlpHpSegPageRangeShrink+0x2d5
nt!ExFreeHeapPool+0x751
nt!ExFreePool+0x9
MYDRV!StartReqCancel+0x16f [c:\MYDRV_SRC\my_device.cpp @ 1420]
nt!IoCancelIrp+0x71
nt!IopCancelIrpsInCurrentThreadList+0x104
MYDRV!StartReqCancel is the cancel routine for a user IRP.
Execution is stuck in FilterReceiveNetBufferLists at this line:
NdisDprAcquireSpinLock(&pFilter->startIrpLock);
In the MYDRV!StartReqCancel cancel routine, this spin lock is taken in order to free the associated resources, like this:
// Clear Cancel routine
IoSetCancelRoutine(pIrp, NULL);
// Release the cancel spinlock
IoReleaseCancelSpinLock(pIrp->CancelIrql);
NdisDprAcquireSpinLock(&pFilter->startIrpLock);
// Clear capture data
//...
for (int i=0; i<pFilter->dataN; ++i )
ExFreePool(pFilter->pCaptData[i]); //non-paged data!
//...
NdisDprReleaseSpinLock(&pFilter->startIrpLock);
// Complete the request
pIrp->IoStatus.Status = STATUS_CANCELLED;
pIrp->IoStatus.Information = 0;
IoCompleteRequest(pIrp, IO_NO_INCREMENT);//CAPTURE_START_IRP
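It looks like, during the ExFreePool call, the hardware driver received an interrupt for an incoming packet and ended up calling my filter driver's FilterReceiveNetBufferLists, which tries to acquire the pFilter->startIrpLock spin lock (at DISPATCH_LEVEL) that is held by MYDRV!StartReqCancel.
But StartReqCancel never seems to return from ExFreePool. The value of pFilter->dataN is not large (<64).
Do you have any idea why this happens?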
I just found the cause of this strange crash, thanks to Driver Verifier:
//....
// Release the cancel spinlock
// At this point IRQL == DISPATCH_LEVEL
IoReleaseCancelSpinLock(pIrp->CancelIrql);
// Now we have IRQL < DISPATCH_LEVEL !!!
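NdisDprAcquireSpinLock(&pFilter->startIrpLock); // !!! Invalid here: the Dpr variant requires IRQL == DISPATCH_LEVEL and does not raise it !!!
// Code here is NOT properly protected: a DPC can preempt it on the same CPU.
// ....
// NdisDprReleaseSpinLock(&pFilter->startIrpLock); // Does not restore any IRQL either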
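To make the failure mode concrete, here is a small user-mode toy model in C. It is not driver code: toy_lock, g_irql and every toy_* function are made-up names for illustration only. It assumes a single CPU and models just two facts: the Dpr acquire variant leaves IRQL unchanged, while the plain acquire variant raises IRQL to DISPATCH_LEVEL and restores it on release. A DPC (such as the receive path in the stack above) can preempt code only below DISPATCH_LEVEL, so a lock held below that level can be spun on forever by a DPC on the same CPU.

```c
#include <stdbool.h>

/* Toy IRQL values (same numeric values as on x86/x64 Windows). */
enum { PASSIVE_LEVEL = 0, DISPATCH_LEVEL = 2 };

/* Hypothetical stand-in for a spin lock; not the real NDIS_SPIN_LOCK. */
typedef struct {
    bool held;      /* lock bit */
    int  old_irql;  /* IRQL saved by the raising acquire */
} toy_lock;

/* IRQL of the single simulated CPU. */
int g_irql = PASSIVE_LEVEL;

/* Models the Dpr variant: takes the lock, never touches IRQL. */
void toy_dpr_acquire(toy_lock *l) { l->held = true; }
void toy_dpr_release(toy_lock *l) { l->held = false; }

/* Models the plain variant: raise to DISPATCH_LEVEL, then take the lock. */
void toy_acquire(toy_lock *l)
{
    l->old_irql = g_irql;
    g_irql = DISPATCH_LEVEL;
    l->held = true;
}

/* Drop the lock, then restore the IRQL saved by toy_acquire. */
void toy_release(toy_lock *l)
{
    l->held = false;
    g_irql = l->old_irql;
}

/* A DPC can interrupt the current code only below DISPATCH_LEVEL. */
bool dpc_can_preempt(void) { return g_irql < DISPATCH_LEVEL; }

/* True if a DPC on this CPU could preempt the holder and spin forever. */
bool would_deadlock(const toy_lock *l) { return l->held && dpc_can_preempt(); }
```

In this model, calling toy_dpr_acquire at PASSIVE_LEVEL makes would_deadlock return true, while toy_acquire does not. In the real driver, the corresponding change would presumably be to use NdisAcquireSpinLock / NdisReleaseSpinLock in StartReqCancel once the cancel spin lock has been released, since that pair raises and restores IRQL itself.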