DPC_WATCHDOG_VIOLATION BSOD while waiting on a spinlock

I have an NDIS filter driver running on Windows 10 (BUILD_VERSION_STRING: 18362.1.amd64fre.19h1_release.190318-1202). With one of the test scripts on one particular test PC, it fails with a DPC_WATCHDOG_VIOLATION BSOD. The call stack looks like this:

nt!KeBugCheckEx
nt!KeAccumulateTicks+0x1815bd
nt!KeClockInterruptNotify+0xc07
hal!HalpTimerClockIpiRoutine+0x21
nt!KiCallInterruptServiceRoutine+0xa5
nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
nt!KiInterruptDispatchNoLockNoEtw+0x37
nt!KxWaitForSpinLockAndAcquire+0x33
nt!KeAcquireSpinLockAtDpcLevel+0x5b
MYDRV!FilterReceiveNetBufferLists+0x1c8 [c:\MYDRV_SRC\my_filter.cpp @ 1768]
ndis!ndisCallReceiveHandler+0x60
ndis!ndisInvokeNextReceiveHandler+0x206cf
ndis!NdisMIndicateReceiveNetBufferLists+0x104
e1i65x64!RECEIVE::RxIndicateNBLs+0x12f
e1i65x64!RECEIVE::RxProcessInterrupts+0x20a
e1i65x64!INTERRUPT::MsgIntDpcTxRxProcessing+0x124
e1i65x64!INTERRUPT::MsgIntMessageInterruptDPC+0x1ff
e1i65x64!INTERRUPT::MiniportMessageInterruptDPC+0x28
ndis!ndisInterruptDpc+0x19c
nt!KiExecuteAllDpcs+0x30a
nt!KiRetireDpcList+0x1ef
nt!KxRetireDpcList+0x5
nt!KiDispatchInterruptContinue
nt!KiDpcInterrupt+0x2ee
nt!RtlpHpSegPageRangeShrink+0x2d5
nt!ExFreeHeapPool+0x751
nt!ExFreePool+0x9
MYDRV!StartReqCancel+0x16f [c:\MYDRV_SRC\my_device.cpp @ 1420] 
nt!IoCancelIrp+0x71
nt!IopCancelIrpsInCurrentThreadList+0x104

MYDRV!StartReqCancel is the cancel routine for a user IRP. Execution is stuck at this line in FilterReceiveNetBufferLists:

NdisDprAcquireSpinLock(&pFilter->startIrpLock);

In the MYDRV!StartReqCancel cancel routine, the same spinlock is taken in order to free the associated resources, like this:

    // Clear Cancel routine
    IoSetCancelRoutine(pIrp, NULL);
    // Release the cancel spinlock
    IoReleaseCancelSpinLock(pIrp->CancelIrql);

    NdisDprAcquireSpinLock(&pFilter->startIrpLock);

    // Clear capture data
    //...
    for (int i=0; i<pFilter->dataN; ++i )
       ExFreePool(pFilter->pCaptData[i]); //non-paged data!
    //...

    NdisDprReleaseSpinLock(&pFilter->startIrpLock);
    
    // Complete the request
    pIrp->IoStatus.Status = STATUS_CANCELLED;
    pIrp->IoStatus.Information = 0;
    IoCompleteRequest(pIrp, IO_NO_INCREMENT);//CAPTURE_START_IRP

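It looks like, during the ExFreePool call, the HW driver got an interrupt for an incoming packet and eventually called my filter driver's FilterReceiveNetBufferLists, which tries to acquire the pFilter->startIrpLock spinlock (at DISPATCH_LEVEL) while it is still held by MYDRV!StartReqCancel.
But StartReqCancel never seems to return from ExFreePool. The value of pFilter->dataN is not large (<64).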

Do you have any idea why this happens?

I just found the cause of this strange crash, thanks to Driver Verifier:

//....

// Release the cancel spinlock
// At this point IRQL == DISPATCH_LEVEL
IoReleaseCancelSpinLock(pIrp->CancelIrql);
// Now we have  IRQL < DISPATCH_LEVEL !!!

NdisDprAcquireSpinLock(&pFilter->startIrpLock); // !!! No real spinlock acquisition here!!!

// NON LOCKED code here.
// ....
// NdisDprReleaseSpinLock(&pFilter->startIrpLock); // Nothing unlocked here
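
To make the failure mode easier to see, here is a minimal user-mode sketch of the IRQL bookkeeping. It is not driver code: `Cpu`, `SpinLock`, `DprAcquire`, `Acquire`, `DpcWouldSpinForever` and the two `...CancelPath` functions are hypothetical names made up for this model. It assumes the Dpr variant marks the lock as held without touching IRQL (which is consistent with the receive path spinning in `KxWaitForSpinLockAndAcquire` in the stack above) and that `pIrp->CancelIrql` was below DISPATCH_LEVEL.

```cpp
#include <cassert>

// User-mode model only. These types and helpers are hypothetical stand-ins,
// not the real WDK/NDIS definitions.
enum Irql { PASSIVE_LEVEL = 0, DISPATCH_LEVEL = 2 };

struct Cpu { Irql irql = PASSIVE_LEVEL; };

struct SpinLock {
    bool held = false;
    Irql savedIrql = PASSIVE_LEVEL;
};

// Models NdisDprAcquireSpinLock/NdisDprReleaseSpinLock: IRQL is left
// untouched because the caller is expected to be at DISPATCH_LEVEL already.
void DprAcquire(Cpu&, SpinLock& lock) { lock.held = true; }
void DprRelease(Cpu&, SpinLock& lock) { assert(lock.held); lock.held = false; }

// Models NdisAcquireSpinLock/NdisReleaseSpinLock: raise to DISPATCH_LEVEL,
// remember the previous IRQL, and restore it on release.
void Acquire(Cpu& cpu, SpinLock& lock) {
    lock.savedIrql = cpu.irql;
    cpu.irql = DISPATCH_LEVEL;
    lock.held = true;
}
void Release(Cpu& cpu, SpinLock& lock) {
    assert(lock.held);
    lock.held = false;
    cpu.irql = lock.savedIrql;
}

// A DPC can run on this CPU only while IRQL < DISPATCH_LEVEL. If it does and
// then spins on a lock held by the code it interrupted, it never gets it.
bool DpcWouldSpinForever(const Cpu& cpu, const SpinLock& lock) {
    return cpu.irql < DISPATCH_LEVEL && lock.held;
}

// The cancel routine as posted: Dpr acquire after the IRQL was lowered.
bool BuggyCancelPath() {
    Cpu cpu;
    cpu.irql = DISPATCH_LEVEL;      // entered with the cancel spinlock held
    SpinLock startIrpLock;
    cpu.irql = PASSIVE_LEVEL;       // IoReleaseCancelSpinLock(pIrp->CancelIrql)
    DprAcquire(cpu, startIrpLock);  // lock held, IRQL still low
    bool hang = DpcWouldSpinForever(cpu, startIrpLock);  // receive DPC arrives
    DprRelease(cpu, startIrpLock);
    return hang;
}

// Same path using the IRQL-raising acquire instead.
bool FixedCancelPath() {
    Cpu cpu;
    cpu.irql = DISPATCH_LEVEL;      // entered with the cancel spinlock held
    SpinLock startIrpLock;
    cpu.irql = PASSIVE_LEVEL;       // IoReleaseCancelSpinLock(pIrp->CancelIrql)
    Acquire(cpu, startIrpLock);     // raises to DISPATCH_LEVEL first
    bool hang = DpcWouldSpinForever(cpu, startIrpLock);  // DPC is held off
    Release(cpu, startIrpLock);
    return hang;
}
```

In this model `BuggyCancelPath()` reports the hang and `FixedCancelPath()` does not. In the real driver the corresponding change would presumably be to use `NdisAcquireSpinLock`/`NdisReleaseSpinLock` in StartReqCancel once the cancel spinlock has been released, and to keep the `NdisDpr...` variants only in paths that are known to run at DISPATCH_LEVEL.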