Accessing a global variable after calling NdisAcquireSpinLock causes an IRQL_NOT_LESS_OR_EQUAL BSoD

I have an NDIS filter driver (an update of WinPcap) and tested it in a Windows 10 build 10586 x64 VM. With Driver Verifier enabled, launching Wireshark (which exercises my driver's functionality) causes an IRQL_NOT_LESS_OR_EQUAL BSoD.

Here is the dump:

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: fffff80137694a20, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000008, bitfield :
    bit 0 : value 0 = read operation, 1 = write operation
    bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff80137694a20, address which referenced memory

Debugging Details:
------------------

***** Debugger could not find nt in module list, module list might be corrupt, error 0x80070057.


DUMP_CLASS: 1

DUMP_QUALIFIER: 400

BUILD_VERSION_STRING:  10586.103.amd64fre.th2_release.160126-1819

SYSTEM_MANUFACTURER:  VMware, Inc.

VIRTUAL_MACHINE:  VMware

SYSTEM_PRODUCT_NAME:  VMware Virtual Platform

SYSTEM_VERSION:  None

BIOS_VENDOR:  Phoenix Technologies LTD

BIOS_VERSION:  6.00

BIOS_DATE:  07/02/2015

BASEBOARD_MANUFACTURER:  Intel Corporation

BASEBOARD_PRODUCT:  440BX Desktop Reference Platform

BASEBOARD_VERSION:  None

DUMP_TYPE:  2

BUGCHECK_P1: fffff80137694a20

BUGCHECK_P2: 2

BUGCHECK_P3: 8

BUGCHECK_P4: fffff80137694a20

READ_ADDRESS: unable to get nt!MiSessionIdBitmap
Unable to get value of nt!MiSessionWsList
 fffff80137694a20 

CURRENT_IRQL:  0

FAULTING_IP: 
+0
fffff801`37694a20 4883ec08        sub     rsp,8

CPU_COUNT: 2

CPU_MHZ: 961

CPU_VENDOR:  GenuineIntel

CPU_FAMILY: 6

CPU_MODEL: 3c

CPU_STEPPING: 3

CPU_MICROCODE: 0,0,0,0 (F,M,S,R)  SIG: 1E'00000000 (cache) 0'00000000 (init)

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  CORRUPT_MODULELIST_AV

BUGCHECK_STR:  AV

ANALYSIS_SESSION_HOST:  AKISN0W-PC

ANALYSIS_SESSION_TIME:  03-18-2016 09:48:01.0434

ANALYSIS_VERSION: 10.0.10586.567 amd64fre

LAST_CONTROL_TRANSFER:  from fffff801373c7fe9 to fffff801373bd480

FAILED_INSTRUCTION_ADDRESS: 
+0
fffff801`37694a20 4883ec08        sub     rsp,8

SYMBOL_ON_RAW_STACK:  1

STACK_ADDR_RAW_STACK_SYMBOL: ffffd0012ba372e8

STACK_COMMAND:  dps ffffd0012ba372e8-0x20 ; kb

STACK_TEXT:  
ffffd001`2ba372c8  fffff801`376876d6
ffffd001`2ba372d0  fffff801`3792eebe
ffffd001`2ba372d8  fffff801`372ef2c2
ffffd001`2ba372e0  fffff800`71272b02 npf!NPF_GetCopyFromOpenArray+0x22 [j:\npcap\packetwin7\npf\npf\openclos.c @ 1084]
ffffd001`2ba372e8  fffff800`71272ec5 npf!NPF_OpenAdapter+0x2d [j:\npcap\packetwin7\npf\npf\openclos.c @ 258]
ffffd001`2ba372f0  00000000`00000000
ffffd001`2ba372f8  00000000`00000000
ffffd001`2ba37300  00000000`00000000
ffffd001`2ba37308  00000000`00000000
ffffd001`2ba37310  fffff801`37694a20
ffffd001`2ba37318  fffff800`71272ec5 npf!NPF_OpenAdapter+0x2d [j:\npcap\packetwin7\npf\npf\openclos.c @ 258]
ffffd001`2ba37320  fffff801`3792eebe
ffffd001`2ba37328  fffff801`372ef2c2
ffffd001`2ba37330  fffff801`37690d68
ffffd001`2ba37338  fffff801`376876d6
ffffd001`2ba37340  fffff801`376860dc


FOLLOWUP_IP: 
npf!NPF_GetCopyFromOpenArray+22 [j:\npcap\packetwin7\npf\npf\openclos.c @ 1084]
fffff800`71272b02 488b1d177e0000  mov     rbx,qword ptr [npf!g_arrOpen (fffff800`7127a920)]

FAULT_INSTR_CODE:  171d8b48

FAULTING_SOURCE_LINE:  j:\npcap\packetwin7\npf\npf\openclos.c

FAULTING_SOURCE_FILE:  j:\npcap\packetwin7\npf\npf\openclos.c

FAULTING_SOURCE_LINE_NUMBER:  1084

FAULTING_SOURCE_CODE:  
  1080:     POPEN_INSTANCE CurOpen;
  1081:     TRACE_ENTER();
  1082: 
  1083:     NdisAcquireSpinLock(&g_OpenArrayLock);
> 1084:     for (CurOpen = g_arrOpen; CurOpen != NULL; CurOpen = CurOpen->Next)
  1085:     {
  1086:         if (CurOpen->AdapterBindingStatus == ADAPTER_BOUND && NPF_EqualAdapterName(&CurOpen->AdapterName, pAdapterName) == TRUE)
  1087:         {
  1088:             NdisReleaseSpinLock(&g_OpenArrayLock);
  1089:             return NPF_DuplicateOpenObject(CurOpen, DeviceExtension);


SYMBOL_NAME:  npf!NPF_GetCopyFromOpenArray+22

FOLLOWUP_NAME:  MachineOwner

DEBUG_FLR_IMAGE_TIMESTAMP:  0

IMAGE_VERSION:  0.6.0.301

MODULE_NAME: Unknown_Module

IMAGE_NAME:  Unknown_Image

BUCKET_ID:  CORRUPT_MODULELIST_AV

PRIMARY_PROBLEM_CLASS:  CORRUPT_MODULELIST

FAILURE_BUCKET_ID:  CORRUPT_MODULELIST_AV

TARGET_TIME:  2016-03-18T01:43:34.000Z

OSBUILD:  10586

OSSERVICEPACK:  0

SERVICEPACK_NUMBER: 0

OS_REVISION: 0

SUITE_MASK:  272

PRODUCT_TYPE:  1

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

OSEDITION:  Windows 10 WinNt TerminalServer SingleUserTS

OS_LOCALE:  

USER_LCID:  0

OSBUILD_TIMESTAMP:  unknown_date

BUILDDATESTAMP_STR:  160126-1819

BUILDLAB_STR:  th2_release

BUILDOSVER_STR:  10.0.10586.103.amd64fre.th2_release.160126-1819

ANALYSIS_SESSION_ELAPSED_TIME: 18d9

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:corrupt_modulelist_av

FAILURE_ID_HASH:  {fc259191-ef0c-6215-476f-d32e5dcaf1b7}

Followup:     MachineOwner
---------

The faulting source code is here: https://github.com/nmap/npcap/blob/master/packetWin7/npf/npf/Openclos.c

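I know that calling NdisAcquireSpinLock raises the IRQL to DISPATCH_LEVEL. WinDbg seems to be saying that g_arrOpen is in pageable memory, which must not be accessed at DISPATCH_LEVEL. However, g_arrOpen is a global pointer to an OPEN_INSTANCE structure, and OPEN_INSTANCE instances are allocated from non-paged pool. The global variable itself lives in the driver image, so it cannot be paged out either.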

So I don't know what is wrong here. Any help? Thanks!

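The global variable isn't the problem. First, note that Arg3 has the execute bit set, i.e., the memory that was paged out is code, not data. You can confirm this by noting that READ_ADDRESS and FAULTING_IP are the same address (fffff80137694a20, which is also both Arg1 and Arg4).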
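To make the reading of the bugcheck arguments concrete, here is a minimal sketch in plain user-mode C that decodes Arg3 using the bitfield legend printed by !analyze -v above. The names access_kind, decode_arg3, and is_instruction_fetch_fault are hypothetical helpers for illustration only; they are not part of WinDbg or the WDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: decoded view of bugcheck 0xA parameter 3,
 * following the legend in the dump (bit 0 = write, bit 3 = execute). */
typedef struct {
    bool is_write;   /* bit 0: 0 = read operation, 1 = write operation */
    bool is_execute; /* bit 3: 1 = execute operation */
} access_kind;

static access_kind decode_arg3(uint64_t arg3)
{
    access_kind k;
    k.is_write = (arg3 & 0x1u) != 0;
    k.is_execute = (arg3 & 0x8u) != 0;
    return k;
}

/* The faulting access was an instruction fetch when the execute bit is
 * set and the referenced address (Arg1) is the same as the address that
 * made the reference (Arg4). */
static bool is_instruction_fetch_fault(uint64_t arg1, uint64_t arg3,
                                       uint64_t arg4)
{
    return decode_arg3(arg3).is_execute && arg1 == arg4;
}
```

With the values from this dump (Arg1 = Arg4 = 0xfffff80137694a20, Arg3 = 8), is_instruction_fetch_fault returns true, which matches the reading that code, not data, was unavailable.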

So, let's take a closer look at that code:

> 1084:     for (CurOpen = g_arrOpen; CurOpen != NULL; CurOpen = CurOpen->Next)
  1085:     {
  1086:         if (CurOpen->AdapterBindingStatus == ADAPTER_BOUND && NPF_EqualAdapterName(&CurOpen->AdapterName, pAdapterName) == TRUE)

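This is a release build, so you can't take the indicated line too seriously; however, the problem is probably somewhere nearby. A page fault on executable code points to a bad function call, so let's start by looking at NPF_EqualAdapterName:
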
BOOLEAN
NPF_EqualAdapterName(
    PNDIS_STRING s1,
    PNDIS_STRING s2
    )
{
    // return RtlEqualMemory(s1->Buffer, s2->Buffer, s2->Length);
    // We use RtlEqualUnicodeString because it's case-insensitive. However, verifier will complain about this call because it's under DISPATCH_LEVEL.
    // Just don't enable the IRQL switch when testing with verifier.
    return RtlEqualUnicodeString(s1, s2, TRUE);
}

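This is a very short function, almost certainly inlined, so it wouldn't necessarily show up in the stack trace. That leads us to the call to RtlEqualUnicodeString, which, when we check the documentation, turns out to require PASSIVE_LEVEL. Bingo. (Heck, other than for verification, we didn't even need to look at the documentation, because the comment says outright that the call is illegal.)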

Conclusion: RtlEqualUnicodeString happened to be paged out at the moment you called it.

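(Presumably the best solution is to go back to using RtlEqualMemory and make sure that the strings you compare have already been normalized to the same case.)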
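As a rough illustration of that idea, here is a user-mode C sketch, not driver code. counted_wstr is a hypothetical stand-in with the same shape as a counted UTF-16 string, and normalize_case and equal_normalized are hypothetical helpers. The assumption is that each adapter name is upper-cased once when it is stored, while still at low IRQL, so the comparison made while holding the spin lock is a plain byte compare of non-paged buffers. The sketch only folds ASCII letters, which is an assumption about adapter names; in the driver, the normalization would presumably use a routine such as RtlUpcaseUnicodeString at PASSIVE_LEVEL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a counted UTF-16 string. */
typedef struct {
    unsigned short Length;  /* length in bytes, not characters */
    unsigned short *Buffer; /* UTF-16 code units, not NUL-terminated */
} counted_wstr;

/* Upper-case ASCII letters in place. Intended to run once, when the
 * name is stored, before any lock is taken. */
static void normalize_case(counted_wstr *s)
{
    size_t n = s->Length / sizeof(unsigned short);
    for (size_t i = 0; i < n; i++) {
        if (s->Buffer[i] >= 'a' && s->Buffer[i] <= 'z')
            s->Buffer[i] = (unsigned short)(s->Buffer[i] - ('a' - 'A'));
    }
}

/* Byte-wise comparison of two already-normalized names. It calls no
 * case-folding routine, so it is the part that would be safe to run
 * under the lock. */
static bool equal_normalized(const counted_wstr *a, const counted_wstr *b)
{
    if (a->Length != b->Length)
        return false; /* different lengths can never be equal */
    if (a->Length == 0)
        return true;  /* two empty names; avoid touching Buffer */
    return memcmp(a->Buffer, b->Buffer, a->Length) == 0;
}
```

Note that the commented-out RtlEqualMemory line in NPF_EqualAdapterName compares only s2->Length bytes without first checking that the two lengths match; the sketch checks the lengths before comparing.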