Access to global variable after calling NdisAcquireSpinLock causes IRQL_NOT_LESS_OR_EQUAL BSoD
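I have an NDIS filter driver (an update of WinPcap) and tested it on a Windows 10 10586 x64 VM. With Driver Verifier enabled, it causes an IRQL_NOT_LESS_OR_EQUAL BSoD when I launch Wireshark (that is, as soon as the driver's functions are used).
Here is the dump: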
1: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: fffff80137694a20, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000008, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff80137694a20, address which referenced memory
Debugging Details:
------------------
***** Debugger could not find nt in module list, module list might be corrupt, error 0x80070057.
DUMP_CLASS: 1
DUMP_QUALIFIER: 400
BUILD_VERSION_STRING: 10586.103.amd64fre.th2_release.160126-1819
SYSTEM_MANUFACTURER: VMware, Inc.
VIRTUAL_MACHINE: VMware
SYSTEM_PRODUCT_NAME: VMware Virtual Platform
SYSTEM_VERSION: None
BIOS_VENDOR: Phoenix Technologies LTD
BIOS_VERSION: 6.00
BIOS_DATE: 07/02/2015
BASEBOARD_MANUFACTURER: Intel Corporation
BASEBOARD_PRODUCT: 440BX Desktop Reference Platform
BASEBOARD_VERSION: None
DUMP_TYPE: 2
BUGCHECK_P1: fffff80137694a20
BUGCHECK_P2: 2
BUGCHECK_P3: 8
BUGCHECK_P4: fffff80137694a20
READ_ADDRESS: unable to get nt!MiSessionIdBitmap
Unable to get value of nt!MiSessionWsList
fffff80137694a20
CURRENT_IRQL: 0
FAULTING_IP:
+0
fffff801`37694a20 4883ec08 sub rsp,8
CPU_COUNT: 2
CPU_MHZ: 961
CPU_VENDOR: GenuineIntel
CPU_FAMILY: 6
CPU_MODEL: 3c
CPU_STEPPING: 3
CPU_MICROCODE: 0,0,0,0 (F,M,S,R) SIG: 1E'00000000 (cache) 0'00000000 (init)
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: CORRUPT_MODULELIST_AV
BUGCHECK_STR: AV
ANALYSIS_SESSION_HOST: AKISN0W-PC
ANALYSIS_SESSION_TIME: 03-18-2016 09:48:01.0434
ANALYSIS_VERSION: 10.0.10586.567 amd64fre
LAST_CONTROL_TRANSFER: from fffff801373c7fe9 to fffff801373bd480
FAILED_INSTRUCTION_ADDRESS:
+0
fffff801`37694a20 4883ec08 sub rsp,8
SYMBOL_ON_RAW_STACK: 1
STACK_ADDR_RAW_STACK_SYMBOL: ffffd0012ba372e8
STACK_COMMAND: dps ffffd0012ba372e8-0x20 ; kb
STACK_TEXT:
ffffd001`2ba372c8 fffff801`376876d6
ffffd001`2ba372d0 fffff801`3792eebe
ffffd001`2ba372d8 fffff801`372ef2c2
ffffd001`2ba372e0 fffff800`71272b02 npf!NPF_GetCopyFromOpenArray+0x22 [j:\npcap\packetwin7\npf\npf\openclos.c @ 1084]
ffffd001`2ba372e8 fffff800`71272ec5 npf!NPF_OpenAdapter+0x2d [j:\npcap\packetwin7\npf\npf\openclos.c @ 258]
ffffd001`2ba372f0 00000000`00000000
ffffd001`2ba372f8 00000000`00000000
ffffd001`2ba37300 00000000`00000000
ffffd001`2ba37308 00000000`00000000
ffffd001`2ba37310 fffff801`37694a20
ffffd001`2ba37318 fffff800`71272ec5 npf!NPF_OpenAdapter+0x2d [j:\npcap\packetwin7\npf\npf\openclos.c @ 258]
ffffd001`2ba37320 fffff801`3792eebe
ffffd001`2ba37328 fffff801`372ef2c2
ffffd001`2ba37330 fffff801`37690d68
ffffd001`2ba37338 fffff801`376876d6
ffffd001`2ba37340 fffff801`376860dc
FOLLOWUP_IP:
npf!NPF_GetCopyFromOpenArray+22 [j:\npcap\packetwin7\npf\npf\openclos.c @ 1084]
fffff800`71272b02 488b1d177e0000 mov rbx,qword ptr [npf!g_arrOpen (fffff800`7127a920)]
FAULT_INSTR_CODE: 171d8b48
FAULTING_SOURCE_LINE: j:\npcap\packetwin7\npf\npf\openclos.c
FAULTING_SOURCE_FILE: j:\npcap\packetwin7\npf\npf\openclos.c
FAULTING_SOURCE_LINE_NUMBER: 1084
FAULTING_SOURCE_CODE:
1080: POPEN_INSTANCE CurOpen;
1081: TRACE_ENTER();
1082:
1083: NdisAcquireSpinLock(&g_OpenArrayLock);
> 1084: for (CurOpen = g_arrOpen; CurOpen != NULL; CurOpen = CurOpen->Next)
1085: {
1086: if (CurOpen->AdapterBindingStatus == ADAPTER_BOUND && NPF_EqualAdapterName(&CurOpen->AdapterName, pAdapterName) == TRUE)
1087: {
1088: NdisReleaseSpinLock(&g_OpenArrayLock);
1089: return NPF_DuplicateOpenObject(CurOpen, DeviceExtension);
SYMBOL_NAME: npf!NPF_GetCopyFromOpenArray+22
FOLLOWUP_NAME: MachineOwner
DEBUG_FLR_IMAGE_TIMESTAMP: 0
IMAGE_VERSION: 0.6.0.301
MODULE_NAME: Unknown_Module
IMAGE_NAME: Unknown_Image
BUCKET_ID: CORRUPT_MODULELIST_AV
PRIMARY_PROBLEM_CLASS: CORRUPT_MODULELIST
FAILURE_BUCKET_ID: CORRUPT_MODULELIST_AV
TARGET_TIME: 2016-03-18T01:43:34.000Z
OSBUILD: 10586
OSSERVICEPACK: 0
SERVICEPACK_NUMBER: 0
OS_REVISION: 0
SUITE_MASK: 272
PRODUCT_TYPE: 1
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
OSEDITION: Windows 10 WinNt TerminalServer SingleUserTS
OS_LOCALE:
USER_LCID: 0
OSBUILD_TIMESTAMP: unknown_date
BUILDDATESTAMP_STR: 160126-1819
BUILDLAB_STR: th2_release
BUILDOSVER_STR: 10.0.10586.103.amd64fre.th2_release.160126-1819
ANALYSIS_SESSION_ELAPSED_TIME: 18d9
ANALYSIS_SOURCE: KM
FAILURE_ID_HASH_STRING: km:corrupt_modulelist_av
FAILURE_ID_HASH: {fc259191-ef0c-6215-476f-d32e5dcaf1b7}
Followup: MachineOwner
---------
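The source file that faults is here: https://github.com/nmap/npcap/blob/master/packetWin7/npf/npf/Openclos.c
I know that the NdisAcquireSpinLock call raises the IRQL to DISPATCH_LEVEL. WinDbg seems to be saying that g_arrOpen is in pageable memory, which must not be accessed at DISPATCH_LEVEL. However, g_arrOpen is a global variable that points to an OPEN_INSTANCE structure, and the OPEN_INSTANCE instances are allocated from nonpaged pool. The global variable itself lives in the driver image, so it can't be paged out either.
So I don't know what is wrong here. Any help? Thanks!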
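The global variable isn't the problem. First of all, note that Arg3 has the execute bit set, i.e. the memory that was paged out was code, not data. You can confirm this by noting that READ_ADDRESS and FAULTING_IP are the same.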
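To make that check concrete, here is a minimal sketch in portable C (not driver code) that decodes Arg3 the way the !analyze text describes it. The helper name is_execute_fault and the two macros are hypothetical, introduced only for illustration; the bit positions come straight from the dump.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of Arg3, as described in the !analyze output. */
#define ACCESS_WRITE   (1u << 0)  /* bit 0: 1 = write, 0 = read */
#define ACCESS_EXECUTE (1u << 3)  /* bit 3: 1 = execute operation */

/* Hypothetical helper: true when the bugcheck arguments describe a
 * failed instruction fetch rather than a failed data access. */
static bool is_execute_fault(uint64_t arg1, uint64_t arg3, uint64_t arg4)
{
    /* The execute bit is set, and the memory referenced (Arg1) is the
     * very address that was being executed (Arg4). */
    return (arg3 & ACCESS_EXECUTE) != 0 && arg1 == arg4;
}
```

With the values from this dump (Arg1 = Arg4 = fffff80137694a20, Arg3 = 8) the helper returns true, which is why the fault points at code being called and not at g_arrOpen.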
So, let's look at that code more closely:
> 1084: for (CurOpen = g_arrOpen; CurOpen != NULL; CurOpen = CurOpen->Next)
1085: {
1086: if (CurOpen->AdapterBindingStatus == ADAPTER_BOUND && NPF_EqualAdapterName(&CurOpen->AdapterName, pAdapterName) == TRUE)
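This is a release build, so you can't take the indicated line too seriously; the problem is likely to be nearby, though. A page fault on executable memory suggests a function call gone wrong, so let's start by looking at NPF_EqualAdapterName: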
BOOLEAN
NPF_EqualAdapterName(
PNDIS_STRING s1,
PNDIS_STRING s2
)
{
// return RtlEqualMemory(s1->Buffer, s2->Buffer, s2->Length);
// We use RtlEqualUnicodeString because it's case-insensitive. However, verifier will complain about this call because it's under DISPATCH_LEVEL.
// Just don't enable the IRQL switch when testing with verifier.
return RtlEqualUnicodeString(s1, s2, TRUE);
}
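A very short function, almost certainly inlined, so it wouldn't necessarily show up in the stack trace. That leads us to the call to RtlEqualUnicodeString, which, when we check the documentation, turns out to require PASSIVE_LEVEL. Bingo. (Heck, except for verification, we didn't even need to look at the documentation, because the comment says outright that the call is illegal.)
Conclusion: RtlEqualUnicodeString just happened to be paged out at the moment you called it.
(Presumably the best solution is to go back to using RtlEqualMemory and make sure that the strings you compare are already in a consistent case ahead of time.)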
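A rough sketch of what that comparison could look like, written as portable C so it compiles outside the WDK. NAME_STRING is a stand-in I made up to mirror the Length/Buffer layout of UNICODE_STRING, memcmp stands in for RtlEqualMemory, and names_equal_prefolded is a hypothetical name. It assumes both names were folded to one case earlier, at PASSIVE_LEVEL (in a driver, RtlUpcaseUnicodeString could do that when the name is first stored). Note that the commented-out RtlEqualMemory line compares s2->Length bytes without looking at s1->Length, so the sketch adds that check.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for UNICODE_STRING (assumption, for illustration only). */
typedef struct {
    uint16_t  Length;  /* length in bytes, not characters */
    uint16_t *Buffer;  /* UTF-16 code units, not NUL-terminated */
} NAME_STRING;

/* Byte-wise comparison of two names that are ALREADY in the same case.
 * It touches only the two buffers and calls nothing pageable, which is
 * the property needed while a spin lock is held. */
static bool names_equal_prefolded(const NAME_STRING *s1, const NAME_STRING *s2)
{
    /* Different lengths can never match; this also keeps the
     * comparison from reading past the shorter buffer. */
    if (s1->Length != s2->Length)
        return false;

    return memcmp(s1->Buffer, s2->Buffer, s1->Length) == 0;
}
```

The trade-off is that the case folding moves out of the lookup loop and into wherever the adapter name is first recorded. The buffers themselves still have to live in nonpaged memory for this to be safe at DISPATCH_LEVEL.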