在取到这个DUMP之后,就是先看下系统给出的分析结果
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time.
Arguments:
Arg1: 00000004, The power transition timed out waiting to synchronize with the Pnp
subsystem.
Arg2: 00000258, Timeout in seconds.
Arg3: 86e1aa70, The thread currently holding on to the Pnp lock.
Arg4: 84579b24, nt!TRIAGE_9F_PNP on Win7 and higher
Debugging Details:
------------------
Implicit thread is now 86e1aa70
DRVPOWERSTATE_SUBCODE: 4
IMAGE_NAME: vwifibus.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 4a5bc8a2
MODULE_NAME: vwifibus
FAULTING_MODULE: 9ad2e000 vwifibus
DEFAULT_BUCKET_ID: WIN7_DRIVER_FAULT
BUGCHECK_STR: 0x9F
PROCESS_NAME: System
CURRENT_IRQL: 2
ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) x86fre
STACK_TEXT:
84579b04 8452ddd7 0000009f 00000004 00000258 nt!KeBugCheckEx+0x1e
84579b30 8477cfb5 84579b7c 844cb4d9 9107aab0 nt!PnpBugcheckPowerTimeout+0x57
84579b38 844cb4d9 9107aab0 9107aad0 4b4c4c88 nt!PopBuildDeviceNotifyListWatchdog+0x16
84579b7c 844cb47d 8457cd20 84579ca8 00000005 nt!KiProcessTimerDpcTable+0x50
84579c68 844cb33a 8457cd20 84579ca8 00000000 nt!KiProcessExpiredTimerList+0x101
84579cdc 844c94ce 0000c8b8 87013d48 84586380 nt!KiTimerExpiration+0x25c
84579d20 844c92f8 00000000 0000000e 00000000 nt!KiRetireDpcList+0xcb
84579d24 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x38
STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
FAILURE_BUCKET_ID: 0x9F_4_vwifimp_IMAGE_vwifibus.sys
BUCKET_ID: 0x9F_4_vwifimp_IMAGE_vwifibus.sys
ANALYSIS_SOURCE: KM
FAILURE_ID_HASH_STRING: km:0x9f_4_vwifimp_image_vwifibus.sys
FAILURE_ID_HASH: {2235b6b3-c7f5-ad64-545b-e209e3ae3930}
Followup: MachineOwner
---------
注意上面标出来的几个字段第一个为4表示该系统为Windows Vista之后的操作系统,第二个0x258是超时的时间,单位为秒,第三个就是当前拥有PNP锁的线程。之后就是WINDBG给出的分析结果是 vwifibus.sys这个驱动。由于这个DUMP是系统自主产生的,说明系统对这方面的错误有觉误。换句话来说就是系统产生的这份DUMP里面的信息还是很值得信赖。
根据系统给的线索,我们来找下更多的信息。既然是锁的问题,那我们看下是不是存在死锁。 0: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks..
Resource @ nt!IopDeviceTreeLock (0x845b8c60) Shared 1 owning threads
Threads: 86e1aa70-01<*>
KD: Scanning for held locks.
Resource @ nt!PiEngineLock (0x845b8be0) Exclusively owned
Contention Count = 12
NumberOfExclusiveWaiters = 1
Threads: 86e1aa70-01<*>
Threads Waiting On Exclusive Access:
89024a58
KD: Scanning for held locks................................................................................................................................................................................................................................................................................................................................................................................................................
12834 total locks, 2 locks currently held
从上面的信息来看,当前系统确实存在有死锁的情况,我们注意下0x845b8be0这个锁我们看下更详细的信息
0: kd> !locks -v 0x845b8be0
Resource @ nt!PiEngineLock (0x845b8be0) Exclusively owned
Contention Count = 12
NumberOfExclusiveWaiters = 1
Threads: 86e1aa70-01<*>
THREAD 86e1aa70 Cid 0004.002c Teb: 00000000 Win32Thread: 00000000 WAIT: (DelayExecution) KernelMode Non-Alertable
8920e3ec NotificationEvent
IRP List:
86ff5a68: (0006,0100) Flags: 00000000 Mdl: 00000000
Not impersonating
DeviceMap 8cc08898
Owning Process 86d837e0 Image: System
Attached Process N/A Image: N/A
Wait Start TickCount 51383 Ticks: 1 (0:00:00:00.015)
Context Switch Count 20312 IdealProcessor: 0 NoStackSwap
UserTime 00:00:00.000
KernelTime 00:00:00.109
Win32 Start Address nt!ExpWorkerThread (0x844cf03e)
Stack Init 8f579fd0 Current 8f579770 Base 8f57a000 Limit 8f577000 Call 0
Priority 14 BasePriority 12 UnusualBoost 2 ForegroundBoost 0 IoPriority 2 PagePriority 5
ChildEBP RetAddr
8f579788 844cfd3d nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
8f5797c0 844ceb9b nt!KiSwapThread+0x266
8f5797e8 844ce879 nt!KiCommitThreadWait+0x1df
8f579844 9407fa7f nt!KeDelayExecutionThread+0x2aa
WARNING: Stack unwind information not available. Following frames may be wrong.
8f579868 8c0cdd91 XXXX+0x1a7f
8f5798ac 8c0b9556 ndis!ndisPauseFilter+0x196 (FPO: [Non-Fpo])
8f5798c8 8c130be4 ndis!ndisPauseFilterStack+0x57 (FPO: [Non-Fpo])
8f57995c 8c1356ab ndis!ndisCloseMiniportBindings+0xc4 (FPO: [Non-Fpo])
8f579a8c 8c0e5545 ndis!ndisPnPRemoveDevice+0x23c (FPO: [Non-Fpo])
8f579aac 8c129f7c ndis!ndisPnPRemoveDeviceEx+0x87 (FPO: [Non-Fpo])
8f579af0 84488c1e ndis!ndisPnPDispatch+0x888 (FPO: [Non-Fpo])
8f579b08 8462b085 nt!IofCallDriver+0x63
8f579b3c 84717de8 nt!IopSynchronousCall+0xba
8f579b98 8470fc9e nt!IopRemoveDevice+0xd5
8f579bc0 8470fb27 nt!PnpSurpriseRemoveLockedDeviceNode+0x101
8f579bd0 8470fde7 nt!PnpDeleteLockedDeviceNode+0x21
8f579c04 847133fd nt!PnpDeleteLockedDeviceNodes+0x4c
8f579cc4 84603390 nt!PnpProcessQueryRemoveAndEject+0x586
8f579cdc 84611492 nt!PnpProcessTargetDeviceEvent+0x38
8f579d00 844cf14b nt!PnpDeviceEventWorker+0x216
8f579d50 8465b13d nt!ExpWorkerThread+0x10d
8f579d90 84502559 nt!PspSystemThreadStartup+0x9e
00000000 00000000 nt!KiThreadStartup+0x19
Threads Waiting On Exclusive Access:
89024a58
1 total locks, 1 locks currently held
从上面的栈来看,系统正在移除一个和网络相关的设备,在移除的时候系统先加了个锁然后以同步的方式去移除和解绑该设备,到最后就调用KeDelayExecutionThread这个函数来等待这个操作完成。我们也可以看到这个线程此时的状态也是在等待执行。我们注意到最下面有这么两行信息
Threads Waiting On Exclusive Access:
89024a58
这表明了此时此刻线程 89024a58也想去拥有当前这个锁,但是由于上面的线程同步没有完成,迟迟不释放该锁。我们看下这个等待这个锁的线程的栈回溯是什么
0: kd> !thread 89024a58
THREAD 89024a58 Cid 0220.0224 Teb: 7ffdf000 Win32Thread: ffb5c540 WAIT: (WrResource) KernelMode Non-Alertable
87ac0838 SynchronizationEvent
Not impersonating
DeviceMap 8cc08898
Owning Process 89024d40 Image: wininit.exe
Attached Process N/A Image: N/A
Wait Start TickCount 51261 Ticks: 123 (0:00:00:01.918)
Context Switch Count 570 IdealProcessor: 0
UserTime 00:00:00.000
KernelTime 00:00:00.078
Win32 Start Address 0x00954dc8
Stack Init 9107afd0 Current 9107a878 Base 9107b000 Limit 91078000 Call 0
Priority 15 BasePriority 15 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
ChildEBP RetAddr Args to Child
9107a890 844cfd3d 89024a58 00000000 8457cd20 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
9107a8c8 844ceb9b 89024b18 89024a58 87ac0838 nt!KiSwapThread+0x266
9107a8f0 844c858f 89024a58 89024b18 0000003e nt!KiCommitThreadWait+0x1df
9107a96c 8448717e 87ac0838 0000001b 00000000 nt!KeWaitForSingleObject+0x393
9107a9c4 844d2b35 87ac0838 9107aa88 00000000 nt!ExpWaitForResource+0x16f
9107aa10 8462f1c1 845b8be0 00000001 9107aa88 nt!ExAcquireResourceExclusiveLite+0x1cf
9107aa34 84466a63 00000001 9107aa88 86fd61d8 nt!PpDevNodeLockTree+0xd9
9107aa50 84773385 9107aa88 00000000 00000000 nt!PnpLockDeviceActionQueue+0x13
9107aa74 84773332 86fd61d8 000001a8 86fd61b8 nt!IoBuildPoDeviceNotifyList+0x1c
9107aae4 8451c46b 86fd61d8 9107ab78 00000006 nt!PopBuildDeviceNotifyList+0x6e
9107ab10 84774fc7 9107abdc 9107ac60 83cf1dbc nt!PoInitializeBroadcast+0x89
9107abc8 8448f8c6 00000006 00000004 c0000004 nt!NtSetSystemPowerState+0x17d
9107abc8 8448e97d 00000006 00000004 c0000004 nt!KiSystemServicePostCall (FPO: [0,3] TrapFrame @ 9107abdc)
9107ac4c 84774f2c 00000006 00000004 c0000004 nt!ZwSetSystemPowerState+0x11 (FPO: [3,0,0])
9107ad14 84755c4b 00000006 00000004 c0000004 nt!NtSetSystemPowerState+0xe2
9107ad28 8448f8c6 00000002 000bfe90 77c270f4 nt!NtShutdownSystem+0x32
9107ad28 77c270f4 00000002 000bfe90 77c270f4 nt!KiSystemServicePostCall (FPO: [0,3] TrapFrame @ 9107ad34)
WARNING: Frame IP not in any known module. Following frames may be wrong.
000bfe90 00000000 00000000 00000000 00000000 0x77c270f4
我们看到这个线程正在执行一个关机的动作,我们注意到NtShutdownSystem这个函数的第一个参数为2表明是POWEROFF的操作
ExAcquireResourceExclusiveLite这个函数的第一个参数为845b8be0 刚好是在等待那个被占用的资源就是说wininit.exe进程在关机的操作下走到关闭电源这里,然后系统也要请求那个锁,那个锁刚好被一直占用中没有释放,这里拿不到那个锁,在等待了0x258秒(10分钟之后)系统认为出了问题,然后就自已触发了蓝屏。
好了,到这里我们就明白了整个蓝屏的流程:用户在开着AP的时候,直接关机,之后就蓝屏。蓝屏是因为那个AP还没有卸载,然后在用户关机的时候,系统去回收这个资源,由于这个AP是vwifimp.sys创建出来的,系统在关机的时候它会去把这个设备先移除掉,所以会触发到PNP管理器,PNP管理器就会去处理这个移除的请求。然后这个请求会请求一个资源锁来保证同步卸载这个虚拟出来的AP。但是在卸载的时候不知道什么原因vwifimp.sys发现这个移除的IRP请求没有回应一直卡在那里,同时系统要响应关机这个动作,关机的时候进程wininit.exe它会去对所有的PNP设备广播,表明系统要准备关机了,之后申请这个资源锁,刚好这个锁被vwifimp.sys这个驱动占有,于是用户的界面就是显示一直在关机,在等了10分钟之后,系统关机的线程还是没有得到这个锁的资源,于是就自己触发蓝屏表明系统出异常了。
到这里有必要解释下nt!PiEngineLock 这个锁有什么用,这个锁是一个公共的锁,专门服务于PNP相关和电源相关的用于同步的操作。正因为这一特性所以用户那边才出现蓝屏。
关于蓝屏,有人些不了解了的会以为蓝屏会导致硬件损坏啥的,我只想说这都是扯蛋。蓝屏是系统运行良好的一种表现,能蓝屏说明系统对这个错误还是有警觉的,说明系统能够察觉到这种错误对系统的影响,如果系统不蓝屏了那说明你的系统真的是不正常了。