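Symptom: the mouse pointer still moves, but nothing else responds.
The main causes of a hang at the driver level are:
1. Various kernel-mode deadlocks.
2. An ISR or DPC that is entered and never returns. That does not happen in this program.
3. A high-priority thread starving the raw input thread of the windowing system driver. That does not happen in this program either.
4. Others. If you know of any, please tell me.
Preparation:
First, configure the system to write a kernel memory dump. For the exact settings, see: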
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542953(v=vs.85).aspx
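A few extra words:
Be sure to configure a large paging file on the system drive (another drive also works, but needs additional configuration) so that the kernel dump can be stored.
It must be a kernel memory dump. A small memory dump does not contain the information needed to analyze a hang.
Generating the crash dump file.
In this situation, a crash dump can be generated in the following ways:
1. Set up two-machine kernel debugging in advance, or attach the debugger once the hang has occurred. Either way, run the .crash command.
2. Use the forced-crash feature of the keyboard driver. For details, see:
http://msdn.microsoft.com/en-us/library/windows/hardware/ff545499(v=vs.85).aspx or
http://msdn.microsoft.com/zh-cn/library/ff545499.aspx
3. A less common method: use a non-maskable interrupt (NMI). This also requires hardware support and a registry change.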
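The registry side of the settings above can be written down as a small sketch. The key paths and value names (CrashDumpEnabled = 2 for a kernel dump, CrashOnCtrlScroll for the keyboard-initiated crash, NMICrashDump for the NMI method on older systems) are the ones I understand Microsoft to document; the helper names REG_SETTINGS and build_reg_file are made up for this example, and the kbdhid entry only matters for USB keyboards on systems whose kbdhid driver supports the feature.

```python
# Sketch: emit a .reg file with the values used to force a dump from a hang.
# Assumption: value names follow Microsoft's documentation; verify them for
# your Windows version before importing the file.
CRASH_CONTROL = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl"
I8042PRT = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters"
KBDHID = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kbdhid\Parameters"

REG_SETTINGS = {
    # 2 = kernel memory dump; NMICrashDump = 1 makes an NMI bugcheck the system
    CRASH_CONTROL: {"CrashDumpEnabled": 2, "NMICrashDump": 1},
    # PS/2 keyboards
    I8042PRT: {"CrashOnCtrlScroll": 1},
    # USB keyboards (only where the kbdhid driver supports it)
    KBDHID: {"CrashOnCtrlScroll": 1},
}


def build_reg_file(settings):
    """Return the text of a .reg file that sets every DWORD in `settings`."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, values in settings.items():
        lines.append(f"[{key}]")
        for name, dword in values.items():
            lines.append(f'"{name}"=dword:{dword:08x}')
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file(REG_SETTINGS))
```

After importing such a file and rebooting, the keyboard method is triggered by holding the right Ctrl key and pressing Scroll Lock twice.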
Starting the analysis:
1. With WinDbg active, press Ctrl+Break to break into the target. Skip this step when analyzing a dump file.
2. Enter !locks. The output looks like this:
0: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks.......
Resource @ Ntfs!NtfsData (0xf7b6d5b0) Shared 1 owning threads
Threads: 8b1a68d0-01<*>
KD: Scanning for held locks.........................................
Resource @ 0x879bdc2c Shared 1 owning threads
Contention Count = 7856
NumberOfSharedWaiters = 18
NumberOfExclusiveWaiters = 1
Threads: 8acf2020-01 8ace13f0-01 89d4c440-01<*> 8aa60a98-01
8b1a63f0-01 88e32020-01 8ab04020-01 8866ea60-01
8866e020-01 8a194b78-01 89d50508-01 8ac96b08-01
8866f958-01 87e5f300-01 8aa796f8-01 88672d08-01
8866e7f0-01 88654020-01 8ac76020-01
Threads Waiting On Exclusive Access:
8b1a6b40
KD: Scanning for held locks.
Resource @ 0x8ac04378 Exclusively owned
Contention Count = 91
NumberOfSharedWaiters = 2
Threads: 89d4c440-01<*> 8b1a6db0-01 8b1a7738-01
KD: Scanning for held locks.
Resource @ 0x8abfcf58 Shared 1 owning threads
Contention Count = 7
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 1
Threads: 8b1a63f0-01<*> 8b1a68d0-01
Threads Waiting On Exclusive Access:
89d4c440
KD: Scanning for held locks.........................
Resource @ 0x894ac888 Exclusively owned
Threads: 89d4c440-01<*>
Resource @ 0x894acda8 Exclusively owned
Contention Count = 2
NumberOfSharedWaiters = 1
Threads: 89d4c440-01<*> 8b1a79a8-01
KD: Scanning for held locks.....
Resource @ 0x883d0bc0 Shared 1 owning threads
Contention Count = 24547
Threads: 89d4c440-03<*>
KD: Scanning for held locks....................
Resource @ 0x889e2ae0 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x894af688 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x8ae07eb8 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x8acc6598 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x8ac969c8 Exclusively owned
Threads: 8b1a63f0-01<*>
KD: Scanning for held locks.
Resource @ 0x8ace8e08 Exclusively owned
Threads: 8b1a63f0-01<*>
KD: Scanning for held locks.
Resource @ 0x892138e8 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x8aa601e8 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x88efc580 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x879bbdf0 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x884c1618 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x8acc5758 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x89365318 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x8a18d3b0 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x879c5318 Exclusively owned
Threads: 8b1a63f0-01<*>
Resource @ 0x884c5ad8 Exclusively owned
Threads: 8b1a63f0-01<*>
KD: Scanning for held locks.......
3329 total locks, 23 locks currently held
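This output is the main thing to analyze, and it deserves a careful reading. For each resource, note which thread owns it and which threads are waiting for it, then look for threads that wait on each other. I assume everyone knows how to do this and can work out which threads are deadlocked.
Notes:
1. A thread marked with <*> owns the resource; the other threads listed are waiting for it.
2. Driver Verifier cannot detect deadlocks on executive resources.
3. This hang looks like a deadlock on executive resources, but the resources do not belong to our driver. They are internal to the file system.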
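Reading the owner/waiter relationships by eye is error-prone with this many entries, so here is a minimal sketch that does it mechanically: it parses !locks text, builds a "waiter is blocked behind owner" graph, and looks for a cycle. The function names are my own, the parser assumes the 32-bit layout shown above (8-digit thread addresses, <*> on owners), and SAMPLE is an abridged excerpt of the output in this post, not the full listing.

```python
import re

# Thread tokens: "89d4c440-01<*>" (owner), "8b1a63f0-01" or "8b1a6b40" (waiter).
THREAD_RE = re.compile(r"\b([0-9a-fA-F]{8})(?:-[0-9a-fA-F]+)?(<\*>)?")


def parse_locks(text):
    """Map resource address -> {'owners': set, 'waiters': set}."""
    resources, current, mode = {}, None, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Resource @"):
            addr = re.search(r"0x[0-9a-fA-F]+", line).group(0)
            current = resources.setdefault(addr, {"owners": set(), "waiters": set()})
            mode = None
        elif line.startswith("Threads Waiting On Exclusive Access:"):
            mode, line = "exclusive", line.split(":", 1)[1]
        elif line.startswith("Threads:"):
            mode, line = "threads", line.split(":", 1)[1]
        elif not line or line.startswith("KD:") or "=" in line or "total locks" in line:
            mode = None
        if current is None or mode is None:
            continue
        # Continuation lines keep the mode of the preceding "Threads" header.
        for thread, star in THREAD_RE.findall(line):
            owned = mode == "threads" and bool(star)
            current["owners" if owned else "waiters"].add(thread.lower())
    return resources


def wait_for_graph(resources):
    """Edges: waiting thread -> set of threads that own what it waits for."""
    edges = {}
    for info in resources.values():
        for waiter in info["waiters"]:
            blockers = info["owners"] - {waiter}
            if blockers:
                edges.setdefault(waiter, set()).update(blockers)
    return edges


def find_cycle(edges):
    """Return one cycle of mutually waiting threads, or None."""
    path, done = [], set()

    def dfs(node):
        if node in path:
            return path[path.index(node):]
        if node in done:
            return None
        path.append(node)
        for nxt in sorted(edges.get(node, ())):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in sorted(edges):
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


SAMPLE = """\
Resource @ 0x879bdc2c Shared 1 owning threads
Contention Count = 7856
Threads: 8acf2020-01 8ace13f0-01 89d4c440-01<*> 8aa60a98-01
8b1a63f0-01 88e32020-01 8ab04020-01 8866ea60-01
Threads Waiting On Exclusive Access:
8b1a6b40
Resource @ 0x8abfcf58 Shared 1 owning threads
Contention Count = 7
Threads: 8b1a63f0-01<*> 8b1a68d0-01
Threads Waiting On Exclusive Access:
89d4c440
"""

if __name__ == "__main__":
    print(find_cycle(wait_for_graph(parse_locks(SAMPLE))))
```

A cycle in this graph is only a candidate. A thread can appear in the waiter list of a shared-owned resource because an exclusive waiter is queued ahead of it, so the suspects still need to be confirmed with !thread.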
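After analysis, the following threads look suspicious: 8b1a63f0 and 89d4c440. Thread 8b1a63f0 owns resource 0x8abfcf58, which 89d4c440 is waiting to acquire exclusively, while 89d4c440 owns resource 0x879bdc2c, on which 8b1a63f0 is waiting. So take a look at both: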
0: kd> !thread 8b1a63f0
THREAD 8b1a63f0 Cid 0004.0030 Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
8b2e80b0 Semaphore Limit 0x7fffffff
8b1a6468 NotificationTimer
Not impersonating
DeviceMap e10018f0
Owning Process 8b1a89e8 Image: System
Attached Process N/A Image: N/A
Wait Start TickCount 18959 Ticks: 186 (0:00:00:02.906)
Context Switch Count 5288
UserTime 00:00:00.000
KernelTime 00:00:01.343
Start Address nt!ExpWorkerThread (0x80880356)
Stack Init f78d7000 Current f78d6430 Base f78d7000 Limit f78d4000 Call 0
Priority 14 BasePriority 13 PriorityDecrement 1
ChildEBP RetAddr Args to Child
f78d6448 80833465 8b1a63f0 8b1a6498 00000001 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
f78d6474 80829a62 8b1a63f0 879bdc2c 00000000 nt!KiSwapThread+0x2e5 (FPO: [0,7,0])
f78d64bc 8087cbed 8b2e80b0 0000001b 00000000 nt!KeWaitForSingleObject+0x346 (FPO: [5,13,4])
f78d64f8 8087d02f f78d6890 f78d6608 f78d6608 nt!ExpWaitForResource+0xd5 (FPO: [0,5,4])
f78d6518 f7b9107d 879bdc2c 00000001 00000006 nt!ExAcquireResourceSharedLite+0xf5 (FPO: [2,3,4])
f78d652c f7b78290 f78d6608 879bd7f8 00000001 Ntfs!NtfsAcquireSharedVcb+0x23 (FPO: [3,0,4])
f78d6590 f7b8aff6 f78d6608 8851d4d0 80a5bf00 Ntfs!NtfsCommonQueryInformation+0xd2 (FPO: [SEH])
f78d65f4 f7b8b02f f78d6608 8851d4d0 00000001 Ntfs!NtfsFsdDispatchSwitch+0x12a (FPO: [SEH])
f78d6710 809b550c 879bd718 8851d4d0 89d79270 Ntfs!NtfsFsdDispatchWait+0x1c (FPO: [2,66,0])
f78d6740 8081df33 f7272c45 f78d6774 f7272c45 nt!IovCallDriver+0x112 (FPO: [1,5,0])
f78d674c f7272c45 89d79270 80a5bf00 ffffffff nt!IofCallDriver+0x13 (FPO: [0,0,0])
f78d6774 809b550c 89d79270 8851d4d0 b5e7bab8 fltMgr!FltpDispatch+0x6f (FPO: [2,6,0])
f78d67a4 8081df33 b5e7472e f78d67d8 b5e7472e nt!IovCallDriver+0x112 (FPO: [1,5,0])
f78d67b0 b5e7472e 8851d4d0 80a5c456 00000000 nt!IofCallDriver+0x13 (FPO: [0,0,0])
f78d67d8 b5e74aef 89d79270 88bbd038 00000006 webshield!FilemonQueryFile+0xce (FPO: [Non-Fpo]) (CONV: stdcall) [d:\webshield\wgsclient\filemon\filemon\filemon.c @ 845]
f78d68d8 b5e7812a 00000000 88bbd038 87895f18 webshield!FilemonGetFullPath+0x23f (FPO: [Non-Fpo]) (CONV: stdcall) [d:\webshield\wgsclient\filemon\filemon\filemon.c @ 939]
f78d6ab0 b5e7964f 87895e60 8aaf8208 8aaf8208 webshield!FilemonHookRoutine+0x1da (FPO: [Non-Fpo]) (CONV: stdcall) [d:\webshield\wgsclient\filemon\filemon\filemon.c @ 2279]
f78d6ac4 809b550c 87895e60 8aaf8208 88bbd038 webshield!FilemonDispatch+0x2f (FPO: [Non-Fpo]) (CONV: stdcall) [d:\webshield\wgsclient\filemon\filemon\filemon.c @ 2736]
f78d6af4 8081df33 8081e67f f78d6b14 8081e67f nt!IovCallDriver+0x112 (FPO: [1,5,0])
f78d6b00 8081e67f 00000000 f78d6b3c 8ac24ac8 nt!IofCallDriver+0x13 (FPO: [0,0,0])
f78d6b14 80836426 88bbd00b f78d6b3c f78d6c04 nt!IoSynchronousPageWrite+0xaf (FPO: [5,0,4])
f78d6c30 8083780b e3ebb430 e3ebb4b0 8ac24ac8 nt!MiFlushSectionInternal+0x6ba (FPO: [6,61,4])
f78d6c74 8080f8de 8ac24a90 f78d6c00 00010000 nt!MmFlushSection+0x211 (FPO: [5,6,0])
f78d6cfc 8080fc57 00010000 00000000 00000001 nt!CcFlushCache+0x3a6 (FPO: [4,24,4])
f78d6d40 808127a2 8b1a63f0 808ae5c0 8b19d190 nt!CcWriteBehind+0x11b (FPO: [0,8,4])
f78d6d80 80880441 8b19d190 00000000 8b1a63f0 nt!CcWorkerThread+0x15a (FPO: [SEH])
f78d6dac 80949b7c 8b19d190 00000000 00000000 nt!ExpWorkerThread+0xeb (FPO: [1,5,0])
f78d6ddc 8088e062 80880356 00000000 00000000 nt!PspSystemThreadStartup+0x2e (FPO: [SEH])
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16
0: kd> !thread 89d4c440
THREAD 89d4c440 Cid 0b9c.0ba0 Teb: 7ffdd000 Win32Thread: e118c610 WAIT: (Unknown) KernelMode Non-Alertable
8acf4428 SynchronizationEvent
89d4c4b8 NotificationTimer
IRP List:
89d5c878: (0006,01fc) Flags: 40000404 Mdl: 00000000
Not impersonating
DeviceMap e32abaf8
Owning Process 89d4c020 Image: cmd.exe
Attached Process N/A Image: N/A
Wait Start TickCount 18959 Ticks: 186 (0:00:00:02.906)
Context Switch Count 178844 LargeStack
UserTime 00:00:00.250
KernelTime 00:00:03.640
Win32 Start Address 0x4ad07670
Start Address 0x7c8217f8
Stack Init b663d000 Current b663c49c Base b663d000 Limit b6639000 Call 0
Priority 14 BasePriority 8 PriorityDecrement 1
ChildEBP RetAddr Args to Child
b663c4b4 80833465 89d4c440 89d4c4e8 00000001 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
b663c4e0 80829a62 89d4c440 8abfcf58 00000000 nt!KiSwapThread+0x2e5 (FPO: [0,7,0])
b663c528 8087cbed 8acf4428 0000001b 00000000 nt!KeWaitForSingleObject+0x346 (FPO: [5,13,4])
b663c564 8087ce07 00000000 e2188008 b663c870 nt!ExpWaitForResource+0xd5 (FPO: [0,5,4])
b663c584 f7b515b4 8abfcf58 b663c801 b663c5b8 nt!ExAcquireResourceExclusiveLite+0x8d (FPO: [2,3,0])
b663c594 f7b8e3b1 b663c870 e2188008 b663c801 Ntfs!NtfsAcquireResourceExclusive+0x20 (FPO: [3,0,0])
b663c5b8 f7b90d9d b663c801 e2188008 e21880d0 Ntfs!NtfsAcquireExclusiveFcb+0x42 (FPO: [4,1,4])
b663c5d4 f7b7bfce b663c870 e21880d0 e31c29a8 Ntfs!NtfsAcquireExclusiveScb+0x17 (FPO: [2,0,4])
b663c658 f7b9e8fa b663c870 00000000 b663c978 Ntfs!NtfsWriteUsnJournalChanges+0x71 (FPO: [SEH])
b663c854 f7b928d9 b663c870 89d5c878 80a5bf00 Ntfs!NtfsCommonCleanup+0x21ff (FPO: [SEH])
b663c9c4 809b550c 879bd718 89d5c878 89d79270 Ntfs!NtfsFsdCleanup+0xcf (FPO: [SEH])
b663c9f4 8081df33 f7272c45 b663ca28 f7272c45 nt!IovCallDriver+0x112 (FPO: [1,5,0])
b663ca00 f7272c45 89d79270 80a5bf00 ffffffff nt!IofCallDriver+0x13 (FPO: [0,0,0])
b663ca28 809b550c 89d79270 89d5c878 89d5ca50 fltMgr!FltpDispatch+0x6f (FPO: [2,6,0])
b663ca58 8081df33 b5e792e0 b663cc2c b5e792e0 nt!IovCallDriver+0x112 (FPO: [1,5,0])
b663ca64 b5e792e0 80a5bf00 87895e60 ffffffff nt!IofCallDriver+0x13 (FPO: [0,0,0])
b663cc2c b5e7964f 87895e60 89d5c878 89d5c878 webshield!FilemonHookRoutine+0x1390 (FPO: [Non-Fpo]) (CONV: stdcall) [d:\webshield\wgsclient\filemon\filemon\filemon.c @ 2663]
b663cc40 809b550c 87895e60 89d5c878 88b07318 webshield!FilemonDispatch+0x2f (FPO: [Non-Fpo]) (CONV: stdcall) [d:\webshield\wgsclient\filemon\filemon\filemon.c @ 2736]
b663cc70 8081df33 808f9732 b663ccac 808f9732 nt!IovCallDriver+0x112 (FPO: [1,5,0])
b663cc7c 808f9732 88b07300 8b18f730 88b07318 nt!IofCallDriver+0x13 (FPO: [0,0,0])
b663ccac 80934bac 89d4c020 87895e60 00120196 nt!IopCloseFile+0x2ae (FPO: [5,7,0])
b663ccdc 809344ad 89d4c020 00000001 8b18f730 nt!ObpDecrementHandleCount+0xcc (FPO: [4,2,4])
b663cd04 80934546 e32a9a58 88b07318 00000018 nt!ObpCloseHandleTableEntry+0x131 (FPO: [5,1,0])
b663cd48 80934663 00000018 00000001 b663cd64 nt!ObpCloseHandle+0x82 (FPO: [2,7,4])
b663cd58 8088978c 00000018 0012fafc 7c9585ec nt!NtClose+0x1b (FPO: [1,0,0])
b663cd58 7c9585ec 00000018 0012fafc 7c9585ec nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ b663cd64)
WARNING: Stack unwind information not available. Following frames may be wrong.
0012fafc 00000000 00000000 00000000 00000000 ntdll+0x285ec
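The stacks above already give some clues: both threads are calling IofCallDriver.
I will not go into the specific cause here; read the source code for that. One hint: inside the routine that handles an IRP, a new IRP is created and sent down.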
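To make the "new IRP sent down from inside IRP handling" pattern easier to spot, a stack can be collapsed into the chain of modules it passes through. The sketch below is only an illustration: the function names are mine, and SAMPLE is the symbol column of a few frames from thread 8b1a63f0 above (innermost frame first, as kd prints them), abridged by hand.

```python
import re
from itertools import groupby

# Matches "module!function" in a kd stack line, e.g. "nt!IofCallDriver+0x13".
FRAME_RE = re.compile(r"(\w+)!(\w+)")


def parse_frames(stack_text):
    """Return [(module, function), ...] in the order kd prints them."""
    return [m.groups() for m in FRAME_RE.finditer(stack_text)]


def module_chain(frames):
    """Collapse consecutive frames of the same module into one entry."""
    return [module for module, _ in groupby(module for module, _ in frames)]


def sends_irp_down(frames, module):
    """Functions of `module` whose immediate callee is nt!IofCallDriver."""
    return [
        func
        for callee, (mod, func) in zip(frames, frames[1:])
        if mod == module and callee == ("nt", "IofCallDriver")
    ]


SAMPLE = """\
nt!ExpWaitForResource+0xd5
nt!ExAcquireResourceSharedLite+0xf5
Ntfs!NtfsAcquireSharedVcb+0x23
Ntfs!NtfsCommonQueryInformation+0xd2
nt!IofCallDriver+0x13
fltMgr!FltpDispatch+0x6f
nt!IofCallDriver+0x13
webshield!FilemonQueryFile+0xce
webshield!FilemonHookRoutine+0x1da
nt!IofCallDriver+0x13
nt!IoSynchronousPageWrite+0xaf
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    print(module_chain(frames))
    print(sends_irp_down(frames, "webshield"))
```

On this excerpt the chain reads nt, Ntfs, nt, fltMgr, nt, webshield, nt: a paging write enters webshield, and webshield!FilemonQueryFile then issues a second request that ends up waiting inside Ntfs.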
To dig deeper, continue as follows.
Thread 89d4c440 is processing an IRP, so take a look at it:
0: kd> !irp 89d5c878
Irp is active with 11 stacks 9 is current (= 0x89d5ca08)
No Mdl: No System Buffer: Thread 89d4c440: Irp stack trace.
cmd *** cl Device File Completion-Context
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
[ 0, 0] 0 10 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
>[ 12, 0] 0 e0 879bd718 88b07318 809c7ef4-89d5ca2c Success Error Cancel
\FileSystem\Ntfs nt!IovpInternalCompletionTrap
Args: 00000000 00000000 00000000 00000000
[ 12, 0] 0 0 89d79270 88b07318 b5e77a40-00000000
\FileSystem\FltMgr webshield
Args: 00000000 00000000 00000000 00000000
[ 12, 0] 0 0 87895e60 88b07318 00000000-00000000
\Driver\webshield
Args: 00000000 00000000 00000000 00000000
Here the drivers involved become visible, including an unfamiliar and suspicious one.
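The first number in each bracket of the !irp output is the major function code, which as far as I know is printed in hexadecimal; 0x12 is IRP_MJ_CLEANUP, which agrees with Ntfs!NtfsFsdCleanup in the stack of thread 89d4c440. A small sketch for decoding those lines follows. The IRP_MJ table is a subset of the values in wdm.h, the function name stack_locations is my own, and SAMPLE reproduces four stack-location lines from the output above without the driver-name lines.

```python
import re

# Subset of the IRP major function codes defined in wdm.h.
IRP_MJ = {
    0x00: "IRP_MJ_CREATE",
    0x02: "IRP_MJ_CLOSE",
    0x03: "IRP_MJ_READ",
    0x04: "IRP_MJ_WRITE",
    0x05: "IRP_MJ_QUERY_INFORMATION",
    0x06: "IRP_MJ_SET_INFORMATION",
    0x0E: "IRP_MJ_DEVICE_CONTROL",
    0x12: "IRP_MJ_CLEANUP",
}

# ">[ 12, 0] 0 e0 879bd718 88b07318 ..." -> marker, major, minor, device, file
LOCATION_RE = re.compile(
    r"^\s*(>?)\[\s*([0-9a-fA-F]+),\s*([0-9a-fA-F]+)\]"
    r"\s+\S+\s+\S+\s+([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})"
)


def stack_locations(irp_text):
    """Return the used stack locations of a 32-bit !irp listing."""
    found = []
    for line in irp_text.splitlines():
        match = LOCATION_RE.match(line)
        if not match or int(match.group(4), 16) == 0:
            continue  # not a location line, or an unused (all-zero) location
        major = int(match.group(2), 16)
        found.append({
            "current": match.group(1) == ">",
            "major": IRP_MJ.get(major, hex(major)),
            "device": match.group(4),
            "file": match.group(5),
        })
    return found


SAMPLE = """\
[ 0, 0] 0 10 00000000 00000000 00000000-00000000
>[ 12, 0] 0 e0 879bd718 88b07318 809c7ef4-89d5ca2c Success Error Cancel
[ 12, 0] 0 0 89d79270 88b07318 b5e77a40-00000000
[ 12, 0] 0 0 87895e60 88b07318 00000000-00000000
"""

if __name__ == "__main__":
    for location in stack_locations(SAMPLE):
        print(location)
```

The device addresses it returns can be passed to !devobj and the file address to !fileobj.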
To inspect the device in question, use the following command:
0: kd> !devobj 879bd718
Device object (879bd718) is for:
\FileSystem\Ntfs DriverObject 8b1c03e0
Current Irp 00000000 RefCount 0 Type 00000008 Flags 00000000
DevExt 879bd7d0 DevObjExt 879bdfd0
ExtensionFlags (0xc0000000) DOE_BOTTOM_OF_FDO_STACK, DOE_DESIGNATED_FDO
AttachedDevice (Upper) 89d79270 \FileSystem\FltMgr
Device queue is not busy.
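To see in more detail which operation was in progress, look at the File column of the IRP stack (88b07318). It is a file object, not an IRP, so !irp rejects it and dt is the tool to use: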
0: kd> !irp 88b07318
IRP signature does not match, probably not an IRP
0: kd> dt nt!_file_object 88b07318
+0x000 Type : 5
+0x002 Size : 112
+0x004 DeviceObject : 0x8b1c7630 _DEVICE_OBJECT
+0x008 Vpb : 0x8b1c75a8 _VPB
+0x00c FsContext : 0xe31c2a70
+0x010 FsContext2 : 0xe31c2bb8
+0x014 SectionObjectPointer : 0x8aa60fec _SECTION_OBJECT_POINTERS
+0x018 PrivateCacheMap : (null)
+0x01c FinalStatus : 0
+0x020 RelatedFileObject : 0x896050f8 _FILE_OBJECT
+0x024 LockOperation : 0 ''
+0x025 DeletePending : 0 ''
+0x026 ReadAccess : 0 ''
+0x027 WriteAccess : 0x1 ''
+0x028 DeleteAccess : 0 ''
+0x029 SharedRead : 0x1 ''
+0x02a SharedWrite : 0 ''
+0x02b SharedDelete : 0 ''
+0x02c Flags : 0x44042
+0x030 FileName : _UNICODE_STRING "\www\addtest.txt"
+0x038 CurrentByteOffset : _LARGE_INTEGER 0x7
+0x040 Waiters : 0
+0x044 Busy : 1
+0x048 LastLock : (null)
+0x04c Lock : _KEVENT
+0x05c Event : _KEVENT
+0x06c CompletionContext : (null)
The following command works as well:
0: kd> !fileobj 88b07318
\www\addtest.txt
Related File Object: 0x896050f8
Device Object: 0x8b1c7630 \Driver\Ftdisk
Vpb: 0x8b1c75a8
Access: Write SharedRead
Flags: 0x44042
Synchronous IO
Cache Supported
Cleanup Complete
Handle Created
File Object is currently busy and has 0 waiters.
FsContext: 0xe31c2a70 FsContext2: 0xe31c2bb8
CurrentByteOffset: 7
Cache Data:
Section Object Pointers: 8aa60fec
Shared Cache Map: 88e32810 File Offset: 7 in VACB number 0
Vacb: 8b192df0
Your data is at: c3c00007
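The Flags value 0x44042 printed by dt and the four flag names printed by !fileobj are the same information. A short sketch that decodes the bitmask makes the correspondence explicit. The table is a subset of the FO_* constants from wdm.h, and decode_fo_flags is a name made up for this example.

```python
# Subset of the FILE_OBJECT flag bits defined in wdm.h.
FO_FLAGS = {
    0x00000001: "FO_FILE_OPEN",
    0x00000002: "FO_SYNCHRONOUS_IO",
    0x00000004: "FO_ALERTABLE_IO",
    0x00000008: "FO_NO_INTERMEDIATE_BUFFERING",
    0x00000010: "FO_WRITE_THROUGH",
    0x00000020: "FO_SEQUENTIAL_ONLY",
    0x00000040: "FO_CACHE_SUPPORTED",
    0x00001000: "FO_FILE_MODIFIED",
    0x00002000: "FO_FILE_SIZE_CHANGED",
    0x00004000: "FO_CLEANUP_COMPLETE",
    0x00008000: "FO_TEMPORARY_FILE",
    0x00010000: "FO_DELETE_ON_CLOSE",
    0x00040000: "FO_HANDLE_CREATED",
}


def decode_fo_flags(value):
    """Return the names of the set bits; unknown bits are reported in hex."""
    names = [name for bit, name in sorted(FO_FLAGS.items()) if value & bit]
    known = 0
    for bit in FO_FLAGS:
        known |= bit
    leftover = value & ~known
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    # Flags value taken from the dt nt!_file_object output above.
    for name in decode_fo_flags(0x44042):
        print(name)
```

For 0x44042 this yields FO_SYNCHRONOUS_IO, FO_CACHE_SUPPORTED, FO_CLEANUP_COMPLETE and FO_HANDLE_CREATED, matching the "Synchronous IO", "Cache Supported", "Cleanup Complete" and "Handle Created" lines of !fileobj.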
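That is as far as my analysis skills go.
This should pin down where the problem is. How to fix it is another matter: that still takes reading the code and sorting out the logic.
Solving problems takes boldness; do not be intimidated by authority.
Thanks to:
1. Master Zhang Yinkui (张银奎) for his help and guidance. His forum, 高端调试, is very good and very professional, though not as popular or as comprehensive as 看雪 (Kanxue).
2. The administrators and others in the QQ group 《windbg调试,dump分析》 for their answers. There are many experts there.
3. My company, for giving me so much time to solve this problem.
This is my last Kanxue post of 2012.
The purpose of this post is to get a discussion started.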
made by correy
QQ:112426112
Email:kouleguan at hotmail dot com
Website:http://correy.webs.com
This is a provisional summary; corrections for any shortcomings are welcome.