首页
社区
课程
招聘
MemoryModule x86x64 内存加载DLL
2022-3-29 18:31 31503

MemoryModule x86x64 内存加载DLL

2022-3-29 18:31
31503

MemoryModule x86x64 内存加载DLL

by devseed,此篇教程同时发在论坛和我的博客上,完整源码见我的github

0x0 前言

距离上次我写的SimpleDpack教程(为了理解加壳原理自己写的轻量级压缩壳),已经有一年多了。前几天修复x64IAT载入序号问题,于是想着可以用这个原理自己重写一个解析PE内存加载DLL的功能。虽然已有类似实现功能,如MemoryModule,但是这个项目还是有点复杂,模块间耦合度有些高,没法作为shellcode附加到EXE来引导DLL

 

因此我决定重写一个低耦合,同时支持x86x64,可以作为shellcode引导”附加到EXE区段上的DLL”的框架

0x1 MemoryModule 设计思路

(1) usage

本框架采取与windows加载DLL相似的名称来设计api,可以实现:

  • 将DLL按照内存对齐模式载入到内存中
  • 动态重定向DLL中的reloc地址,并载入IAT
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
const char *dllpath = "test.dll";
size_t mempesize = 0;
void *memdll = NULL;
 
// load the pe file in memory and align it to memory align
void *mempe = winpe_memload_file(dllpath, &mempesize, TRUE);
 
// memory loadlibrary
memdll = winpe_memLoadLibrary(mempe);
winpe_memFreeLibrary(memdll);
 
// memory loadlibrary at specific address
size_t targetaddr = sizeof(size_t) > 4 ? 0x140030000: 0x90000;
memdll = winpe_memLoadLibraryEx(mempe, targetaddr,
    WINPE_LDFLAG_MEMALLOC, (PFN_LoadLibraryA)winpe_findloadlibrarya(),
    (PFN_GetProcAddress)winpe_memGetProcAddress);
winpe_memFreeLibrary(memdll);
free(mempe);
  • 将DLL注入到EXE的附加区段,并且用shellcode引导该DLL
1
win_injectmemdll.exe exepath dllpath [outpath]

(2) libwinpe.c

参考stbXXX_IMPLEMENTATION宏控制代码实现,使得include单个头文件就能把模块功能全部引入的做法,有着单头文件结成非常容易实用的特点。于是我就用了一周多时间,自己写了个类似的低耦合单头文件winpe.h,用于解析与加载PE。为了便于shellcode引导,系统函数大多用函数指针传入参数,关键函数全部声明为inline。另外,我还用内联汇编写了从ldr中获取kernel32的基地址,从而减少对IAT硬编码的依赖。

 

libwinpe.c, 与libwinpe.def(引入def是因为stdcall编译后会被修改名称),把winpe.h的声明的导出函数编译为DLL,如下:

1
2
3
#define WINPE_SHARED
#define WINPE_IMPLEMENTATION
#include "winpe.h"
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
EXPORTS
    winpe_appendsecth
    winpe_findgetprocaddress
    winpe_findkernel32
    winpe_findloadlibrarya
    winpe_findspace
    winpe_findmodulea
    winpe_imagebaseval
    winpe_imagesizeval
    winpe_memFreeLibrary
    winpe_memFreeLibraryEx
    winpe_memGetProcAddress
    winpe_memLoadLibrary
    winpe_memLoadLibraryEx
    winpe_membindiat
    winpe_membindtls
    winpe_memfindexp
    winpe_memfindiat
    winpe_memforwardexp
    winpe_memload
    winpe_memload_file
    winpe_memreloc
    winpe_noaslr
    winpe_oepval
    winpe_overlayload_file
    winpe_overlayoffset

(3) win_injectmemdll.c

对于编写shellcode, 这里为了方便调试,采取用pyhtonkeystone进行jit生成的机器码写入win_injectmemdll.c的全局变量里的做法, 如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
def inject_shellcodestubs(srcpath, libwinpepath, targetpath):
    pedll = lief.parse(libwinpepath)
    pedll_oph = pedll.optional_header
 
    # generate oepint shellcode
    if pedll_oph.magic == lief.PE.PE_TYPE.PE32_PLUS:
        oepinit_code = gen_oepinit_code64()
        pass
    elif pedll_oph.magic == lief.PE.PE_TYPE.PE32:
        oepinit_code = gen_oepinit_code32()
        pass
    else:
        print("error invalid pe magic!", pedll_oph.magic)
        return
 
    # find necessary functions
    FUNC_SIZE =0x400
    codes = {"winpe_memreloc": 0,
        "winpe_membindiat": 0,
        "winpe_membindtls": 0,
        "winpe_findloadlibrarya": 0,
        "winpe_findgetprocaddress": 0}
    for k in codes.keys():
        func = next(filter(lambda e : e.name == k,
            pedll.exported_functions))
        codes[k] = pedll.get_content_from_virtual_address(
            func.address, FUNC_SIZE)
    codes['winpe_oepinit'] = oepinit_code
 
    # write shellcode to c source file
    with open(srcpath, "rb") as fp:
        srctext = fp.read().decode('utf8')
    for k, v in codes.items():
        k = k.replace("winpe_", "")
        _codetext = ",".join([hex(x) for x in v])
        srctext = re.sub("g_" + k + r"_code(.+?)(\{0x90\})",
            "g_" + +  r"_code\1{" + _codetext +"}", srctext)
    with open(targetpath, "wb") as fp:
        fp.write(srctext.encode('utf8'))

详见win_injectmemdll_shellcodestub.py

0x2 内存加载DLL

(1) windows dll加载机制

参考https://bbs.pediy.com/thread-252260.htm, windows对于加载DLL调用如下:

a) 当调用LoadLibrary时首先打开文件并将按照PE文件格式解析数据,并以IMAGE方式将PE各段内容

逐块映射到内存中,如果需要并修复重定位:

kernel32!LoadlibraryA

->kernel32!LoadLibraryExA

​ ->kernel32!LoadLibraryExW

​ ->ntdll!LdrLoadDll

​ ->ntdll!LdrpLoadDll

​ ->ntdll!LdrpMapDll

​ ->ntdll!LdrpCreateDllSection

​ ->ntdll!NtOpenFile //打开目标文件*

​ ->ntdll!NtCreateSection //创建映射*

​ ->ntdll!NtMapViewOfSection //映射*

​ ->LdrRelocateImageWithBias //修复重定位

b) 接着处理调入表等,然后调用TLS回调函数和DLLMAN入口函数;

kernel32!LoadlibraryA

->kernel32!LoadLibraryExA

->kernel32!LoadLibraryExW

​ ->ntdll!LdrLoadDll

​ ->ntdll!LdrpLoadDll

​ ->ntdll!LdrpMapDll

​ ->ntdll!LdrpWalkImportDescriptor //处理导入表等

​ ->LdrpRunInitializeRoutines //调用DLLMAN

​ ->LdrpCallTlsInitializers

​ ->LdrpCallInitRoutine

 

总结为如下流程:

  1. 载入DLL到内存上,将DLL进行memory align对齐
  2. 通过reloc段重定向硬编码地址
  3. 逐一扫码IATOFT的模块和函数名,加载对于模块,将地址填入FT
  4. 调用DllMain函数(即DLL的OEP),fdwReason参数为DLL_PROCESS_ATTACH
  5. 初始化TLS, reasonDLL_PROCESS_ATTACH

(2) 模拟Windows加载DLL

.1 载入DLL到内存上,将DLL进行memory align对齐

这里为了降低耦合,没有用系统的memsetmemcpy,自己用inline实现的。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
inline size_t STDCALL winpe_memload(
    const void *rawpe, size_t rawsize,
    void *mempe, size_t memsize,
    bool_t same_align)
{
    // load rawpe to memalign
    PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)rawpe;
    PIMAGE_NT_HEADERS  pNtHeader = (PIMAGE_NT_HEADERS)
        ((void*)rawpe + pDosHeader->e_lfanew);
    PIMAGE_FILE_HEADER pFileHeader = &pNtHeader->FileHeader;
    PIMAGE_OPTIONAL_HEADER pOptHeader = &pNtHeader->OptionalHeader;
    PIMAGE_SECTION_HEADER pSectHeader = (PIMAGE_SECTION_HEADER)
        ((void *)pOptHeader + pFileHeader->SizeOfOptionalHeader);
    WORD sectNum = pFileHeader->NumberOfSections;
    size_t imagesize = pOptHeader->SizeOfImage;
    if(!mempe) return imagesize;
    else if(memsize!=0 && memsize<imagesize) return 0;
 
    _inl_memset(mempe, 0, imagesize);
    _inl_memcpy(mempe, rawpe, pOptHeader->SizeOfHeaders);
 
    for(WORD i=0;i<sectNum;i++)
    {
        _inl_memcpy(mempe+pSectHeader[i].VirtualAddress,
            rawpe+pSectHeader[i].PointerToRawData,
            pSectHeader[i].SizeOfRawData);
    }
 
    // adjust all to mem align
    if(same_align)
    {
        pDosHeader = (PIMAGE_DOS_HEADER)mempe;
        pNtHeader = (PIMAGE_NT_HEADERS)((void*)mempe + pDosHeader->e_lfanew);
        pFileHeader = &pNtHeader->FileHeader;
        pOptHeader = &pNtHeader->OptionalHeader;
        pSectHeader = (PIMAGE_SECTION_HEADER)
            ((void *)pOptHeader + pFileHeader->SizeOfOptionalHeader);
        sectNum = pFileHeader->NumberOfSections;
 
        pOptHeader->FileAlignment = pOptHeader->SectionAlignment;
 
        for(WORD i=0;i<sectNum;i++)
        {
            pSectHeader[i].PointerToRawData = pSectHeader[i].VirtualAddress;
        }
    }
    return imagesize;
}

.2 通过reloc段重定向硬编码地址

手动将以imagebase为基准的va重定向到目标imagebaseva

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
inline size_t STDCALL winpe_memreloc(
    void *mempe, size_t newimagebase)
{
    PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)mempe;
    PIMAGE_NT_HEADERS  pNtHeader = (PIMAGE_NT_HEADERS)
        ((void*)mempe + pDosHeader->e_lfanew);
    PIMAGE_FILE_HEADER pFileHeader = &pNtHeader->FileHeader;
    PIMAGE_OPTIONAL_HEADER pOptHeader = &pNtHeader->OptionalHeader;
    PIMAGE_DATA_DIRECTORY pDataDirectory = pOptHeader->DataDirectory;
    PIMAGE_DATA_DIRECTORY pRelocEntry = &pDataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC];
 
    DWORD reloc_count = 0;
    DWORD reloc_offset = 0;
    int64_t shift = (int64_t)newimagebase -
        (int64_t)pOptHeader->ImageBase;
    while (reloc_offset < pRelocEntry->Size)
    {
        PIMAGE_BASE_RELOCATION pBaseReloc = (PIMAGE_BASE_RELOCATION)
            ((void*)mempe + pRelocEntry->VirtualAddress + reloc_offset);
        PRELOCOFFSET pRelocOffset = (PRELOCOFFSET)((void*)pBaseReloc
            + sizeof(IMAGE_BASE_RELOCATION));
        // RELOCOFFSET block num
        DWORD item_num = (pBaseReloc->SizeOfBlock -
            sizeof(IMAGE_BASE_RELOCATION)) / sizeof(RELOCOFFSET);
        for (size_t i = 0; i < item_num; i++)
        {
            if (!pRelocOffset[i].type &&
                !pRelocOffset[i].offset) continue;
            DWORD targetoffset = pBaseReloc->VirtualAddress +
                    pRelocOffset[i].offset;
            size_t *paddr = (size_t *)((void*)mempe + targetoffset);
            size_t relocaddr = (size_t)((int64_t)*paddr + shift);
            *paddr = relocaddr;
        }
        reloc_offset += sizeof(IMAGE_BASE_RELOCATION) +
            sizeof(RELOCOFFSET) * item_num;
        reloc_count += item_num;
    }
    pOptHeader->ImageBase = newimagebase;
    return reloc_count;
}

.3 逐一扫码IATOFT的模块和函数名,加载对于模块,将地址填入FT

LoadLibraryAGetProcAddress实现,考虑到shellcode,不能直接调用,应该通过传入函数指针调用。

 

另外还要考虑OFT中会出现用序号的情况,x86x64判断表中不同,用sizeof(size_t)区分。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
WINPEDEF WINPE_EXPORT
inline size_t STDCALL winpe_membindiat(void *mempe,
    PFN_LoadLibraryA pfnLoadLibraryA,
    PFN_GetProcAddress pfnGetProcAddress)
{
    PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)mempe;
    PIMAGE_NT_HEADERS  pNtHeader = (PIMAGE_NT_HEADERS)
        ((void*)mempe + pDosHeader->e_lfanew);
    PIMAGE_FILE_HEADER pFileHeader = &pNtHeader->FileHeader;
    PIMAGE_OPTIONAL_HEADER pOptHeader = &pNtHeader->OptionalHeader;
    PIMAGE_DATA_DIRECTORY pDataDirectory = pOptHeader->DataDirectory;
    PIMAGE_DATA_DIRECTORY pImpEntry = 
        &pDataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
    PIMAGE_IMPORT_DESCRIPTOR pImpDescriptor = 
        (PIMAGE_IMPORT_DESCRIPTOR)(mempe + pImpEntry->VirtualAddress);
 
    PIMAGE_THUNK_DATA pFtThunk = NULL;
    PIMAGE_THUNK_DATA pOftThunk = NULL;
    LPCSTR pDllName = NULL;
    PIMAGE_IMPORT_BY_NAME pImpByName = NULL;
    size_t funcva = 0;
    char *funcname = NULL;
 
    // origin GetProcAddress will crash at InitializeSListHead
    if(!pfnLoadLibraryA) pfnLoadLibraryA =
        (PFN_LoadLibraryA)winpe_findloadlibrarya();
    if(!pfnGetProcAddress) pfnGetProcAddress =
        (PFN_GetProcAddress)winpe_findgetprocaddress();
 
    DWORD iat_count = 0;
    for (; pImpDescriptor->Name; pImpDescriptor++)
    {
        pDllName = (LPCSTR)(mempe + pImpDescriptor->Name);
        pFtThunk = (PIMAGE_THUNK_DATA)
            (mempe + pImpDescriptor->FirstThunk);
        pOftThunk = (PIMAGE_THUNK_DATA)
            (mempe + pImpDescriptor->OriginalFirstThunk);
        size_t dllbase = (size_t)pfnLoadLibraryA(pDllName);
        if(!dllbase) return 0;
 
        for (int j=0; pFtThunk[j].u1.Function
            &&  pOftThunk[j].u1.Function; j++)
        {
            size_t _addr = (size_t)(mempe + pOftThunk[j].u1.AddressOfData);
            if(sizeof(size_t)>4) // x64
            {
                if((_addr>>63) == 1)
                {
                    funcname = (char *)(_addr & 0xffff);
                }
                else
                {
                    pImpByName=(PIMAGE_IMPORT_BY_NAME)_addr;
                    funcname = pImpByName->Name;
                }
            }
            else
            {
                if(((size_t)pImpByName>>31) == 1)
                {
                    funcname = (char *)(_addr & 0xffff);
                }
                else
                {
                    pImpByName=(PIMAGE_IMPORT_BY_NAME)_addr;
                    funcname = pImpByName->Name;
                }
            }
 
            funcva = (size_t)pfnGetProcAddress(
                (HMODULE)dllbase, funcname);
            if(!funcva) continue;
            pFtThunk[j].u1.Function = funcva;
            assert(funcva == (size_t)GetProcAddress(
                (HMODULE)dllbase, funcname));
            iat_count++;
        }
    }
    return iat_count;
}

.4 调用DllMain函数(即DLL的OEP),fdwReason参数为DLL_PROCESS_ATTACH

前三步都完成后可以来通过DllMain来引导了。

1
2
3
4
5
6
7
8
9
10
// initial memory module
if(!winpe_memreloc((void*)imagebase, imagebase))
    return NULL;
if(!winpe_membindiat((void*)imagebase,
    pfnLoadLibraryA, pfnGetProcAddress)) return NULL;
winpe_membindtls(mempe, DLL_PROCESS_ATTACH);
PFN_DllMain pfnDllMain = (PFN_DllMain)
    (imagebase + winpe_oepval((void*)imagebase, 0));
pfnDllMain((HINSTANCE)imagebase, DLL_PROCESS_ATTACH, NULL);
return (void*)imagebase;

.5 初始化TLS, reasonDLL_PROCESS_ATTACH

这部分我没有找到有TLS的DLL,所以没法进一步测试了。这部分代码直接借鉴的MemoryModule的相关函数。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
inline size_t STDCALL winpe_membindtls(void *mempe, DWORD resaon)
{
    PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)mempe;
    PIMAGE_NT_HEADERS  pNtHeader = (PIMAGE_NT_HEADERS)
        ((void*)mempe + pDosHeader->e_lfanew);
    PIMAGE_FILE_HEADER pFileHeader = &pNtHeader->FileHeader;
    PIMAGE_OPTIONAL_HEADER pOptHeader = &pNtHeader->OptionalHeader;
    PIMAGE_DATA_DIRECTORY pDataDirectory = pOptHeader->DataDirectory;
    PIMAGE_DATA_DIRECTORY pTlsDirectory =
        &pDataDirectory[IMAGE_DIRECTORY_ENTRY_TLS];
    if(!pTlsDirectory->VirtualAddress) return 0;
 
    size_t tls_count = 0;
    PIMAGE_TLS_DIRECTORY pTlsEntry = (PIMAGE_TLS_DIRECTORY)
        (mempe + pTlsDirectory->VirtualAddress);
    PIMAGE_TLS_CALLBACK *tlscb= (PIMAGE_TLS_CALLBACK*)
        pTlsEntry->AddressOfCallBacks;
    if(tlscb)
    {
        while(*tlscb)
        {
            (*tlscb)(mempe, reason, NULL);
            tlscb++;
            tls_count++;
        }
    }
    return tls_count;
}

0x3 DLL附加到EXE区段与shellcode编写

(1) EXE区段扩容

首先需要修改exe的区段头,在后面增加一个区段。这里要特别注意区段的对齐,之后还要修正SizeOfImage

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
inline size_t STDCALL winpe_appendsecth(void *pe,
    PIMAGE_SECTION_HEADER psecth)
{
    PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)pe;
    PIMAGE_NT_HEADERS  pNtHeader = (PIMAGE_NT_HEADERS)
        ((void*)pe + pDosHeader->e_lfanew);
    PIMAGE_FILE_HEADER pFileHeader = &pNtHeader->FileHeader;
    PIMAGE_OPTIONAL_HEADER pOptHeader = &pNtHeader->OptionalHeader;
    PIMAGE_SECTION_HEADER pSectHeader = (PIMAGE_SECTION_HEADER)
        ((void*)pOptHeader + pFileHeader->SizeOfOptionalHeader);
    WORD sectNum = pFileHeader->NumberOfSections;
    PIMAGE_SECTION_HEADER pLastSectHeader = &pSectHeader[sectNum-1];
    DWORD addr, align;
 
    // check the space to append section
    if(pFileHeader->SizeOfOptionalHeader
        + sizeof(IMAGE_SECTION_HEADER)
     > pSectHeader[0].PointerToRawData) return 0;
 
    // fill rva addr
    align = pOptHeader->SectionAlignment;
    addr = pLastSectHeader->VirtualAddress + pLastSectHeader->Misc.VirtualSize;
    if(addr % align) addr += align - addr%align;
    psecth->VirtualAddress = addr;
 
    // fill file offset
    align = pOptHeader->FileAlignment;
    addr =  pLastSectHeader->PointerToRawData+ pLastSectHeader->SizeOfRawData;
    if(addr % align) addr += align - addr%align;
    psecth->PointerToRawData = addr;
 
    // adjust the section and imagesize
    pFileHeader->NumberOfSections++;
    _inl_memcpy(&pSectHeader[sectNum], psecth, sizeof(IMAGE_SECTION_HEADER));
    align = pOptHeader->SectionAlignment;
    addr = psecth->VirtualAddress + psecth->Misc.VirtualSize;
    if(addr % align) addr += align - addr%align;
    pOptHeader->SizeOfImage = addr;
    return pOptHeader->SizeOfImage;
}

(2) 引导DLL的shellcode编写

这部分内容和winpe_memLoadLibrary函数实现的方式类似,只不过要用汇编来写shellcode,引导完成后跳转到原来exe的oep。 关键地址可以先留空,在注入时候填入对位置。对于相应的函数的shellcode,由win_injectmemdll_shellcodestub.py读取对应的libpe32.dll函数,填入对应的全局变量。下面代码以x86为例:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
def gen_oepinit_code32():
    ks = Ks(KS_ARCH_X86, KS_MODE_32)
    code_str = f"""
    // for relative address, get the base of addr
    call getip;
    lea ebx, [eax-5];
 
    // get the imagebase
    mov eax, 0x30; // to avoid relative addressing
    mov edi, dword ptr fs:[eax]; //peb
    mov edi, [edi + 0ch]; //ldr
    mov edi, [edi + 14h]; //InMemoryOrderLoadList, this
    mov edi, [edi -8h + 18h]; //this.DllBase
 
    // get loadlibrarya, getprocaddress
    mov eax, [ebx + findloadlibrarya];
    add eax, edi;
    call eax;
    mov [ebx + findloadlibrarya], eax;
    mov eax, [ebx + findgetprocaddress];
    add eax, edi;
    call eax;
    mov [ebx + findgetprocaddress], eax;
 
    // reloc
    mov eax, [ebx + dllrva];
    add eax, edi;
    push eax;
    push eax;
    mov eax, [ebx + memrelocrva];
    add eax, edi;
    call eax;
 
    // bind iat
    mov eax, [ebx + findgetprocaddress];
    push eax; // arg3, getprocaddress
    mov eax, [ebx + findloadlibrarya];
    push eax; // arg2, loadlibraryas
    mov eax, [ebx + dllrva];
    add eax, edi;
    push eax; // arg1, dllbase value
    mov eax, [ebx + membindiatrva];
    add eax, edi
    call eax;
 
    // bind tls
    xor eax, eax;
    inc eax;
    push eax; // arg2, reason for tls
    mov eax, [ebx + dllrva]
    add eax, edi;
    push eax; // arg1, dllbase
    mov eax, [ebx + membindtlsrva];
    add eax, edi;
    call eax;
 
    // call dll oep, for dll entry
    xor eax, eax;
    push eax; // lpvReserved
    inc eax;
    push eax; // fdwReason, DLL_PROCESS_ATTACH
    mov eax, [ebx + dllrva];
    add eax, edi;
    push eax; // hinstDLL
    mov eax, [ebx + dlloeprva];
    add eax, edi;
    call eax;
 
    // jmp to origin oep
    mov eax, [ebx+exeoeprva];
    add eax, edi;
    jmp eax;
    getip:
    mov eax, [esp]
    ret
    exeoeprva: nop;nop;nop;nop;
    dllrva: nop;nop;nop;nop;
    dlloeprva: nop;nop;nop;nop;
    memrelocrva: nop;nop;nop;nop;
    membindiatrva: nop;nop;nop;nop;
    membindtlsrva: nop;nop;nop;nop;
    findloadlibrarya: nop;nop;nop;nop;
    findgetprocaddress: nop;nop;nop;nop;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
// these functions are stub function, will be filled by python
#define FUNC_SIZE 0x400
#define SHELLCODE_SIZE 0X2000
unsigned char g_oepinit_code[] = {0x90};
unsigned char g_memreloc_code[] = {0x90};
unsigned char g_membindiat_code[] = {0x90};
unsigned char g_membindtls_code[] = {0x90};
unsigned char g_findloadlibrarya_code[] = {0x90};
unsigned char g_findgetprocaddress_code[] = {0x90};
 
void _makeoepcode(void *shellcode,
    size_t shellcoderva, size_t dllrva,
    DWORD orgexeoeprva, DWORD orgdlloeprva)
{
    // bind the pointer to buffer
    size_t oepinit_end = sizeof(g_oepinit_code);
    size_t memreloc_start = FUNC_SIZE;
    size_t membindiat_start = memreloc_start + FUNC_SIZE;
    size_t membindtls_start = membindiat_start + FUNC_SIZE;
    size_t findloadlibrarya_start = membindtls_start + FUNC_SIZE;
    size_t findgetprocaddress_start = findloadlibrarya_start + FUNC_SIZE;
 
     // fill the address table
    size_t *pexeoeprva = (size_t*)(g_oepinit_code + oepinit_end - 8*sizeof(size_t));
    size_t *pdllbrva = (size_t*)(g_oepinit_code + oepinit_end - 7*sizeof(size_t));
    size_t *pdlloeprva = (size_t*)(g_oepinit_code + oepinit_end - 6*sizeof(size_t));
    size_t *pmemrelocrva = (size_t*)(g_oepinit_code + oepinit_end - 5*sizeof(size_t));
    size_t *pmembindiatrva = (size_t*)(g_oepinit_code + oepinit_end - 4*sizeof(size_t));
    size_t *pmembindtlsrva = (size_t*)(g_oepinit_code + oepinit_end - 3*sizeof(size_t));
    size_t *pfindloadlibrarya = (size_t*)(g_oepinit_code + oepinit_end - 2*sizeof(size_t));
    size_t *pfindgetprocaddress = (size_t*)(g_oepinit_code + oepinit_end - 1*sizeof(size_t));
    *pexeoeprva = orgexeoeprva;
    *pdllbrva =  dllrva;
    *pdlloeprva = dllrva + orgdlloeprva;
    *pmemrelocrva = shellcoderva + memreloc_start;
    *pmembindiatrva = shellcoderva + membindiat_start;
    *pmembindtlsrva = shellcoderva + membindtls_start;
    *pfindloadlibrarya = shellcoderva + findloadlibrarya_start;
    *pfindgetprocaddress = shellcoderva + findgetprocaddress_start;
 
    // copy to the target
    memcpy(shellcode ,
        g_oepinit_code, sizeof(g_oepinit_code));
    memcpy(shellcode + memreloc_start,
        g_memreloc_code, sizeof(g_memreloc_code));
    memcpy(shellcode + membindiat_start,
        g_membindiat_code, sizeof(g_membindiat_code));
    memcpy(shellcode + membindtls_start,
        g_membindtls_code, sizeof(g_membindtls_code));
    memcpy(shellcode + findloadlibrarya_start,
        g_findloadlibrarya_code, sizeof(g_findloadlibrarya_code));
    memcpy(shellcode + findgetprocaddress_start,
        g_findgetprocaddress_code, sizeof(g_findgetprocaddress_code));
}

0x4 技巧tips

(1) x64适配

由于大部分都用的size_tvoid*等与cpu字长相关的变量,大部分代码不用修改就能编译。

 

其余的差异部分,可以用sizeof(size_t)>4区分,或者_WIN64宏。如下所示

1
2
3
4
5
6
7
8
9
10
// append section header to exe
size_t align = sizeof(size_t) > 4 ? 0x10000: 0x1000;
size_t padding = _sectpaddingsize(mempe_exe, mempe_dll, align);
secth.Characteristics = IMAGE_SCN_MEM_READ |
    IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE;
secth.Misc.VirtualSize = SHELLCODE_SIZE + padding + mempe_dllsize;
secth.SizeOfRawData = SHELLCODE_SIZE + padding + mempe_dllsize;
strcpy((char*)secth.Name, ".module");
winpe_noaslr(mempe_exe);
winpe_appendsecth(mempe_exe, &secth);
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
inline void* winpe_findkernel32()
{
    // return (void*)LoadLibrary("kernel32.dll");
    // TEB->PEB->Ldr->InMemoryOrderLoadList->curProgram->ntdll->kernel32
    void *kerenl32 = NULL;
#ifdef _WIN64
    __asm{
        mov rax, gs:[60h]; peb
        mov rax, [rax+18h]; ldr
        mov rax, [rax+20h]; InMemoryOrderLoadList, currentProgramEntry
        mov rax, [rax]; ntdllEntry, currentProgramEntry->->Flink
        mov rax, [rax]; kernel32Entry,  ntdllEntry->Flink
        mov rax, [rax-10h+30h]; kernel32.DllBase
        mov kerenl32, rax;
    }
#else
    __asm{
        mov eax, fs:[30h]; peb
        mov eax, [eax+0ch]; ldr
        mov eax, [eax+14h]; InMemoryOrderLoadList, currentProgramEntry
        mov eax, [eax]; ntdllEntry, currentProgramEntry->->Flink
        mov eax, [eax]; kernel32Entry,  ntdllEntry->Flink
        mov eax, [eax - 8h +18h]; kernel32.DllBase
        mov kerenl32, eax;
    }
#endif
    return kerenl32;
}

(2) 扫描export找到LoadLibrary和GetProcAddress

上面的代码我们可以获取kernel32的基址了,之后可以自己通过扫描export表来查找对应的导出函数了。

 

这里注意有个坑点,有的导出表项会forward到其他dll上,如kernel32!InitializeSListHead -> NTDLL!RtlInitializeSListHead,详见crashing-kernel32-initializeslisthead。通过DirRVA < FuncRVA < DirRVA + ExportSize来判断是否forward。

 

此处为了找到LoadLibraryAGetProcAddress函数,不考虑forward情况,否则会引入递归问题。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
inline void* STDCALL winpe_memfindexp(
    void *mempe, LPCSTR funcname)
{
    PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)mempe;
    PIMAGE_NT_HEADERS  pNtHeader = (PIMAGE_NT_HEADERS)
        ((void*)mempe + pDosHeader->e_lfanew);
    PIMAGE_FILE_HEADER pFileHeader = &pNtHeader->FileHeader;
    PIMAGE_OPTIONAL_HEADER pOptHeader = &pNtHeader->OptionalHeader;
    PIMAGE_DATA_DIRECTORY pDataDirectory = pOptHeader->DataDirectory;
    PIMAGE_DATA_DIRECTORY pExpEntry = 
        &pDataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
    PIMAGE_EXPORT_DIRECTORY  pExpDescriptor = 
        (PIMAGE_EXPORT_DIRECTORY)(mempe + pExpEntry->VirtualAddress);
 
    WORD *ordrva = mempe + pExpDescriptor->AddressOfNameOrdinals;
    DWORD *namerva = mempe + pExpDescriptor->AddressOfNames;
    DWORD *funcrva = mempe + pExpDescriptor->AddressOfFunctions;
    if((size_t)funcname <= MAXWORD) // find by ordnial
    {
        WORD ordbase = LOWORD(pExpDescriptor->Base) - 1;
        WORD funcord = LOWORD(funcname);
        return (void*)(mempe + funcrva[ordrva[funcord-ordbase]]);
    }
    else
    {
        for(int i=0;i<pExpDescriptor->NumberOfNames;i++)
        {
            LPCSTR curname = (LPCSTR)(mempe+namerva[i]);
            if(_inl_stricmp(curname, funcname)==0)
            {
                return (void*)(mempe + funcrva[ordrva[i]]);
            }      
        }
    }
    return NULL;
}
 
inline PROC winpe_findloadlibrarya()
{
    // return (PROC)LoadLibraryA;
    HMODULE hmod_kernel32 = (HMODULE)winpe_findkernel32();
    char name_LoadLibraryA[] = {'L', 'o', 'a', 'd', 'L', 'i', 'b', 'r', 'a', 'r', 'y', 'A', '\0'};
    return (PROC)winpe_memfindexp( // suppose exp no forward, to avoid recursive
        (void*)hmod_kernel32, name_LoadLibraryA);
}
 
inline PROC winpe_findgetprocaddress()
{
    // return (PROC)GetProcAddress;
    HMODULE hmod_kernel32 = (HMODULE)winpe_findkernel32();
    char name_GetProcAddress[] = {'G', 'e', 't', 'P', 'r', 'o', 'c', 'A', 'd', 'd', 'r', 'e', 's', 's', '\0'};
    return (PROC)winpe_memfindexp(hmod_kernel32, name_GetProcAddress);
}

0x5 后记

本项目是我实现不加壳vfs(即通过hook nt函数将文件和文件夹等读取重定向到zip文件中)的衍生产物,采取的是尽量单头文件法和内联函数减小耦合,方便集成到代码中的做法。与其他的MemoryModule最大的区别是支持从shellcode引导DLL。由于以前我有了自己写压缩壳的经验,整体上处理起来还算是顺利。强迫症,感觉好多时间都花在了怎么统一化命名api和代码风格规范上了。至于实现内存加载Resource方面,暂时不考虑实现,汉化游戏一般也用不到这方面内容。

 

另外,这里还有个坑卡了我好长时间,就是x64的堆栈必须要0x10对齐(即调用函数时要sub rsp, 0x28预留空间与对齐),才能在movaps指令上不报错。之前一直不知道这条指令需要对齐,一直以为是少了哪些初始化,走偏了好久。特别感谢@YeLikERs指出了这一点,同时在PEBLDR相关内容上帮助了很多。

 

经测试兼容性还算不错,windows xp, windows 7, windows 10, windows 11测试example文件都通过了,甚至我连linux上的wine也测试了(即用我汉化的游戏测试shellcode引导附加到exe区段上的DLL)。最后,附上几张测试图:

 

memdll_x86winetest
memdll_x64test


[CTF入门培训]顶尖高校博士及硕士团队亲授《30小时教你玩转CTF》,视频+靶场+题目!助力进入CTF世界

收藏
点赞19
打赏
分享
最新回复 (30)
雪    币: 7300
活跃值: (3758)
能力值: (RANK:1130 )
在线值:
发帖
回帖
粉丝
海风月影 22 2022-3-29 21:24
2
0
x64里面的dll,支持异常处理嘛?
雪    币: 1641
活跃值: (3601)
能力值: (RANK:15 )
在线值:
发帖
回帖
粉丝
はつゆき 2022-3-29 21:39
3
0
海风月影 x64里面的dll,支持异常处理嘛?
按帖子内容来看并没有,应该追求的基本就是能用就ok
雪    币: 6296
活跃值: (4927)
能力值: ( LV12,RANK:250 )
在线值:
发帖
回帖
粉丝
devseed 4 2022-3-29 22:35
4
0
海风月影 x64里面的dll,支持异常处理嘛?
seh异常处理我没试过,因为主要目标是为了汉化直接把dll付在exe附加段上,你可以试试seh有没有问题。
雪    币: 6296
活跃值: (4927)
能力值: ( LV12,RANK:250 )
在线值:
发帖
回帖
粉丝
devseed 4 2022-3-29 22:36
5
0
はつゆき 按帖子内容来看并没有,应该追求的基本就是能用就ok
我也不太清楚异常是不是需要特殊处理。就我自己的测试,64位yuzu模拟器的exe,把dll附加到exe程序里测试通过。
雪    币: 6648
活跃值: (846)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
kingswb 2022-3-30 06:30
6
0
学习占位
雪    币: 6111
活跃值: (3936)
能力值: ( LV6,RANK:80 )
在线值:
发帖
回帖
粉丝
黑洛 1 2022-3-30 09:08
7
0
汉化竟然也这么卷了吗
雪    币: 1641
活跃值: (3601)
能力值: (RANK:15 )
在线值:
发帖
回帖
粉丝
はつゆき 2022-3-30 09:35
8
0
devseed 我也不太清楚异常是不是需要特殊处理。就我自己的测试,64位yuzu模拟器的exe,把dll附加到exe程序里测试通过。
异常是需要额外处理的,不过相比之下,资源的处理才是最麻烦的
雪    币: 6189
活跃值: (3251)
能力值: ( LV5,RANK:69 )
在线值:
发帖
回帖
粉丝
小菜鸟一 2022-3-30 11:21
9
0
汉化都这么高级了,666
雪    币: 4
能力值: ( LV1,RANK:0 )
在线值:
发帖
回帖
粉丝
mb_hbzbnhpp 2022-3-30 12:01
10
0
博主优秀啊
雪    币: 576
活跃值: (2035)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
kakasasa 2022-3-30 12:58
11
0
开源的貌似就MemoryModulePP最全,支持异常和线程局部存储,但是代码不简洁。不管线程局部存储的话,改改MemoryModule原版处理下异常就好。
雪    币: 1040
活跃值: (455)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
Rookietp 2022-3-30 13:01
12
0
那个py不太会用 没环境
雪    币: 310
活跃值: (1887)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
niuzuoquan 2022-3-30 13:53
13
0
mark
雪    币: 6296
活跃值: (4927)
能力值: ( LV12,RANK:250 )
在线值:
发帖
回帖
粉丝
devseed 4 2022-3-30 14:52
14
0
Rookietp 那个py不太会用 没环境[em_5]
py只是为了方便生成shellcode,只是内存加载dll,其实只用winpe.h就行了。
雪    币: 6296
活跃值: (4927)
能力值: ( LV12,RANK:250 )
在线值:
发帖
回帖
粉丝
devseed 4 2022-3-30 14:55
15
0
kakasasa 开源的貌似就MemoryModulePP最全,支持异常和线程局部存储,但是代码不简洁。不管线程局部存储的话,改改MemoryModule原版处理下异常就好。
是的,我一开始也想改github上的MemoryModule,但是发现他里面好多直接用系统api调用的,想改成shellcode引导不太方便。
雪    币: 1382
活跃值: (3100)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
小希希 2022-3-31 12:48
16
0
不错不错

这里注意有个坑点,有的导出表项会forward到其他dll上,如kernel32!InitializeSListHead -> 
NTDLL!RtlInitializeSListHead,详见crashing-kernel32-initializeslisthead。通过DirRVA < FuncRVA < DirRVA + ExportSize来判断是否forward。

这种 forward 很讨厌,曾经也被坑过...



雪    币: 1525
活跃值: (1228)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
MochaCo 2022-3-31 13:54
17
0
内存加载MFC的DLL好像还需要特殊处理一下吧,要挂钩几个API
雪    币: 2640
活跃值: (3569)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
值得怀疑 2022-4-4 12:08
18
0
msvc 报错啊··········
雪    币: 2640
活跃值: (3569)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
值得怀疑 2022-4-4 12:28
19
0
1>\winpe.h(708,76): error C2036: “void *”: 未知的大小
1>\winpe.h(712,68): error C2036: “void *”: 未知的大小
1>\winpe.h(731,45): error C2036: “void *”: 未知的大小
1>\winpe.h(744,57): error C2036: “void *”: 未知的大小
winpe.h(775,45): error C2036: “void *”: 未知的大小

很多这个···
雪    币: 23
能力值: ( LV1,RANK:0 )
在线值:
发帖
回帖
粉丝
大魔法狮子 2022-4-6 11:45
20
0

    说到降低耦合,嘿嘿,我可有精神了

    只需要继承CMappingPe类,并覆盖protect里的几个虚函数,即可在r3 or r0使用。不过跨进程不会选在R3。

    话说R0 怎么处理seh啊。

雪    币: 6296
活跃值: (4927)
能力值: ( LV12,RANK:250 )
在线值:
发帖
回帖
粉丝
devseed 4 2022-4-9 17:54
21
0
大魔法狮子 &nbsp;&nbsp;&nbsp;&nbsp;说到降低耦合,嘿嘿,我可有精神了&nbsp;&nbsp;&nbsp;&nbsp;只需要继 ...
一般来说内核态很少用c++开发吧? 用c++有个麻烦事,动态LoadLibrary的符号问题不好解决。seh我也想知道怎么处理的,没有遇到过对应的样本。
雪    币: 6296
活跃值: (4927)
能力值: ( LV12,RANK:250 )
在线值:
发帖
回帖
粉丝
devseed 4 2022-4-9 18:02
22
0
值得怀疑 1>\winpe.h(708,76): error C2036: “void *”: 未知的大小 1>\winpe.h(712,68): error C2036: “void *”: 未 ...
我开发是用clang编译的,还没试过msvc,是不是msvc不支持void*指针相加?
雪    币: 23
能力值: ( LV1,RANK:0 )
在线值:
发帖
回帖
粉丝
大魔法狮子 2022-4-14 10:50
23
0
devseed 一般来说内核态很少用c++开发吧? 用c++有个麻烦事,动态LoadLibrary的符号问题不好解决。seh我也想知道怎么处理的,没有遇到过对应的样本。
这简单,同样内存加载一份,计算rva,然后取模块地址。我说的是顺便r0加载驱动。
雪    币: 23
能力值: ( LV1,RANK:0 )
在线值:
发帖
回帖
粉丝
大魔法狮子 2022-4-14 10:51
24
0
devseed 我开发是用clang编译的,还没试过msvc,是不是msvc不支持void*指针相加?
支持吧
雪    币: 10012
活跃值: (1649)
能力值: ( LV2,RANK:10 )
在线值:
发帖
回帖
粉丝
wzmooo 2022-10-22 17:11
25
0
py没有这个环境 能不能用C实现下 生成shellcode 谢谢
游客
登录 | 注册 方可回帖
返回