[Original] Deobfuscating the OceanLotus glitch sample
Posted: 2022-4-14 21:48 14117


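QiAnXin's report "Analysis of attack samples using obfuscation techniques similar to OceanLotus" [1] analyzes a sample that uses the same obfuscation method as APT32. This post deobfuscates that sample, drawing on the QiAnXin report and on the reference article and code it cites [2].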

SHA256:bf3e495f43a6b333b10ae69667304cfd2c87e9100de9d31365671c7b6b93132e

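As the figure below shows, a cmp/test instruction compares a value stored in the data section with an immediate, and the next instruction is a conditional jump whose outcome depends on that comparison. The malware uses this pattern to obfuscate control flow and hinder reverse engineering.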


Figure 1-1 Obfuscated code

Solution:

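The DLL is currently loaded at base address 0x10000000, and the data compared with the immediate (operation_2) is stored at address operation_1 (0x1007EE93). If the DLL is loaded at a different base, operation_1 changes accordingly. So that the DLL still reads the right data after rebasing, the address of every location that uses dword_1007EE93 is recorded in the relocation table; when the base changes, the loader patches operation_1 according to those entries. The relocation table therefore holds a large number of addresses of code that must be fixed up.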
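To make the fixup concrete, here is a minimal sketch of what rebasing does to the 32-bit absolute address embedded in the cmp/test. The function name `rebase_address` and the new base 0x20000000 are illustrative assumptions; only 0x10000000 and 0x1007EE93 come from the sample.

```python
def rebase_address(address, old_base, new_base):
    """Shift a 32-bit absolute address by the difference between the two bases."""
    delta = new_base - old_base
    return (address + delta) & 0xFFFFFFFF


PREFERRED_BASE = 0x10000000   # base address from the post
OPERATION_1 = 0x1007EE93      # absolute address used by the cmp/test

# The RVA of the data does not depend on where the DLL ends up
rva = OPERATION_1 - PREFERRED_BASE

# Hypothetical new base, chosen only for illustration
rebased = rebase_address(OPERATION_1, PREFERRED_BASE, 0x20000000)

print(hex(rva), hex(rebased))  # 0x7ee93 0x2007ee93
```

Because every such operand has to be listed in the relocation table for this to work, the table doubles as an index of candidate obfuscated instructions.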

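We can therefore read the addresses stored in the relocation table and check whether the instruction just before each address is a cmp/test; that locates the obfuscated instructions. We then emulate the cmp/test and the jump, and record the address of the instruction executed next. Depending on whether the jump was taken, the corresponding instructions are replaced with NOPs.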

Python's pefile library is used to read the RVAs stored in the relocation table.


Figure 2-1 Relocation table

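The relocation table consists of several IMAGE_BASE_RELOCATION structures, each made up of VirtualAddress (DWORD), SizeOfBlock (DWORD), and the TypeOffset entries (SizeOfBlock - 8 bytes in total). Each relocation entry is 2 bytes: the high 4 bits are the type and the low 12 bits are an offset. The low 12 bits plus VirtualAddress give the RVA. Take the first entry, 0x3031: the low 12 bits are 0x031, and adding 0x1000 gives 0x1031.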
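The arithmetic above can be checked with a few lines of Python. This is only a sketch; `decode_type_offset` is a name made up for illustration and is not part of pefile.

```python
def decode_type_offset(block_va, entry):
    """Split a 16-bit TypeOffset entry into (type, RVA)."""
    reloc_type = entry >> 12      # high 4 bits: relocation type
    offset = entry & 0x0FFF       # low 12 bits: offset inside the block's page
    return reloc_type, block_va + offset


# The example from the text: block VirtualAddress 0x1000, entry 0x3031
reloc_type, rva = decode_type_offset(0x1000, 0x3031)
print(reloc_type, hex(rva))  # 3 0x1031
```

Type 3 is IMAGE_REL_BASED_HIGHLOW, the 32-bit fixup used for absolute addresses in x86 images.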


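parse_relocations_directory returns a list of BaseRelocationData objects. BaseRelocationData has two attributes, struct and entries. struct holds the VirtualAddress and SizeOfBlock of the IMAGE_BASE_RELOCATION structure. entries is a list of RelocationData objects, each carrying a type and an RVA, where the RVA is the low 12 bits plus VirtualAddress.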
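For readers who want to see what that parsing amounts to, the sketch below walks a raw relocation directory with only the standard library. It is not pefile's implementation; `iter_relocations` is a hypothetical helper, and the second entry (0x3040) in the hand-built block is invented for the demonstration.

```python
import struct


def iter_relocations(data):
    """Yield (type, rva) for every entry in a raw base relocation directory."""
    pos = 0
    while pos + 8 <= len(data):
        # IMAGE_BASE_RELOCATION header: VirtualAddress, SizeOfBlock
        block_va, block_size = struct.unpack_from("<II", data, pos)
        if block_size < 8:
            break  # malformed or terminating block
        end = min(pos + block_size, len(data))
        # The TypeOffset entries follow the 8-byte header, 2 bytes each
        for off in range(pos + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", data, off)
            yield entry >> 12, block_va + (entry & 0x0FFF)
        pos += block_size


# Hand-built block: VirtualAddress 0x1000, SizeOfBlock 12, two entries
raw = struct.pack("<IIHH", 0x1000, 12, 0x3031, 0x3040)
pairs = list(iter_relocations(raw))
print([(t, hex(rva)) for t, rva in pairs])  # [(3, '0x1031'), (3, '0x1040')]
```

Each yielded pair corresponds to the type and rva fields of one RelocationData object.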


Figure 2-2 The struct of BaseRelocationData and the list of RelocationData objects


Figure 2-3 RVAs of the relocation data

The reloc_data_rva list stores the RVAs of all relocation data.

This part of the code is taken from

https://github.com/levanvn/APT32_Deobfuscate/blob/master/Type2/Script/Type2_Deobfuscate.py

The obfuscated instructions come in the following 5 forms.





Figure 2-4 Obfuscated instructions

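In code whose default operand size is 32 bits, any instruction that operates on a word is one byte longer than the same instruction operating on a dword, because of the 0x66 prefix. The machine code that precedes the address operand is therefore 2 to 3 bytes long. Disassemble starting 3 bytes or 2 bytes before the address that the relocation entry points to, and check whether the instructions are a cmp/test followed by a conditional jump; if so, that is the address of the obfuscated instruction.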

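Set b to an initial value of 3 and read the data. If the byte 3 bytes back is 0x66, decrement b and check the disassembly; on a match, return the address reloc_data_rva - b - 1. If that byte is not 0x66, decrement b and check the disassembly starting 2 bytes back; on a match, return that address.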
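The offset selection described above can be sketched without a disassembler. `candidate_starts` is a hypothetical helper that only lists where an instruction using the relocation could begin; each candidate still has to be disassembled and checked for the cmp/test + conditional jump pattern. The two byte strings are hand-assembled encodings written for this illustration, not bytes copied from the sample.

```python
OPERAND_SIZE_PREFIX = 0x66


def candidate_starts(image, reloc_rva):
    """Return the RVAs where a cmp/test using this relocation could start,
    in the order they should be tried."""
    starts = []
    if reloc_rva >= 3 and image[reloc_rva - 3] == OPERAND_SIZE_PREFIX:
        starts.append(reloc_rva - 3)   # 66 + opcode + ModRM, then disp32
    if reloc_rva >= 2:
        starts.append(reloc_rva - 2)   # opcode + ModRM, then disp32
    return starts


# cmp word ptr [0x1007EE93], 0x1234   -> 66 81 3D 93 EE 07 10 34 12
word_cmp = bytes.fromhex("66813d93ee07103412")
# cmp dword ptr [0x1007EE93], 0x12345678 -> 81 3D 93 EE 07 10 78 56 34 12
dword_cmp = bytes.fromhex("813d93ee071078563412")

# In both buffers the relocated disp32 sits at offset 3
print(candidate_starts(word_cmp, 3))             # [0, 1]
print(candidate_starts(b"\x90" + dword_cmp, 3))  # [1]
```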

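FileAlignment in the PE header defines the alignment of sections on disk, and SectionAlignment defines the alignment of sections in memory. Every section starts at an offset that is a multiple of the corresponding alignment value.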
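As a small worked example of the two alignments, the sketch below rounds a size up to each boundary. The values 0x200 and 0x1000 are common defaults assumed here for illustration, and 0x1234 is a made-up section size; the real values should be read from pe.OPTIONAL_HEADER.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


FILE_ALIGNMENT = 0x200       # assumed typical value
SECTION_ALIGNMENT = 0x1000   # assumed typical value

raw_size = 0x1234            # hypothetical section size
on_disk = align_up(raw_size, FILE_ALIGNMENT)
in_memory = align_up(raw_size, SECTION_ALIGNMENT)
print(hex(on_disk), hex(in_memory))  # 0x1400 0x2000
```

This difference is why file offsets and RVAs diverge, and why the script works on a memory-mapped copy of the image rather than on the raw file bytes.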

Figure 2-5 Alignment values


Figure 2-6 Addresses of the file mapped into memory

The pefile function get_memory_mapped_image() returns data laid out the way the PE file is laid out in memory.

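Once the address of an obfuscated instruction is known, emulate 3 instructions (the cmp/test, the jump, and a third instruction) and record the address of the third.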

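A counter, count, tracks how many instructions have executed; it is kept on the emulated stack at [esp + 4]. When the third instruction is reached, its address is recorded and emulation stops.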

The jump results are stored in the instruction_3 list.

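Next, assume that none of the obfuscated jumps is taken and obtain the address of the third instruction by linear disassembly, with ins_3 = next(ins). Compare this address with the corresponding entry in the emulation results list instruction_3: if they are equal, the jump was not taken; otherwise it was taken. Some addresses cannot be disassembled during the linear pass; for those the address is set to 0.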

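If the jump was not taken, overwrite the obfuscated test/cmp + jump with 0x90. If it was taken, overwrite everything from the obfuscated instruction up to the jump target with 0x90. The bytes in between are junk, and merely replacing the js (or similar) instruction with "jmp target" would still leave them in place to disrupt disassembly.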
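The patching rule can be written as a small pure function. This is a sketch under two assumptions that differ slightly from the script: the fall-through address is computed from the instruction sizes instead of being disassembled, and a backward jump (which the post does not discuss) is left unpatched. `build_nop_patch` and the addresses in the example are hypothetical.

```python
NOP = b"\x90"


def build_nop_patch(ins1_addr, ins1_size, ins2_size, emulated_third):
    """Return the NOP bytes to write at ins1_addr.

    ins1 is the cmp/test, ins2 the conditional jump, and emulated_third the
    address the emulator reached as its third instruction.
    """
    fallthrough = ins1_addr + ins1_size + ins2_size
    if emulated_third == fallthrough:
        # Jump not taken: erase only the cmp/test + jcc pair
        return NOP * (ins1_size + ins2_size)
    if emulated_third < ins1_addr:
        # Backward jump: nothing sensible to fill, so patch nothing (assumption)
        return b""
    # Jump taken: erase the pair and the junk bytes up to the target
    return NOP * (emulated_third - ins1_addr)


# Hypothetical 10-byte cmp followed by a 2-byte jcc at 0x10001000
not_taken = build_nop_patch(0x10001000, 10, 2, 0x1000100C)
taken = build_nop_patch(0x10001000, 10, 2, 0x10001020)
print(len(not_taken), len(taken))  # 12 32
```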

Use set_bytes_at_rva(rva, data) to modify the data in the PE image, then write the result to a file.
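Writing by RVA means each patched byte has to land at the right file offset. The sketch below shows that translation for a made-up section table; the `Section` tuple, `rva_to_file_offset`, and all the section values are assumptions for illustration. With pefile itself, `pe.get_offset_from_rva(rva)` performs this lookup.

```python
from collections import namedtuple

Section = namedtuple("Section", "name virtual_address raw_pointer raw_size")


def rva_to_file_offset(sections, rva):
    """Translate an RVA to a file offset, or None if no file data backs it."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.raw_size:
            return rva - s.virtual_address + s.raw_pointer
    return None


# Hypothetical section table, not read from the sample
sections = [
    Section(".text", 0x1000, 0x400, 0x7D000),
    Section(".data", 0x7E000, 0x7D400, 0x1000),
]

text_offset = rva_to_file_offset(sections, 0x1031)
no_backing = rva_to_file_offset(sections, 0x7F500)
print(hex(text_offset), no_backing)  # 0x431 None
```

The second lookup returns None because the RVA lies past the section's raw data, which is also why the script copies only SizeOfRawData bytes per section.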

[1]. 使用和海莲花相似混淆手法的攻击样本分析 (Analysis of attack samples using obfuscation techniques similar to OceanLotus)
https://ti.qianxin.com/blog/articles/Obfuscation-techniques-similar-to-OceanLotus/
[2]. Type2_Deobfuscate.py
https://github.com/levanvn/APT32_Deobfuscate/blob/master/Type2/Script/Type2_Deobfuscate.py
[3]. pefile.py
https://github.com/erocarrera/pefile/blob/master/pefile.py
[4]. 菜鸟读capstone与keystone源码入门 (A beginner's introduction to reading the capstone and keystone source code)
https://bbs.pediy.com/thread-258473-1.htm
[5]. Unicorn引擎教程 (Unicorn engine tutorial)
https://bbs.pediy.com/thread-224330.htm#msg_header_h3_7

 
 
 
 
 
# Look up the index of the base relocation directory
reloc_table_num = pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BASERELOC']

# Get the VirtualAddress and Size of the relocation table
reloc_table = pe.OPTIONAL_HEADER.DATA_DIRECTORY[reloc_table_num]
reloc_table_rva = reloc_table.VirtualAddress
reloc_table_size = reloc_table.Size
print(f'Relocation table RVA: {reloc_table_rva:#x}, size: {reloc_table_size:#x}')
 
# Collect the RVA of every relocation entry
relocations = pe.parse_relocations_directory(reloc_table_rva, reloc_table_size)
reloc_data_rva = []
for i in relocations:
    for j in i.entries:
        # print(f'Relocation data RVA: {j.rva:#x}')
        reloc_data_rva.append(j.rva)

# Excerpt from pefile.py [3]: the two container classes and their docstrings
class BaseRelocationData(DataContainer):
    """Holds base relocation information.
    struct:     IMAGE_BASE_RELOCATION structure
    entries:    list of relocation data (RelocationData instances)
    """

class RelocationData(DataContainer):
    """Holds relocation information.
    type:       Type of relocation
                The type string can be obtained by
                RELOCATION_TYPE[type]
    rva:        RVA of the relocation
    """
 
 
 
 
 
def get_intruction_start_rva(memory_data, reloc_data_rva, ImageBase):
    # Conditional jump mnemonics accepted as the second instruction
    branch = ["JZ", "JP", "JO", "JS", "JG", "JB", "JA", "JL", "JE", "JNZ", "JNP", "JNO", "JNS", "JLE", "JNB", "JBE",
              "JGE", "JNE", "JAE"]
    b = 3
    for i in range(3):
        code = memory_data[reloc_data_rva - b: reloc_data_rva - b + 40]
        # A 3-byte lead-in is only possible with the 0x66 operand-size prefix
        if b == 3 and code[0] != 0x66:
            b = b - 1
            continue
        b = b - 1
        try:
            ins = md.disasm(code, ImageBase + reloc_data_rva - b - 1)

            ins_1 = next(ins)
            ins_2 = next(ins)
            ins.close()
        except StopIteration:
            continue
        # Match "cmp/test [mem], imm" followed by a conditional jump
        if (ins_1.mnemonic == 'cmp' or ins_1.mnemonic == 'test') and ins_2.mnemonic.upper() in branch \
                and len(ins_1.operands) == 2 and ins_1.operands[0].type == X86_OP_MEM and ins_1.operands[1].type == X86_OP_IMM:
            return ins_1.address - ImageBase
    return 0
 
 
# Load the PE and copy its in-memory layout into the emulator
pe = pefile.PE(filename, fast_load=True)
content = pe.get_memory_mapped_image()
mu.mem_write(0x10000000, content)
 
instruction_3 = []
def hook_code(mu, address, size, userdata):
    print(f'>>> Tracing instruction at {address:#x}, instruction size = {size:#x}')
    r_esp = mu.reg_read(UC_X86_REG_ESP)
    # The instruction counter is kept on the emulated stack at [esp + 4]
    count = u32(mu.mem_read(r_esp + 4, 4))
    print(f'count is {count}')
    if count == 2:
        # Third instruction reached: record its address and stop emulating
        instruction_3.append(address)
        mu.emu_stop()
        return
    count = count + 1
    mu.mem_write(r_esp + 4, p32(count))

def simulate_execute(ins_addr_rva):
    # Reset the counter before each run
    mu.mem_write(r_esp + 4, p32(0))
    mu.emu_start(ins_addr_rva + ImageBase, 0x100066E6)

# Emulate every obfuscated instruction found through the relocation table
reloc_data_rva = get_reloc_data_rva(pe)
for rva in reloc_data_rva:
    ins_addr_rva = get_intruction_start_rva(memory_mapped_image, rva, ImageBase)
    if ins_addr_rva != 0:
        simulate_execute(ins_addr_rva)

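# Get the address of the third instruction under linear (fall-through) execution
code = memory_mapped_image[ins_addr_rva:ins_addr_rva + 40]
ins = md.disasm(code, ImageBase + ins_addr_rva)

ins_1 = next(ins)
ins_2 = next(ins)

try:
    ins_3 = next(ins)
    ins_3_address = ins_3.address
except StopIteration:
    # The bytes after the jump could not be disassembled
    ins_3_address = 0
ins.close()
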
if instruction_3[count] == ins_3_address:
    # Jump not taken: NOP out the cmp/test and the jump
    size = ins_1.size + ins_2.size
    assembly = b'\x90' * size
    patch(memory_mapped_image, ImageBase, ImageBase + ins_addr_rva, assembly)
else:
    # Jump taken: NOP out everything up to the jump target
    size = instruction_3[count] - ins_1.address
    assembly = b'\x90' * size
    patch(memory_mapped_image, ImageBase, ImageBase + ins_addr_rva, assembly)
count = count + 1

# Copy the patched image back section by section and save it
for section in pe.sections:
    print(f'{section.Name}, VirtualAddress: {section.VirtualAddress:#x}, '
          f'Size: {section.SizeOfRawData:#x}, file offset: {section.PointerToRawData:#x}')
    pe.set_bytes_at_rva(section.VirtualAddress,
                        bytes(memory_mapped_image[section.VirtualAddress:section.VirtualAddress + section.SizeOfRawData]))

print('[+] Save to file ' + '1.bin')
pe.write('1.bin')

# _*_ coding: utf-8 _*_
import pefile
import struct
from capstone.x86 import *
from capstone import *
from unicorn import *
from unicorn.x86_const import *
from binascii import *


def u32(data):
    # Unpack 4 little-endian bytes into an unsigned 32-bit integer
    return struct.unpack("<I", data)[0]


def p32(num):
    # Pack an unsigned 32-bit integer into 4 little-endian bytes
    return struct.pack("<I", num)

def patch(image, image_base, address, patch_data):
    '''

    :param image: memory_mapped_image data (bytearray)
    :param image_base: base address
    :param address: VA (ImageBase + RVA) at which patching starts
    :param patch_data: bytes to write
    :return:
    '''
    i = 0
    for b in patch_data:
        image[address - image_base + i] = b
        i += 1
 
 
 
 
# Index of the relocation directory: see pefile.py, line 146
def get_reloc_data_rva(pefile_struct):
    '''

    :param pefile_struct: parsed pefile.PE object
    :return: list of the RVAs of all relocation data
    '''
    reloc_table_num = pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BASERELOC']

    # Get the VirtualAddress and Size of the relocation table
    reloc_table = pefile_struct.OPTIONAL_HEADER.DATA_DIRECTORY[reloc_table_num]
    reloc_table_rva = reloc_table.VirtualAddress
    reloc_table_size = reloc_table.Size
    print(f'Relocation table RVA: {reloc_table_rva:#x}, size: {reloc_table_size:#x}')

    # The relocation table consists of several IMAGE_BASE_RELOCATION structures, each made up of
    # VirtualAddress (DWORD), SizeOfBlock (DWORD) and TypeOffset entries (SizeOfBlock - 8 bytes)
    # parse_relocations_directory returns a list of BaseRelocationData objects
    relocations = pefile_struct.parse_relocations_directory(reloc_table_rva, reloc_table_size)
    # Collect the RVAs of all relocation data
    reloc_data_rva = []
    for i in relocations:
        # BaseRelocationData has two attributes, struct and entries.
        # struct holds the VirtualAddress and SizeOfBlock of the IMAGE_BASE_RELOCATION structure.
        # entries:    list of relocation data (RelocationData instances)
        # RelocationData: type and RVA
        # print(i.struct)
        for j in i.entries:
            reloc_data_rva.append(j.rva)
    return reloc_data_rva
 
def get_intruction_start_rva(memory_data, reloc_data_rva, ImageBase):
    '''

    :param memory_data:  file data as mapped into memory
    :param reloc_data_rva: RVA of the relocation data
    :param ImageBase: ImageBase
    :return: RVA of the instruction, or 0 if no obfuscated pair was found
    '''

    branch = ["JZ", "JP", "JO", "JS", "JG", "JB", "JA", "JL", "JE", "JNZ", "JNP", "JNO", "JNS", "JLE", "JNB", "JBE",
              "JGE", "JNE", "JAE"]
    b = 3
    for i in range(3):
        code = memory_data[reloc_data_rva - b: reloc_data_rva - b + 40]
        # A 3-byte lead-in is only possible with the 0x66 operand-size prefix
        if b == 3 and code[0] != 0x66:
            b = b - 1
            continue
        b = b - 1
        try:
            ins = md.disasm(code, ImageBase + reloc_data_rva - b - 1)

            ins_1 = next(ins)
            ins_2 = next(ins)
            ins.close()
        except StopIteration:
            continue
        # Match "cmp/test [mem], imm" followed by a conditional jump
        if (ins_1.mnemonic == 'cmp' or ins_1.mnemonic == 'test') and ins_2.mnemonic.upper() in branch \
                and len(ins_1.operands) == 2 and ins_1.operands[0].type == X86_OP_MEM \
                and ins_1.operands[1].type == X86_OP_IMM:
            return ins_1.address - ImageBase
    return 0
 
 
filename = 'bf3e495f43a6b333b10ae69667304cfd2c87e9100de9d31365671c7b6b93132e'
pe = pefile.PE(filename, fast_load=True)

memory_mapped_image = bytearray(pe.get_memory_mapped_image())
ImageBase = pe.OPTIONAL_HEADER.ImageBase

print('[+] Map PE')
BASE = 0x10000000
STACK_ADDR = 0x400000
STACK_SIZE = 1024 * 1024

mu = Uc(UC_ARCH_X86, UC_MODE_32)
mu.mem_map(BASE, 1024 * 1024)
mu.mem_map(STACK_ADDR, STACK_SIZE)

r_esp = STACK_ADDR + STACK_SIZE // 2
mu.reg_write(UC_X86_REG_ESP, r_esp)

# Map the file into emulator memory

mu.mem_write(BASE, pe.get_memory_mapped_image())
md = Cs(CS_ARCH_X86, CS_MODE_32)
md.detail = True
 
 
instruction_3 = []
def hook_code(mu, address, size, userdata):
    print(f'>>> Tracing instruction at {address:#x}, instruction size = {size:#x}')
    r_esp = mu.reg_read(UC_X86_REG_ESP)
    # The instruction counter is kept on the emulated stack at [esp + 4]
    count = u32(mu.mem_read(r_esp + 4, 4))
    print(f'count is {count}')

    if count == 2:
        # Third instruction reached: record its address and stop emulating
        instruction_3.append(address)
        mu.emu_stop()
        return
    count = count + 1
    mu.mem_write(r_esp + 4, p32(count))

mu.hook_add(UC_HOOK_CODE, hook_code)



def simulate_execute(ins_addr_rva):
    # Reset the counter before each run
    mu.mem_write(r_esp + 4, p32(0))

    mu.emu_start(ins_addr_rva + ImageBase, 0x100066E6)
 
reloc_data_rva = get_reloc_data_rva(pe)

ins_addr_rva_all = []
count = 0
for rva in reloc_data_rva:
    ins_addr_rva = get_intruction_start_rva(memory_mapped_image, rva, ImageBase)
    if ins_addr_rva != 0:
        ins_addr_rva_all.append(ins_addr_rva)
        simulate_execute(ins_addr_rva)
        # Get the address of the third instruction under linear (fall-through) execution
        code = memory_mapped_image[ins_addr_rva:ins_addr_rva + 40]
        ins = md.disasm(code, ImageBase + ins_addr_rva)

        ins_1 = next(ins)
        ins_2 = next(ins)

        try:
            ins_3 = next(ins)
            ins_3_address = ins_3.address
        except StopIteration:
            # The bytes after the jump could not be disassembled
            ins_3_address = 0
        ins.close()

        if instruction_3[count] == ins_3_address:
            # Jump not taken: NOP out the cmp/test and the jump
            size = ins_1.size + ins_2.size
            assembly = b'\x90' * size
            patch(memory_mapped_image, ImageBase, ImageBase + ins_addr_rva, assembly)
        else:
            # Jump taken: NOP out everything up to the jump target
            size = instruction_3[count] - ins_1.address
            assembly = b'\x90' * size
            patch(memory_mapped_image, ImageBase, ImageBase + ins_addr_rva, assembly)
        count = count + 1
 
 
for section in pe.sections:
    print(f'{section.Name}, VirtualAddress: {section.VirtualAddress:#x}, '
          f'Size: {section.SizeOfRawData:#x}, file offset: {section.PointerToRawData:#x}')
    pe.set_bytes_at_rva(section.VirtualAddress,
                        bytes(memory_mapped_image[section.VirtualAddress:section.VirtualAddress + section.SizeOfRawData]))

print('[+] Save to file ' + '1.bin')
pe.write('1.bin')
