[Original] Deobfuscating the OceanLotus glitch sample

Posted: 2022-4-14 21:48 (14116 views)

Author: Tangdouren

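The QiAnXin report "Analysis of an attack sample using obfuscation techniques similar to OceanLotus" [1] analyzes a sample that uses the same obfuscation method as APT32. This post deobfuscates that sample, based on the QiAnXin report and on the reference article and code it cites [2].

SHA256: bf3e495f43a6b333b10ae69667304cfd2c87e9100de9d31365671c7b6b93132e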

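As shown in the figure below, a cmp/test instruction compares data stored in the data section with an immediate value. The next instruction is a conditional jump that is taken or not depending on the result of the comparison. The malware uses this pattern to obfuscate the control flow and slow down reverse engineering.

Figure 1-1: Obfuscated code

Approach: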

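The DLL is currently loaded at base address 0x10000000. The data compared with the immediate value (operation_2) is stored at address operation_1 (0x1007EE93). If the load base of the DLL changes, operation_1 changes with it. So that the DLL still reads the right data after being rebased, the address of the code location that uses dword_1007EE93 is recorded in the relocation table; when the base changes, the loader patches operation_1 according to the addresses in the relocation table. The relocation table therefore holds the addresses of a large number of code locations that need fixing up.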
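To make the role of the relocation table concrete, here is a minimal sketch of how a 32-bit fixup could be applied when an image is rebased. The helper name apply_highlow_fixup, the toy byte buffer, the immediate 0x12345678 and the new base 0x20000000 are all assumptions made for illustration; only the operand address 0x1007EE93 and the base 0x10000000 come from the sample.

```python
import struct

def apply_highlow_fixup(image: bytearray, reloc_rva: int, old_base: int, new_base: int) -> int:
    """Add the rebasing delta to the 32-bit value stored at reloc_rva."""
    (value,) = struct.unpack_from("<I", image, reloc_rva)
    fixed = (value + (new_base - old_base)) & 0xFFFFFFFF
    struct.pack_into("<I", image, reloc_rva, fixed)
    return fixed

# Toy image holding "cmp dword ptr [0x1007EE93], 0x12345678" at offset 0.
# Encoding: 81 3D <disp32> <imm32>; the disp32 at offset 2 is the relocated operand.
image = bytearray(b"\x81\x3d" + struct.pack("<I", 0x1007EE93) + struct.pack("<I", 0x12345678))

# Pretend the DLL was rebased from 0x10000000 to 0x20000000 (hypothetical new base).
new_operand = apply_highlow_fixup(image, 2, 0x10000000, 0x20000000)
print(hex(new_operand))
```

The point for deobfuscation is the reverse reading: every relocation entry marks an address operand inside some instruction, so the entries can be used as an index of candidate cmp/test instructions.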

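We can read the addresses stored in the relocation table and check whether the instruction that contains each address is a cmp/test; this gives the addresses of the obfuscated instructions. Emulating the cmp/test and the jump then yields the address of the instruction executed next. Depending on whether the jump was taken, the corresponding instructions are replaced with NOPs.

Use Python's pefile library to obtain the RVAs stored in the relocation table.

Figure 2-1: Relocation table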

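The relocation table consists of a series of IMAGE_BASE_RELOCATION structures. Each structure is made up of VirtualAddress (DWORD), SizeOfBlock (DWORD) and the TypeOffset entries (SizeOfBlock - 8 bytes). Each relocation entry is 2 bytes: the high 4 bits are the type and the low 12 bits are the offset. The low 12 bits plus VirtualAddress give the RVA. Taking the first entry, 0x3031, as an example: the low 12 bits are 0x031, and adding 0x1000 gives 0x1031.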
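The arithmetic above can be checked with a few lines of standard-library Python. This is only a sketch: parse_reloc_block is a hypothetical helper, and the second entry (0x3045) is invented to make the toy block two entries long; pefile performs the equivalent parsing for real files.

```python
import struct

def parse_reloc_block(block: bytes):
    """Return (type, rva) pairs for one IMAGE_BASE_RELOCATION block."""
    block_va, size_of_block = struct.unpack_from("<II", block, 0)
    pairs = []
    # The 2-byte TypeOffset entries start right after the 8-byte header.
    for pos in range(8, size_of_block, 2):
        (entry,) = struct.unpack_from("<H", block, pos)
        reloc_type = entry >> 12          # high 4 bits
        offset = entry & 0x0FFF           # low 12 bits
        pairs.append((reloc_type, block_va + offset))
    return pairs

# Toy block: VirtualAddress=0x1000, SizeOfBlock=12, entries 0x3031 and 0x3045.
block = struct.pack("<IIHH", 0x1000, 12, 0x3031, 0x3045)
pairs = parse_reloc_block(block)
print([(t, hex(rva)) for t, rva in pairs])
```

Type 3 is IMAGE_REL_BASED_HIGHLOW, the 32-bit fixup used for address operands in 32-bit code.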

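parse_relocations_directory returns a list of BaseRelocationData objects. BaseRelocationData has two attributes, struct and entries. struct holds the VirtualAddress and SizeOfBlock of the IMAGE_BASE_RELOCATION structure. entries is a list of RelocationData objects, each of which contains a type and an RVA; the RVA is the low 12 bits plus VirtualAddress.

Figure 2-2: The struct of BaseRelocationData and the list of RelocationData objects

Figure 2-3: RVAs of the relocation entries

The reloc_data_rva list stores the RVAs of all relocation entries.

The code in this part is taken from

https://github.com/levanvn/APT32_Deobfuscate/blob/master/Type2/Script/Type2_Deobfuscate.py

The obfuscated instructions come in the following 5 forms.

Figure 2-4: Obfuscated instructions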

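When the default operand size is 32 bits, any instruction that operates on a word is one byte longer than the same instruction operating on a dword, because of the 0x66 prefix. The machine code in front of the relocated operand is therefore 2 or 3 bytes long. Disassemble starting 3 bytes or 2 bytes before the address that uses the relocation data and check whether the result is a cmp/test followed by a conditional jump; if it is, that is the address of the obfuscated instruction.

Set b to 3 initially and read the bytes. If the byte 3 positions back is 0x66, decrement b and check the disassembly; if it matches, return the address reloc_data_rva - b - 1. If that byte is not 0x66, decrement b and check the disassembly starting 2 bytes back; if it matches, return the address.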
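The prefix check can be illustrated on raw bytes without a disassembler. The sketch below is an assumption-laden simplification: candidate_start is a hypothetical helper, the immediates and NOP padding are invented, and it only picks the offset to try first. The script in this post additionally disassembles with capstone to confirm that a cmp/test and a conditional jump really start there.

```python
import struct

def candidate_start(code: bytes, reloc_off: int) -> int:
    """Guess where the instruction owning the relocated operand begins."""
    # With a 0x66 operand-size prefix: prefix + opcode + ModRM = 3 bytes.
    if reloc_off >= 3 and code[reloc_off - 3] == 0x66:
        return reloc_off - 3
    # Without the prefix: opcode + ModRM = 2 bytes.
    return reloc_off - 2

addr = struct.pack("<I", 0x1007EE93)
# cmp dword ptr [0x1007EE93], 0x12345678  ->  81 3D <disp32> <imm32>
cmp_dword = b"\x81\x3d" + addr + struct.pack("<I", 0x12345678)
# cmp word ptr [0x1007EE93], 0x1234       ->  66 81 3D <disp32> <imm16>
cmp_word = b"\x66\x81\x3d" + addr + struct.pack("<H", 0x1234)

# Put some NOP padding in front so the instructions do not start at offset 0.
padded_dword = b"\x90\x90\x90" + cmp_dword   # operand at offset 5
padded_word = b"\x90" + cmp_word             # operand at offset 4

start_dword = candidate_start(padded_dword, 5)
start_word = candidate_start(padded_word, 4)
print(start_dword, start_word)
```

A byte-only check like this can misfire when the byte 3 positions back happens to be 0x66 for unrelated reasons, which is why the disassembly check and the fallback to the 2-byte offset are still needed.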

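FileAlignment in the PE header defines the alignment of sections on disk, and SectionAlignment defines the alignment of sections in memory. Each section starts at an offset that is a multiple of the corresponding alignment value.

Figure 2-5: Alignment values

Figure 2-6: Addresses of the file once mapped into memory

The pefile function get_memory_mapped_image() returns data laid out the way the PE file is laid out in memory, so an RVA can be used directly as an index into it.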
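As a rough illustration of what "memory layout" means here, the following sketch copies section data from file offsets to virtual addresses. It is not pefile's implementation: align_up and map_sections are hypothetical helpers, and the alignment values 0x200 and 0x1000 and the one-section toy file are assumed for the example.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def map_sections(file_data: bytes, headers_size: int, sections, image_size: int) -> bytearray:
    """Build a memory-layout image.

    sections: iterable of (virtual_address, raw_offset, raw_size) tuples.
    """
    image = bytearray(image_size)                 # gaps stay zero-filled
    image[:headers_size] = file_data[:headers_size]
    for virtual_address, raw_offset, raw_size in sections:
        image[virtual_address:virtual_address + raw_size] = \
            file_data[raw_offset:raw_offset + raw_size]
    return image

# Toy file: 0x200 bytes of headers, then one 0x200-byte section stored at
# file offset 0x200 (FileAlignment) and mapped at RVA 0x1000 (SectionAlignment).
file_data = b"H" * 0x200 + b"C" * 0x200
image = map_sections(file_data, 0x200, [(0x1000, 0x200, 0x200)], align_up(0x1200, 0x1000))
print(hex(len(image)), chr(image[0x1000]))
```

In the toy file the section byte at file offset 0x200 ends up at index 0x1000 of the image, which is the property the script relies on when it slices memory_mapped_image by RVA.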

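Once the address of an obfuscated instruction is known, emulate 3 instructions (the cmp/test, the conditional jump and the instruction executed after it) and record the address of the third instruction.

A counter, count, tracks how many instructions have executed; it is kept in emulated memory at [esp+4]. When the third instruction is reached, its address is recorded and emulation stops.

The results are stored in the instruction_3 list.

Next, assume that none of the obfuscated jumps is taken and obtain the address of the third instruction by linear disassembly, ins_3 = next(ins). Compare that address with the address recorded in instruction_3 by the emulation: if they are equal, the jump was not taken; if they differ, it was taken. Linear disassembly sometimes fails at the third instruction; in that case the address is set to 0.

If the jump was not taken, overwrite the obfuscated test/cmp and the jump with 0x90. If the jump was taken, overwrite everything from the obfuscated instruction up to the jump target with 0x90. The bytes in between are junk, and merely replacing the js (or similar) instruction with "jmp target" would still disturb disassembly of the program.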
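The two patching cases can be sketched on a toy buffer as follows. nop_out is a hypothetical helper, the instruction bytes are invented for the example, and skipping backward targets is a choice made in this sketch rather than something the post specifies.

```python
import struct

NOP = 0x90

def nop_out(image: bytearray, start: int, fallthrough: int, executed_next: int) -> int:
    """Overwrite an always-same-way branch with NOPs; return bytes overwritten.

    start:         offset of the cmp/test instruction
    fallthrough:   offset right after the conditional jump
    executed_next: offset of the instruction the emulator reached third
    """
    if executed_next == fallthrough:
        end = fallthrough        # not taken: erase cmp/test + jcc only
    elif executed_next > start:
        end = executed_next      # taken forward: erase the junk up to the target too
    else:
        return 0                 # backward target: left untouched in this sketch
    image[start:end] = bytes([NOP]) * (end - start)
    return end - start

# cmp dword ptr [0x1007EE93], 0x12345678 ; js +3 ; 3 junk bytes ; ret
blob = (b"\x81\x3d" + struct.pack("<I", 0x1007EE93) + struct.pack("<I", 0x12345678)
        + b"\x78\x03" + b"\xde\xad\xbe" + b"\xc3")

not_taken = bytearray(blob)
n_not_taken = nop_out(not_taken, 0, 12, 12)   # emulator fell through to offset 12

taken = bytearray(blob)
n_taken = nop_out(taken, 0, 12, 15)           # emulator jumped to offset 15
print(n_not_taken, n_taken)
```

In the taken case the junk bytes between the jump and its target disappear along with the branch, so a linear disassembler runs cleanly into the instruction at the target.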

Use the function set_bytes_at_rva(rva, data) to modify the data in the PE image, then write it to a file.

[1]. Analysis of an attack sample using obfuscation techniques similar to OceanLotus
https://ti.qianxin.com/blog/articles/Obfuscation-techniques-similar-to-OceanLotus/
[2]. Type2_Deobfuscate.py
https://github.com/levanvn/APT32_Deobfuscate/blob/master/Type2/Script/Type2_Deobfuscate.py
[3]. pefile.py
https://github.com/erocarrera/pefile/blob/master/pefile.py
[4]. A beginner's introduction to reading the capstone and keystone source code
https://bbs.pediy.com/thread-258473-1.htm
[5]. Unicorn engine tutorial
https://bbs.pediy.com/thread-224330.htm#msg_header_h3_7

# Index of the base relocation directory among the data directories
reloc_table_num = pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BASERELOC']

# Get the VirtualAddress and Size of the relocation table
reloc_table = pe.OPTIONAL_HEADER.DATA_DIRECTORY[reloc_table_num]
reloc_table_rva = reloc_table.VirtualAddress
reloc_table_size = reloc_table.Size
print(f'Relocation table RVA: {reloc_table_rva:#x}, size: {reloc_table_size:#x}')

# Collect the RVA of every relocation entry
relocations = pe.parse_relocations_directory(reloc_table_rva, reloc_table_size)
reloc_data_rva = []
for i in relocations:
    for j in i.entries:
        # print(f'Relocation entry RVA: {j.rva:#x}')
        reloc_data_rva.append(j.rva)

class BaseRelocationData(DataContainer):
    """Holds base relocation information.

    struct:     IMAGE_BASE_RELOCATION structure
    entries:    list of relocation data (RelocationData instances)
    """

class RelocationData(DataContainer):
    """Holds relocation information.

    type:       Type of relocation
                The type string can be obtained by
                RELOCATION_TYPE[type]
    rva:        RVA of the relocation
    """

def get_intruction_start_rva(memory_data, reloc_data_rva, ImageBase):
    branch = ["JZ", "JP", "JO", "JS", "JG", "JB", "JA", "JL", "JE", "JNZ", "JNP", "JNO", "JNS", "JLE", "JNB", "JBE",
              "JGE", "JNE", "JAE"]
    b = 3
    for i in range(3):
        code = memory_data[reloc_data_rva - b: reloc_data_rva - b + 40]
        # Only try the 3-byte offset when a 0x66 operand-size prefix is present
        if b == 3 and code[0] != 0x66:
            b = b - 1
            continue
        b = b - 1
        try:
            ins = md.disasm(code, ImageBase + reloc_data_rva - b - 1)
            ins_1 = next(ins)
            ins_2 = next(ins)
            ins.close()
        except StopIteration:
            continue
        # Match "cmp/test [mem], imm" immediately followed by a conditional jump
        if (ins_1.mnemonic == 'cmp' or ins_1.mnemonic == 'test') and ins_2.mnemonic.upper() in branch \
                and len(ins_1.operands) == 2 and ins_1.operands[0].type == X86_OP_MEM \
                and ins_1.operands[1].type == X86_OP_IMM:
            return ins_1.address - ImageBase
    return 0

# Load the PE and copy its memory-mapped image into the emulator
pe = pefile.PE(filename, fast_load=True)
content = pe.get_memory_mapped_image()
mu.mem_write(0x10000000, content)

instruction_3 = []

def hook_code(mu, address, size, userdata):
    print(f'>>> Tracing instruction at {address:#x}, instruction size = {size:#x}')
    r_esp = mu.reg_read(UC_X86_REG_ESP)
    # The instruction counter is kept in emulated memory at [esp+4]
    count = u32(mu.mem_read(r_esp + 4, 4))
    print(f'count is {count}')
    if count == 2:
        # Third instruction reached: record its address and stop emulating
        instruction_3.append(address)
        mu.emu_stop()
    # Keep counting so that a later hook call cannot record a second address
    count = count + 1
    mu.mem_write(r_esp + 4, p32(count))

def simulate_execute(ins_addr_rva):
    # Reset the counter, then start at the obfuscated instruction
    mu.mem_write(r_esp + 4, p32(0))
    mu.emu_start(ins_addr_rva + ImageBase, 0x100066E6)

reloc_data_rva = get_reloc_data_rva(pe)
for rva in reloc_data_rva:
    ins_addr_rva = get_intruction_start_rva(memory_mapped_image, rva, ImageBase)
    if ins_addr_rva != 0:
        simulate_execute(ins_addr_rva)

        # Address of the third instruction under linear (fall-through) disassembly
        code = memory_mapped_image[ins_addr_rva:ins_addr_rva+40]
        ins = md.disasm(code, ImageBase + ins_addr_rva)
        ins_1 = next(ins)
        ins_2 = next(ins)
        try:
            ins_3 = next(ins)
            ins_3_address = ins_3.address
        except StopIteration:
            # The bytes after the jump could not be disassembled
            ins_3_address = 0
        ins.close()

        if instruction_3[count] == ins_3_address:
            # Jump not taken: NOP out only the cmp/test and the jump
            size = ins_1.size + ins_2.size
            assembly = b'\x90' * size
            patch(memory_mapped_image, ImageBase, ImageBase + ins_addr_rva, assembly)
        else:
            # Jump taken: NOP out everything up to the jump target
            size = instruction_3[count] - ins_1.address
            assembly = b'\x90' * size
            patch(memory_mapped_image, ImageBase, ImageBase + ins_addr_rva, assembly)
        count = count + 1

# Copy each patched section back into the PE object and save the result
for section in pe.sections:
    print(f'{section.Name}, VirtualAddress: {section.VirtualAddress:#x}, '
          f'Size: {section.SizeOfRawData:#x}, file offset: {section.PointerToRawData:#x}')
    pe.set_bytes_at_rva(section.VirtualAddress,
                        bytes(memory_mapped_image[section.VirtualAddress:section.VirtualAddress + section.SizeOfRawData]))

print('[+] Save to file ' + '1.bin')
pe.write('1.bin')

# -*- coding: utf-8 -*-
import struct

import pefile
from capstone import Cs, CS_ARCH_X86, CS_MODE_32
from capstone.x86 import X86_OP_IMM, X86_OP_MEM
from unicorn import Uc, UC_ARCH_X86, UC_HOOK_CODE, UC_MODE_32
from unicorn.x86_const import UC_X86_REG_ESP


def u32(data):
    return struct.unpack("<I", data)[0]


def p32(num):
    return struct.pack("<I", num)


def patch(image, image_base, address, patch_data):
    '''
    :param image: memory-mapped image data (bytearray)
    :param image_base: image base address
    :param address: VA to patch (ImageBase + RVA)
    :param patch_data: bytes to write
    :return:
    '''
    i = 0
    for b in patch_data:
        image[address - image_base + i] = b
        i += 1


def get_reloc_data_rva(pefile_struct):
    '''
    :param pefile_struct: pefile.PE object
    :return: list of the RVAs of all relocation entries
    '''
    # Index of the relocation table (pefile.py, line 146)
    reloc_table_num = pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BASERELOC']

    # Get the VirtualAddress and Size of the relocation table
    reloc_table = pefile_struct.OPTIONAL_HEADER.DATA_DIRECTORY[reloc_table_num]
    reloc_table_rva = reloc_table.VirtualAddress
    reloc_table_size = reloc_table.Size
    print(f'Relocation table RVA: {reloc_table_rva:#x}, size: {reloc_table_size:#x}')

    # The table is a series of IMAGE_BASE_RELOCATION structures, each made up of
    # VirtualAddress (DWORD), SizeOfBlock (DWORD) and TypeOffset (SizeOfBlock - 8).
    # parse_relocations_directory returns a list of BaseRelocationData objects.
    relocations = pefile_struct.parse_relocations_directory(reloc_table_rva, reloc_table_size)

    # Collect the RVA of every relocation entry
    reloc_data_rva = []
    for i in relocations:
        # BaseRelocationData has two attributes, struct and entries.
        # struct:  VirtualAddress and SizeOfBlock of the IMAGE_BASE_RELOCATION
        # entries: list of RelocationData instances (type and rva)
        # print(i.struct)
        for j in i.entries:
            reloc_data_rva.append(j.rva)
    return reloc_data_rva


def get_intruction_start_rva(memory_data, reloc_data_rva, ImageBase):
    '''
    :param memory_data: file data as mapped into memory
    :param reloc_data_rva: RVA of a relocation entry
    :param ImageBase: ImageBase
    :return: RVA of the obfuscated instruction, or 0 if none was found
    '''
    branch = ["JZ", "JP", "JO", "JS", "JG", "JB", "JA", "JL", "JE", "JNZ", "JNP", "JNO", "JNS", "JLE", "JNB", "JBE",
              "JGE", "JNE", "JAE"]
    b = 3
    for i in range(3):
        code = memory_data[reloc_data_rva - b: reloc_data_rva - b + 40]
        # Only try the 3-byte offset when a 0x66 operand-size prefix is present
        if b == 3 and code[0] != 0x66:
            b = b - 1
            continue
        b = b - 1
        try:
            ins = md.disasm(code, ImageBase + reloc_data_rva - b - 1)
            ins_1 = next(ins)
            ins_2 = next(ins)
            ins.close()
        except StopIteration:
            continue
        # Match "cmp/test [mem], imm" immediately followed by a conditional jump
        if (ins_1.mnemonic == 'cmp' or ins_1.mnemonic == 'test') and ins_2.mnemonic.upper() in branch \
                and len(ins_1.operands) == 2 and ins_1.operands[0].type == X86_OP_MEM \
                and ins_1.operands[1].type == X86_OP_IMM:
            return ins_1.address - ImageBase
    return 0


filename = 'bf3e495f43a6b333b10ae69667304cfd2c87e9100de9d31365671c7b6b93132e'
pe = pefile.PE(filename, fast_load=True)

memory_mapped_image = bytearray(pe.get_memory_mapped_image())
ImageBase = pe.OPTIONAL_HEADER.ImageBase

print('[+] Map PE')
BASE = ImageBase  # 0x10000000 for this sample
STACK_ADDR = 0x400000
STACK_SIZE = 1024 * 1024

mu = Uc(UC_ARCH_X86, UC_MODE_32)
mu.mem_map(BASE, 1024 * 1024)
mu.mem_map(STACK_ADDR, STACK_SIZE)

r_esp = STACK_ADDR + STACK_SIZE // 2
mu.reg_write(UC_X86_REG_ESP, r_esp)

# Copy the memory-mapped file into the emulator
mu.mem_write(BASE, bytes(memory_mapped_image))

md = Cs(CS_ARCH_X86, CS_MODE_32)
md.detail = True

instruction_3 = []


def hook_code(mu, address, size, userdata):
    print(f'>>> Tracing instruction at {address:#x}, instruction size = {size:#x}')
    r_esp = mu.reg_read(UC_X86_REG_ESP)
    # The instruction counter is kept in emulated memory at [esp+4]
    count = u32(mu.mem_read(r_esp + 4, 4))
    print(f'count is {count}')

    if count == 2:
        # Third instruction reached: record its address and stop emulating
        instruction_3.append(address)
        mu.emu_stop()
    # Keep counting so that a later hook call cannot record a second address
    count = count + 1
    mu.mem_write(r_esp + 4, p32(count))


mu.hook_add(UC_HOOK_CODE, hook_code)


def simulate_execute(ins_addr_rva):
    # Reset the counter, then start at the obfuscated instruction.
    # The end address is the one used in the original script; the hook
    # stops emulation once the third instruction is reached.
    mu.mem_write(r_esp + 4, p32(0))
    mu.emu_start(ins_addr_rva + ImageBase, 0x100066E6)


reloc_data_rva = get_reloc_data_rva(pe)

ins_addr_rva_all = []
count = 0
for rva in reloc_data_rva:
    ins_addr_rva = get_intruction_start_rva(memory_mapped_image, rva, ImageBase)
    if ins_addr_rva != 0:
        ins_addr_rva_all.append(ins_addr_rva)
        simulate_execute(ins_addr_rva)

        # Address of the third instruction under linear (fall-through) disassembly
        code = memory_mapped_image[ins_addr_rva:ins_addr_rva+40]
        ins = md.disasm(code, ImageBase + ins_addr_rva)
        ins_1 = next(ins)
        ins_2 = next(ins)
        try:
            ins_3 = next(ins)
            ins_3_address = ins_3.address
        except StopIteration:
            # The bytes after the jump could not be disassembled
            ins_3_address = 0
        ins.close()

        if instruction_3[count] == ins_3_address:
            # Jump not taken: NOP out only the cmp/test and the jump
            size = ins_1.size + ins_2.size
            assembly = b'\x90' * size
            patch(memory_mapped_image, ImageBase, ImageBase + ins_addr_rva, assembly)
        else:
            # Jump taken: NOP out everything up to the jump target
            size = instruction_3[count] - ins_1.address
            assembly = b'\x90' * size
            patch(memory_mapped_image, ImageBase, ImageBase + ins_addr_rva, assembly)
        count = count + 1

# Copy each patched section back into the PE object and save the result
for section in pe.sections:
    print(f'{section.Name}, VirtualAddress: {section.VirtualAddress:#x}, '
          f'Size: {section.SizeOfRawData:#x}, file offset: {section.PointerToRawData:#x}')
    pe.set_bytes_at_rva(section.VirtualAddress,
                        bytes(memory_mapped_image[section.VirtualAddress:section.VirtualAddress + section.SizeOfRawData]))

print('[+] Save to file ' + '1.bin')
pe.write('1.bin')
