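We can think of a program's code structure as shown in the figure below:
A program consists of multiple functions, and each function consists of multiple branches. We define functions and branches as follows:
Deobfuscation can therefore follow the code framework below: BFS over the functions first, then BFS over all branches inside each function, concatenating the deobfuscated code as the BFS proceeds. The advantage is that code belonging to the same function is kept together as much as possible, which makes it easier for IDA to recognise during decompilation.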
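To make the resulting layout concrete, here is a minimal sketch of the two-level BFS that runs outside IDA. The `PROGRAM` dictionary and the `layout_order` helper are hypothetical stand-ins for real disassembly, and the `seen` set is a simplification of the "already handled" check.

```python
from collections import deque

# Hypothetical toy program: branch address -> list of (mnemonic, target).
# A "call" target starts a new function; a "j*" target starts a new branch.
PROGRAM = {
    0x100: [("call", 0x200), ("jz", 0x110), ("ret", None)],
    0x110: [("ret", None)],
    0x200: [("jnz", 0x210), ("ret", None)],
    0x210: [("ret", None)],
}

def layout_order(entry_point, program):
    """Return branch addresses in the order the two-level BFS emits them."""
    order, seen = [], {entry_point}
    func_queue = deque([entry_point])
    while func_queue:
        func_address = func_queue.popleft()
        branch_queue = deque([func_address])
        while branch_queue:
            branch_address = branch_queue.popleft()
            order.append(branch_address)
            for mnem, target in program[branch_address]:
                if target is None or target in seen:
                    continue
                seen.add(target)
                if mnem == "call":
                    func_queue.append(target)    # handled after this function
                elif mnem.startswith("j"):
                    branch_queue.append(target)  # handled inside this function
    return order
```

With this toy input, both branches of the function at 0x100 are emitted before anything from the function at 0x200, which is the contiguity property described above.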
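When code is moved, CALL, JCC and other jump instructions must be re-encoded so that they still reach their original targets. This instruction fixing can be done with keystone-engine and capstone.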
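The arithmetic behind this fix can be shown without either library. The sketch below is not the keystone/capstone approach used in the post: it re-encodes a 5-byte `call rel32` or `jmp rel32` by hand, and `move_rel32` is a hypothetical helper name.

```python
import struct

def move_rel32(insn, old_ea, new_ea):
    """Re-encode a 5-byte `call rel32` (E8) or `jmp rel32` (E9) for a new address."""
    if len(insn) != 5 or insn[0] not in (0xE8, 0xE9):
        raise ValueError("only E8/E9 rel32 instructions are handled in this sketch")
    (rel,) = struct.unpack("<i", insn[1:])
    target = old_ea + 5 + rel          # absolute destination of the original
    new_rel = target - (new_ea + 5)    # displacement seen from the new location
    return bytes([insn[0]]) + struct.pack("<i", new_rel)
```

Disassembling at the old address and reassembling at the new one does the same computation while also covering every other instruction form, which is presumably why the post relies on the two libraries.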
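However, once deobfuscation is complete, most of the code in the program has moved, so the target addresses of all CALL, JCC and other jump instructions must be corrected as well. This is relocation.
The corrections can be maintained with a disjoint-set union (union-find).
The jump instructions of a program can be viewed as the structure on the left of the figure above, in which one jump instruction may jump to another jump instruction. With a disjoint-set union, the real addresses of instructions A, B, C, D and E can all be corrected to the real address of instruction E.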
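The chain A → B → C → D → E can be modelled with a small stand-alone union-find. `AddrDSU` is a hypothetical class for illustration only; it leaves out the segment checks that the post's RelocDSU performs.

```python
class AddrDSU:
    """Union-find over addresses; each root is the final ("real") address."""

    def __init__(self):
        self.parent = {}

    def find(self, ea):
        self.parent.setdefault(ea, ea)
        root = ea
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[ea] != root:      # path compression
            self.parent[ea], ea = root, self.parent[ea]
        return root

    def merge(self, ea, target):
        """Make the set containing `ea` resolve to the root of `target`."""
        self.parent[self.find(ea)] = self.find(target)


# Usage: a jump at 0xA leads to 0xB, which leads to 0xC, and so on up to 0xE.
dsu = AddrDSU()
for src, dst in [(0xA, 0xB), (0xB, 0xC), (0xC, 0xD), (0xD, 0xE)]:
    dsu.merge(src, dst)
```

After these merges, looking up any of the five addresses yields 0xE, and path compression makes later lookups of the same address a single step.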
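Pay attention to the following points when using a disjoint-set union to maintain relocations:
Attachment download link
Inspection shows that the program is made up of code blocks like the following:
Analysing how this code block executes shows that it essentially looks up the real instruction in a switch. This code block can be replaced by the instruction lea ecx, [esp+4].
First, we extract the code blocks from the program and record a few useful pieces of information:
While extracting this information we can also check whether the code block is valid, because analysis shows that the program inserts code with real functionality between the code blocks.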
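The "extract and validate at the same time" idea can be sketched on plain tuples instead of IDA's database. `parse_stub` and the tuple layout are assumptions made for this example; the mnemonic sequence is the one checked by the post's get_block.

```python
STUB_PATTERN = ["pushf", "pusha", "mov", "call", "pop"]

def parse_stub(insns):
    """insns: list of (mnemonic, operands) tuples starting at a candidate block.

    Returns a dict with 'reg', 'imm' and 'call_target' if the first five
    instructions match the dispatcher stub, or None if real code was found.
    """
    if len(insns) < len(STUB_PATTERN):
        return None
    info = {}
    for (mnem, ops), expected in zip(insns, STUB_PATTERN):
        if mnem != expected:
            return None
        if mnem == "mov":
            info["reg"], info["imm"] = ops
        elif mnem == "call":
            info["call_target"] = ops[0]
    return info
```

Returning None here plays the role that a failed assert plays in the post's script: it signals that the bytes at this address are ordinary instructions to be copied as-is.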
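Once a code block has been extracted, the recorded information can be used to find the real code for that block inside call_target. There are a few special cases here:
This involves RelocDSU, the disjoint-set union that maintains relocations; its code is given below. In the get function, if a jmp instruction whose operand is an immediate is encountered, the path is compressed to the jump target, until the address is in .got.plt or the instruction is not a jmp. In addition, whether code has already been processed is determined by whether the final address that an address maps to is outside the .text segment.
The next step is to extract the code of a single branch. As mentioned earlier, the program inserts code with real functionality between the code blocks, so try:...except:... and assert are needed to handle this. Beyond that, there are a few more special cases:
Once a branch can be handled, we can BFS through all functions and branches in turn. There are a few special cases here as well:
After the code has been deobfuscated it must be relocated. During relocation, pay attention to changes in the length of jmp instructions.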
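The length issue arises because a jmp can be encoded in 2 bytes (EB rel8) or 5 bytes (E9 rel32) depending on the distance to its target. The sketch below, with the hypothetical helper `encode_jmp`, picks the shortest encoding and pads it with NOPs to the slot the instruction originally occupied, so that following instructions do not shift.

```python
import struct

def encode_jmp(src, dst, slot_size):
    """Encode `jmp dst` at address `src`, padded with NOPs to `slot_size` bytes."""
    rel8 = dst - (src + 2)
    if -128 <= rel8 <= 127:
        code = b"\xEB" + struct.pack("<b", rel8)             # short jmp, 2 bytes
    else:
        code = b"\xE9" + struct.pack("<i", dst - (src + 5))  # near jmp, 5 bytes
    if len(code) > slot_size:
        raise ValueError("jmp no longer fits in its original slot")
    return code.ljust(slot_size, b"\x90")
```

The failure case is a short jmp whose relocated target falls outside rel8 range: the new encoding needs 5 bytes but only 2 are available, so padding alone cannot solve it.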
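Finally, the deobfuscated switch is not recognised properly by IDA. The reason is that the function that fetches the return address eip was patched into a mov reg, xxx instruction, so the code differs from the assembly a compiler emits by default (the program has PIE enabled, and IDA cannot correctly recognise a direct access to the jump table address). This code therefore has to be patched back.
To avoid disturbing the program's original data, I placed the repaired jump table at a different location. Two string global variables were also moved to the correct location.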
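In a PIE binary of this kind, each jump-table entry stores the target as a 32-bit offset from a base address, so moving both the table and its targets means recomputing every entry. A minimal sketch of that arithmetic follows; `rebase_table` and its `remap` dictionary are hypothetical names, with `remap` standing in for the relocation lookup.

```python
MASK32 = 0xFFFFFFFF

def rebase_table(entries, old_base, new_base, remap):
    """Rewrite base-relative 32-bit jump-table entries.

    entries -- raw dwords from the old table, each (target - old_base) mod 2**32
    remap   -- dict mapping an old target address to its relocated address
    """
    new_entries = []
    for raw in entries:
        old_target = (raw + old_base) & MASK32
        new_target = remap[old_target]
        new_entries.append((new_target - new_base) & MASK32)
    return new_entries
```

The masking matters because targets usually lie below the base, which makes the stored offsets negative when read as signed 32-bit values.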
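The final deobfuscation script is as follows:
Attachment download link
First, convert the data starting at 0x140010074, 0x140017EFA and 0x140018C67 into assembly.
Looking at the assembly, many code blocks jump to one another, so first split the code into blocks at each retn. Observation shows that these blocks fall into three kinds according to how many times call $+5;pop rax (that is, E8 00 00 00 00 58) appears:
0 occurrences:
Essentially other operations + retn.
1 occurrence:
This kind of block is essentially other operations + jmp target. Note that other operations may contain a branch.
2 occurrences:
This can be seen as two 1-occurrence blocks joined together, with the retn removed from the first. Because the first block has no retn, target1 stays on the stack after it runs. The second block then jumps to target2, and when the target2 block returns it returns to target1. This kind of block is therefore essentially equivalent to other operations + call target2, with target1 as the next block to execute.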
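The three-way classification can be sketched directly on raw bytes. `classify_block` is a hypothetical helper, and counting byte patterns is an assumption of this example: it could miscount if the same six bytes happened to occur inside another instruction's operand.

```python
GET_PC = bytes.fromhex("E8 00 00 00 00 58")  # call $+5 ; pop rax

def classify_block(code):
    """Map a block's raw bytes to the kind of control transfer it encodes."""
    count = code.count(GET_PC)
    kinds = {0: "ret", 1: "jmp", 2: "call"}
    if count not in kinds:
        raise ValueError("unexpected number of call $+5 ; pop rax sequences: %d" % count)
    return kinds[count]
```

Counting on decoded instructions rather than raw bytes would be more robust, at the cost of needing a disassembler in the loop.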
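We define the following key information for a code block Block:
The get_block function fetches the code block at a given address and extracts the relevant information. A block may contain meaningless instructions such as push xxx;pop xxx;, which can be removed by simulating the stack.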
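One way to simulate the stack for this purpose is sketched below on instruction strings. `strip_push_pop` is a hypothetical helper, and its rule is deliberately conservative: any instruction other than push/pop resets the simulated stack, so a pair is removed only when nothing but other cancelling pairs sits between its two halves.

```python
def strip_push_pop(insns):
    """Drop `push X` / `pop X` pairs that cancel out.

    insns is a list of strings such as "push rax". A pop whose operand differs
    from the matching push is kept, since that pair acts as a register move.
    """
    stack, dead = [], set()
    for i, insn in enumerate(insns):
        mnem, _, operand = insn.partition(" ")
        if mnem == "push":
            stack.append((i, operand.strip()))
        elif mnem == "pop" and stack and stack[-1][1] == operand.strip():
            dead.update((stack.pop()[0], i))   # both halves of the pair are dead
        else:
            stack.clear()                      # unknown effect: stop tracking
    return [insn for i, insn in enumerate(insns) if i not in dead]
```

Nested junk such as push rcx; push rdx; pop rdx; pop rcx is removed in full, because the inner pair is cancelled first and the outer pair then becomes adjacent on the simulated stack.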
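Once block information can be obtained, we can BFS over the functions and all branches within each function, extract the assembly and write it into the newcode segment. Note the following points:
Finally, relocate the code. Note that the effective instructions inside a code block may themselves include call instructions. These calls go to a PLT-like structure that jumps straight to the function pointed to by the function address table in the import table, and this case needs special handling.
Finally, the complete code: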
# Framework: BFS over functions, then BFS over the branches of each function.
func_queue = Queue()
func_queue.put(entry_point)
while not func_queue.empty():
    func_address = func_queue.get()
    branch_queue = Queue()
    branch_queue.put(func_address)
    while not branch_queue.empty():
        branch_address = branch_queue.get()
        ...
        if idc.print_insn_mnem(ea) == 'call':
            func_queue.put(call_target)
        elif idc.print_insn_mnem(ea)[0] == 'j':
            branch_queue.put(jcc_target)
        ...
# Instruction fixing: disassemble at the old address, reassemble at the new one.
def mov_code(ea, new_code_ea):
    return asm(disasm(idc.get_bytes(ea, idc.get_item_size(ea)), ea), new_code_ea)
.text:000048F4                 pushf
.text:000048F5                 pusha
.text:000048F6                 mov     cl, 3Fh ; '?'
.text:000048F8                 call    sub_44FA
.text:000048F8
.text:000048FD                 pop     eax
# Information recorded for one obfuscated code block.
class Block:
    def __init__(self, start_ea, end_ea, imm, reg, call_target):
        self.start_ea = start_ea
        self.end_ea = end_ea
        self.imm = imm
        self.reg = reg
        self.call_target = call_target

def get_block(start_ea):
    # imm, reg and call_target are plain locals: all five asserts must pass
    # before the return, so they are always assigned by then.
    mnem_list = ['pushf', 'pusha', 'mov', 'call', 'pop']
    ea = start_ea
    for i in range(5):
        mnem = idc.print_insn_mnem(ea)
        assert mnem == mnem_list[i]
        if mnem == 'mov':
            imm = idc.get_operand_value(ea, 1)
            reg = idc.print_operand(ea, 0)
        elif mnem == 'call':
            call_target = idc.get_operand_value(ea, 0)
        ea += idc.get_item_size(ea)
    return Block(start_ea, ea, imm, reg, call_target)
.text:000045CC                 popa
.text:000045CD                 popf
.text:000045CE                 pushf
.text:000045CF                 pusha
.text:000045D0                 call    dec_index
.text:000045D0
.text:000045D5                 popa
.text:000045D6                 popf
.text:000045D7                 retn
# Find the real code that a block selects inside call_target.
def get_real_code(block, new_code_ea):
    ea = block.call_target
    # Scan the switch for the `cmp reg, imm` that matches this block.
    while True:
        if idc.print_insn_mnem(ea) == 'cmp':
            reg = idc.print_operand(ea, 0)
            imm = idc.get_operand_value(ea, 1)
            if reg == block.reg and imm == block.imm:
                ea += idc.get_item_size(ea)
                break
        ea += idc.get_item_size(ea)
    assert idc.print_insn_mnem(ea) == 'jnz'
    ea += idc.get_item_size(ea)
    assert idc.print_insn_mnem(ea) == 'popa'
    ea += idc.get_item_size(ea)
    assert idc.print_insn_mnem(ea) == 'popf'
    ea += idc.get_item_size(ea)
    # popa; popf immediately followed by pushf is the ret case.
    if idc.print_insn_mnem(ea) == 'pushf':
        return True, asm('ret')
    new_code = b''
    while True:
        if idc.print_insn_mnem(ea) == 'jmp':
            jmp_ea = idc.get_operand_value(ea, 0)
            if idc.print_insn_mnem(jmp_ea) == 'pushf':
                break
            ea = jmp_ea
        else:
            code = mov_code(ea, new_code_ea)
            new_code += code
            new_code_ea += len(code)
            ea += get_item_size(ea)
    return False, new_code
# Disjoint-set union that maintains relocations.
class RelocDSU:
    def __init__(self):
        self.reloc = {}

    def get(self, ea):
        # Returns (final address, need_handle).
        if ea not in self.reloc:
            if idc.print_insn_mnem(ea) == 'jmp' and idc.get_operand_type(ea, 0) != idc.o_reg:
                jmp_ea = idc.get_operand_value(ea, 0)
                if idc.get_segm_name(jmp_ea) == '.got.plt':
                    self.reloc[ea] = ea
                    return self.reloc[ea], False
                self.reloc[ea], need_handle = self.get(idc.get_operand_value(ea, 0))
                return self.reloc[ea], need_handle
            else:
                self.reloc[ea] = ea
        if self.reloc[ea] != ea:
            self.reloc[ea] = self.get(self.reloc[ea])[0]
        # Code still in .text has not been processed yet.
        return self.reloc[ea], idc.get_segm_name(self.reloc[ea]) == '.text'

    def merge(self, ea, reloc_ea):
        self.reloc[self.get(ea)[0]] = self.get(reloc_ea)[0]

reloc = RelocDSU()
# Extract the deobfuscated code of one branch.
def handle_one_branch(branch_address, new_code_ea):
    new_code = b''
    ea = branch_address
    while True:
        try:
            block = get_block(ea)
            is_ret, real_code = get_real_code(block, new_code_ea)
            reloc.merge(ea, new_code_ea)
            ea = block.end_ea
            new_code_ea += len(real_code)
            new_code += real_code
            if is_ret:
                break
        except:
            # Not an obfuscated block: real code inserted between blocks.
            get_eip_func = {0x900: 'ebx', 0x435c: 'eax'}
            if idc.print_insn_mnem(ea) == 'call' and get_operand_value(ea, 0) in get_eip_func:
                reloc.merge(ea, new_code_ea)
                real_code = asm('mov %s, 0x%x' % (get_eip_func[get_operand_value(ea, 0)], ea + 5), new_code_ea)
            else:
                if idc.print_insn_mnem(ea) == 'jmp' and idc.get_operand_type(ea, 0) != idc.o_reg:
                    reloc.merge(new_code_ea, ea)
                else:
                    reloc.merge(ea, new_code_ea)
                real_code = mov_code(ea, new_code_ea)
            new_code += real_code
            if real_code == asm('ret'):
                break
            new_code_ea += len(real_code)
            if idc.print_insn_mnem(ea) == 'jmp' and idc.get_operand_type(ea, 0) != idc.o_reg:
                jmp_ea = idc.get_operand_value(ea, 0)
                if reloc.get(jmp_ea)[1] == False:
                    break
                ea = reloc.get(jmp_ea)[0]
            else:
                ea += get_item_size(ea)
    return new_code
# BFS over all functions and branches.
func_queue = Queue()
func_queue.put(entry_point)
while not func_queue.empty():
    func_address = func_queue.get()
    if reloc.get(func_address)[1] == False:
        continue
    reloc.merge(func_address, new_code_ea)
    branch_queue = Queue()
    branch_queue.put(func_address)
    if func_address == 0x4148:
        # This function contains the switch: queue every jump-table target.
        assert new_code_ea == 0x963d0
        for eax in range(0x20):
            jmp_target = (ida_bytes.get_dword(jmp_table[0] + eax * 4) + jmp_table[1]) & 0xFFFFFFFF
            new_jmp_target, need_handle = reloc.get(jmp_target)
            if need_handle:
                branch_queue.put(jmp_target)
    while not branch_queue.empty():
        branch_address = branch_queue.get()
        new_code = handle_one_branch(branch_address, new_code_ea)
        ida_bytes.patch_bytes(new_code_ea, new_code)
        ea = new_code_ea
        while ea < new_code_ea + len(new_code):
            idc.create_insn(ea)
            if idc.print_insn_mnem(ea) == 'call':
                call_target, need_handle = reloc.get(get_operand_value(ea, 0))
                if need_handle:
                    func_queue.put(call_target)
            elif idc.print_insn_mnem(ea)[0] == 'j' and idc.get_operand_type(ea, 0) != idc.o_reg:
                jcc_target, need_handle = reloc.get(get_operand_value(ea, 0))
                if need_handle == True:
                    branch_queue.put(jcc_target)
            ea += get_item_size(ea)
        new_code_ea += len(new_code)
# Relocation pass over the newly written code.
ea = new_code_start
while ea < new_code_ea:
    idc.create_insn(ea)
    mnem = idc.print_insn_mnem(ea)
    if mnem == 'call':
        call_target, need_handle = reloc.get(get_operand_value(ea, 0))
        assert need_handle == False
        ida_bytes.patch_bytes(ea, asm('call 0x%x' % (call_target), ea))
    elif mnem[0] == 'j' and idc.get_operand_type(ea, 0) != idc.o_reg:
        jcc_target, need_handle = reloc.get(get_operand_value(ea, 0))
        assert need_handle == False
        # The new encoding may be shorter; pad with NOPs to the original length.
        ida_bytes.patch_bytes(ea, asm('%s 0x%x' % (mnem, jcc_target), ea).ljust(idc.get_item_size(ea), b'\x90'))
    elif mnem == 'pushf':
        ida_bytes.patch_bytes(ea, b'\x90' * 9)
        ea += 9
        continue
    ea += get_item_size(ea)
# Rebuild the jump table at a new location and patch the get-eip call back.
new_jmp_table = (0xA6000 - 0x2D54, 0xA6000)
for eax in range(0x20):
    jmp_target = (ida_bytes.get_dword(jmp_table[0] + eax * 4) + jmp_table[1]) & 0xFFFFFFFF
    new_jmp_target, need_handle = reloc.get(jmp_target)
    assert need_handle == False
    ida_bytes.patch_dword(new_jmp_table[0] + eax * 4, (new_jmp_target - new_jmp_table[1]) & 0xFFFFFFFF)
need_patch_addr = 0x963D7
ida_bytes.patch_bytes(need_patch_addr, asm('call 0x900;add ebx, 0x%x' % (new_jmp_table[1] - (need_patch_addr + 5)), need_patch_addr))
# Move the two string globals together with the table.
ida_bytes.patch_bytes(new_jmp_table[1] - 0x2d7a, ida_bytes.get_bytes(jmp_table[1] - 0x2d7a, 0x26))
# Complete deobfuscation script (run inside IDA).
from queue import *
import ida_bytes
import ida_funcs  # needed for ida_funcs.add_func at the end of solve()
from idc import *
import idc
from keystone import *
from capstone import *

asmer = Ks(KS_ARCH_X86, KS_MODE_32)
disasmer = Cs(CS_ARCH_X86, CS_MODE_32)

def disasm(machine_code, addr=0):
    l = ""
    for i in disasmer.disasm(machine_code, addr):
        l += "{:8s} {};\n".format(i.mnemonic, i.op_str)
    return l.strip('\n')

def asm(asm_code, addr=0):
    l = b''
    for i in asmer.asm(asm_code, addr)[0]:
        l += bytes([i])
    return l

def print_asm(ea):
    print(disasm(idc.get_bytes(ea, idc.get_item_size(ea)), ea))

class RelocDSU:
    def __init__(self):
        self.reloc = {}

    def get(self, ea):
        if ea not in self.reloc:
            if idc.print_insn_mnem(ea) == 'jmp' and idc.get_operand_type(ea, 0) != idc.o_reg:
                jmp_ea = idc.get_operand_value(ea, 0)
                if idc.get_segm_name(jmp_ea) == '.got.plt':
                    self.reloc[ea] = ea
                    return self.reloc[ea], False
                self.reloc[ea], need_handle = self.get(idc.get_operand_value(ea, 0))
                return self.reloc[ea], need_handle
            else:
                self.reloc[ea] = ea
        if self.reloc[ea] != ea:
            self.reloc[ea] = self.get(self.reloc[ea])[0]
        return self.reloc[ea], idc.get_segm_name(self.reloc[ea]) == '.text'

    def merge(self, ea, reloc_ea):
        self.reloc[self.get(ea)[0]] = self.get(reloc_ea)[0]

reloc = RelocDSU()

class Block:
    def __init__(self, start_ea, end_ea, imm, reg, call_target):
        self.start_ea = start_ea
        self.end_ea = end_ea
        self.imm = imm
        self.reg = reg
        self.call_target = call_target

def mov_code(ea, new_code_ea):
    return asm(disasm(idc.get_bytes(ea, idc.get_item_size(ea)), ea), new_code_ea)

def get_real_code(block, new_code_ea):
    ea = block.call_target
    while True:
        if idc.print_insn_mnem(ea) == 'cmp':
            reg = idc.print_operand(ea, 0)
            imm = idc.get_operand_value(ea, 1)
            if reg == block.reg and imm == block.imm:
                ea += idc.get_item_size(ea)
                break
        ea += idc.get_item_size(ea)
    assert idc.print_insn_mnem(ea) == 'jnz'
    ea += idc.get_item_size(ea)
    assert idc.print_insn_mnem(ea) == 'popa'
    ea += idc.get_item_size(ea)
    assert idc.print_insn_mnem(ea) == 'popf'
    ea += idc.get_item_size(ea)
    if idc.print_insn_mnem(ea) == 'pushf':
        return True, asm('ret')
    new_code = b''
    while True:
        if idc.print_insn_mnem(ea) == 'jmp':
            jmp_ea = idc.get_operand_value(ea, 0)
            if idc.print_insn_mnem(jmp_ea) == 'pushf':
                break
            ea = jmp_ea
        else:
            code = mov_code(ea, new_code_ea)
            new_code += code
            new_code_ea += len(code)
            ea += get_item_size(ea)
    return False, new_code

def get_block(start_ea):
    # imm, reg and call_target are locals; all five asserts pass before return.
    mnem_list = ['pushf', 'pusha', 'mov', 'call', 'pop']
    ea = start_ea
    for i in range(5):
        mnem = idc.print_insn_mnem(ea)
        assert mnem == mnem_list[i]
        if mnem == 'mov':
            imm = idc.get_operand_value(ea, 1)
            reg = idc.print_operand(ea, 0)
        elif mnem == 'call':
            call_target = idc.get_operand_value(ea, 0)
        ea += idc.get_item_size(ea)
    return Block(start_ea, ea, imm, reg, call_target)

def handle_one_branch(branch_address, new_code_ea):
    new_code = b''
    ea = branch_address
    while True:
        try:
            block = get_block(ea)
            is_ret, real_code = get_real_code(block, new_code_ea)
            reloc.merge(ea, new_code_ea)
            ea = block.end_ea
            new_code_ea += len(real_code)
            new_code += real_code
            if is_ret:
                break
        except:
            # Not an obfuscated block: real code inserted between blocks.
            get_eip_func = {0x900: 'ebx', 0x435c: 'eax'}
            if idc.print_insn_mnem(ea) == 'call' and get_operand_value(ea, 0) in get_eip_func:
                reloc.merge(ea, new_code_ea)
                real_code = asm('mov %s, 0x%x' % (get_eip_func[get_operand_value(ea, 0)], ea + 5), new_code_ea)
            else:
                if idc.print_insn_mnem(ea) == 'jmp' and idc.get_operand_type(ea, 0) != idc.o_reg:
                    reloc.merge(new_code_ea, ea)
                else:
                    reloc.merge(ea, new_code_ea)
                real_code = mov_code(ea, new_code_ea)
            new_code += real_code
            if real_code == asm('ret'):
                break
            new_code_ea += len(real_code)
            if idc.print_insn_mnem(ea) == 'jmp' and idc.get_operand_type(ea, 0) != idc.o_reg:
                jmp_ea = idc.get_operand_value(ea, 0)
                if reloc.get(jmp_ea)[1] == False:
                    break
                ea = reloc.get(jmp_ea)[0]
            else:
                ea += get_item_size(ea)
    return new_code

def solve():
    entry_point = 0x48F4
    new_code_start = 0x96150
    new_code_ea = new_code_start
    jmp_table = (0x892ac, 0x8c000)

    # Clear the destination area and fill it with NOPs.
    for _ in range(0x10000):
        idc.del_items(new_code_ea + _)
    ida_bytes.patch_bytes(new_code_ea, 0x10000 * b'\x90')

    # BFS over all functions and branches.
    func_queue = Queue()
    func_queue.put(entry_point)
    while not func_queue.empty():
        func_address = func_queue.get()
        if reloc.get(func_address)[1] == False:
            continue
        reloc.merge(func_address, new_code_ea)
        branch_queue = Queue()
        branch_queue.put(func_address)
        if func_address == 0x4148:
            assert new_code_ea == 0x963d0
            for eax in range(0x20):
                jmp_target = (ida_bytes.get_dword(jmp_table[0] + eax * 4) + jmp_table[1]) & 0xFFFFFFFF
                new_jmp_target, need_handle = reloc.get(jmp_target)
                if need_handle:
                    branch_queue.put(jmp_target)
        while not branch_queue.empty():
            branch_address = branch_queue.get()
            new_code = handle_one_branch(branch_address, new_code_ea)
            ida_bytes.patch_bytes(new_code_ea, new_code)
            ea = new_code_ea
            while ea < new_code_ea + len(new_code):
                idc.create_insn(ea)
                if idc.print_insn_mnem(ea) == 'call':
                    call_target, need_handle = reloc.get(get_operand_value(ea, 0))
                    if need_handle:
                        func_queue.put(call_target)
                elif idc.print_insn_mnem(ea)[0] == 'j' and idc.get_operand_type(ea, 0) != idc.o_reg:
                    jcc_target, need_handle = reloc.get(get_operand_value(ea, 0))
                    if need_handle == True:
                        branch_queue.put(jcc_target)
                ea += get_item_size(ea)
            new_code_ea += len(new_code)

    # Relocation pass over the newly written code.
    ea = new_code_start
    while ea < new_code_ea:
        idc.create_insn(ea)
        mnem = idc.print_insn_mnem(ea)
        if mnem == 'call':
            call_target, need_handle = reloc.get(get_operand_value(ea, 0))
            assert need_handle == False
            ida_bytes.patch_bytes(ea, asm('call 0x%x' % (call_target), ea))
        elif mnem[0] == 'j' and idc.get_operand_type(ea, 0) != idc.o_reg:
            jcc_target, need_handle = reloc.get(get_operand_value(ea, 0))
            assert need_handle == False
            ida_bytes.patch_bytes(ea, asm('%s 0x%x' % (mnem, jcc_target), ea).ljust(idc.get_item_size(ea), b'\x90'))
        elif mnem == 'pushf':
            ida_bytes.patch_bytes(ea, b'\x90' * 9)
            ea += 9
            continue
        ea += get_item_size(ea)

    # Rebuild the jump table at a new location and patch the get-eip call back.
    new_jmp_table = (0xA6000 - 0x2D54, 0xA6000)
    for eax in range(0x20):
        jmp_target = (ida_bytes.get_dword(jmp_table[0] + eax * 4) + jmp_table[1]) & 0xFFFFFFFF
        new_jmp_target, need_handle = reloc.get(jmp_target)
        assert need_handle == False
        ida_bytes.patch_dword(new_jmp_table[0] + eax * 4, (new_jmp_target - new_jmp_table[1]) & 0xFFFFFFFF)
    need_patch_addr = 0x963D7
    ida_bytes.patch_bytes(need_patch_addr, asm('call 0x900;add ebx, 0x%x' % (new_jmp_table[1] - (need_patch_addr + 5)), need_patch_addr))
    ida_bytes.patch_bytes(new_jmp_table[1] - 0x2d7a, ida_bytes.get_bytes(jmp_table[1] - 0x2d7a, 0x26))

    for _ in range(0x10000):
        idc.del_items(new_code_ea + _)
    idc.jumpto(new_code_start)
    ida_funcs.add_func(new_code_start)
    print("finish")

solve()
Last edited by sky_123 on 2023-8-29 19:13.