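Original crackme thread: http://bbs.pediy.com/showthread.php?t=46308
Load the target in OD and run the Ultra String Reference plugin; it turns up key strings such as "错误" ("Error") and "注册失败" ("Registration failed").
004019C5 E8 D6F8FFFF call 004012A0 //key call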
004019CA 85C0 test eax, eax
004019CC 75 15 jnz short 004019E3
004019CE 50 push eax
004019CF 68 A0B24000 push 0040B2A0 ; |错误 ("Error")
004019D4 68 A8B24000 push 0040B2A8
004019D9 50 push eax
004019DA FF15 08B14000 call dword ptr [<&USER32.MessageBoxA>>;
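The return value of call 004012A0 decides which way the program goes; I did not see any "registration succeeded" message, though.
Stepping into this call reveals the VM:
004012A0 $ 68 FC834100 push 004183FC //this address points to the VM machine code
004012A5 .- E9 52010100 jmp 004113FC //jump to the VM interpreter code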
004113FC 50 push eax
004113FD 53 push ebx
004113FE 51 push ecx
004113FF 52 push edx
00411400 56 push esi
00411401 57 push edi
00411402 55 push ebp
00411403 9C pushfd
00411404 8B7424 20 mov esi, dword ptr [esp+20]
//esi now holds the 004183FC pushed earlier (it sits 0x20 bytes up, above the 8 saved dwords), so it acts as VM_EIP
00411408 8BEC mov ebp, esp
0041140A 81EC 00020000 sub esp, 200
00411410 8BFC mov edi, esp
00411412 83EC 40 sub esp, 40
00411415 8BDE mov ebx, esi
00411417 0FB606 movzx eax, byte ptr [esi]
//fetch the op; an op is 1 byte long
0041141A 8D76 01 lea esi, dword ptr [esi+1]
0041141D FF2485 00104100 jmp dword ptr [eax*4+411000]
//the table at 411000 holds the handler address for each op; the code at that address implements the op's action and meaning
00411424 8D87 00010000 lea eax, dword ptr [edi+100]
0041142A 3BC5 cmp eax, ebp //guards against overflow
0041142C ^ 7C E7 jl short 00411415
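The fetch/dispatch sequence at 00411415 can be modelled in C roughly as follows. This is only a sketch: the `Vm` struct, `run`, and the two toy handlers are hypothetical stand-ins. Opcodes CA (push dword argument) and C1 (drop one stack slot) are borrowed from the handlers analysed further down; the real interpreter jumps through the table at 411000 rather than calling through it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the interpreter state; field names are illustrative. */
typedef struct {
    const uint8_t *ip;   /* plays the role of esi (VM_EIP) */
    const uint8_t *end;  /* one past the last byte of VM code */
    uint32_t stack[16];  /* stands in for the native stack used as scratch */
    size_t sp;           /* number of occupied stack slots */
} Vm;

typedef void (*Handler)(Vm *);

/* CA: read a little-endian dword argument from the stream and push it. */
static void op_push_imm(Vm *vm) {
    assert(vm->end - vm->ip >= 4 && vm->sp < 16);
    uint32_t arg = (uint32_t)vm->ip[0] | (uint32_t)vm->ip[1] << 8 |
                   (uint32_t)vm->ip[2] << 16 | (uint32_t)vm->ip[3] << 24;
    vm->ip += 4;                 /* add esi,4 */
    vm->stack[vm->sp++] = arg;   /* push eax */
}

/* C1: discard the top stack slot (add esp,4). */
static void op_drop(Vm *vm) {
    assert(vm->sp > 0);
    vm->sp--;
}

/* Run `len` bytes of VM code; returns 0 on success, -1 on an unused opcode. */
static int run(Vm *vm, const uint8_t *code, size_t len) {
    Handler table[256] = {0};    /* counterpart of the table at 411000 */
    table[0xCA] = op_push_imm;
    table[0xC1] = op_drop;
    vm->ip = code;
    vm->end = code + len;
    vm->sp = 0;
    while (vm->ip < vm->end) {
        uint8_t op = *vm->ip++;  /* movzx eax,byte [esi] ; lea esi,[esi+1] */
        if (table[op] == NULL)
            return -1;           /* like a 0xCCCCCCCC slot in the real table */
        table[op](vm);           /* jmp [eax*4+411000] */
    }
    return 0;
}
```

Feeding it `CA D4 00 00 00 CA 01 00 00 00 C1` leaves one slot on the model stack, holding 0xD4.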
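The VM is a form of code transformation: the VM interpreter routine executes the VM machine code directly as x86 assembly rather than as VM assembly, so N x86 instructions stand for a single VM assembly instruction, which makes analysis much harder.
My goal is to translate the VM machine code into VM assembly and then map that back to x86 while keeping the semantics unchanged:
The data at 411000 looks like this: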
49 17 41 00 CC CC CC CC CC CC CC CC CC CC CC CC CC CC CC CC 9A 15 41 00 4E 1B 41 00 42 16 41 00 CC CC CC CC CC CC CC CC 35 1B 41 00 BA 1B 41 00…………
Process it to get every op and its handler address:
#include <stdio.h>
#include <windows.h>
/* Handler table dumped from 00411000; 0xCCCCCCCC marks an unused slot. */
DWORD optable[]={0x00411749,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0x0041159A,0x00411B4E,0x00411642,0xCCCCCCCC,0xCCCCCCCC,0x00411B35,0x00411BBA,0xCCCCCCCC,0x00411538,0xCCCCCCCC,0x00411865,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0x00411AE3,0xCCCCCCCC,0xCCCCCCCC,0x00411B90,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC,0x00411C60,0x00411C37,0x00411A26,0x00411E2B,0xCCCCCCCC,0xCCCCCCCC,0x00411A0D,0x00411D30,0x00411A78,0xCCCCCCCC,0xCCCCCCCC,0x004115DF,0xCCCCCCCC,0xCCCCCCCC,0xCCCCCCCC /* …… remaining entries elided in the post */};
int main(int argc, char* argv[])
{
    size_t i, n = sizeof(optable) / sizeof(optable[0]);
    int num = 0;
    /* Bound the loop by the array size: the table has no 0 terminator. */
    for (i = 0; i < n; i++)
    {
        if (optable[i] != 0xCCCCCCCC)
        { printf("%02x %x\n", (unsigned)i, (unsigned)optable[i]); num++; }
    }
    printf("%d\n", num); /* number of ops that have a handler */
    return 0;
}
00 411749
05 41159a
06 411b4e
07 411642
0a 411b35
0b 411bba
0d 411538
0f 411865
16 411ae3
19 411b90
1f 411c60
………not listing them all; there are 62 ops with handler addresses in total.
Now let's see what the VM machine code itself looks like, at 004183FC:
6B CA D4 00 00 00 8A 2C 00 00 00 07 AC 2C 00 00 00 C1 82 CA 08 D0 40 00 CA 01 00 00 00 CA FF FF
FF FF CA FF FF FF FF 83 8A 14 00 00 00 96 AC 14 00 00 00 C1 E3 8A 2C 00 00 00 8A 14 00 00 00 76
AC 14 00 00 00 C1 E3 8A 14 00 00 00 CA D0 00 00 00 CA 01 00 00 00 CA FF FF FF FF CA 2C 00 00 00
83 AD C1 C1 E3……..
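Let's find a way to turn it into a more readable format.
Analysing what each op does lets us split the ops roughly into two classes:
Class 1 is the op+arg(dword) form: CA: D4 00 00 00; 8A: 2C 00 00 00; AC: 2C 00 00 00 …
There are not many of these: String[] op1={"0D","8A","AC","CA","A1","AF","0F"};
Everything else goes into class 2.
With this classification the VM machine code can be rewritten into a shape we are more used to: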
4183fc: 6B
4183fd: CA 000000D4
418402: 8A 0000002C
418407: 07
418408: AC 0000002C
41840d: C1
41840e: 82
41840f: CA 0040D008
418414: CA 00000001
418419: CA FFFFFFFF
41841e: CA FFFFFFFF
418423: 83
418424: 8A 00000014
418429: 96
41842a: AC 00000014
41842f: C1
418430: E3
Much more familiar now, isn't it?
I did this transformation with regular expressions, written in Java…
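The Java/regex code itself is not shown in the post. As a rough stand-in, the same split can be done by walking the bytes directly. The sketch below is an assumption about how such a decoder might look, not the author's code: `has_dword_arg` hard-codes the class-1 list given above, and `VmInsn`, `decode_one` and `dump` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint8_t op;
    int has_arg;    /* 1 for class-1 ops (op + dword), 0 otherwise */
    uint32_t arg;   /* meaningful only when has_arg is 1 */
} VmInsn;

/* Class-1 ops from the list above: 0D 8A AC CA A1 AF 0F. */
static int has_dword_arg(uint8_t op) {
    static const uint8_t class1[] = {0x0D, 0x8A, 0xAC, 0xCA, 0xA1, 0xAF, 0x0F};
    for (size_t i = 0; i < sizeof class1; i++)
        if (class1[i] == op)
            return 1;
    return 0;
}

/* Decode the instruction at code[pos]; returns its length, or 0 at end/truncation. */
static size_t decode_one(const uint8_t *code, size_t len, size_t pos, VmInsn *out) {
    assert(code != NULL && out != NULL);
    if (pos >= len)
        return 0;
    out->op = code[pos];
    out->has_arg = has_dword_arg(out->op);
    out->arg = 0;
    if (!out->has_arg)
        return 1;
    if (len - pos < 5)
        return 0;   /* dword argument is cut off */
    out->arg = (uint32_t)code[pos + 1] | (uint32_t)code[pos + 2] << 8 |
               (uint32_t)code[pos + 3] << 16 | (uint32_t)code[pos + 4] << 24;
    return 5;
}

/* Print "addr: OP [ARG]" lines in the style of the listing above. */
static void dump(const uint8_t *code, size_t len, uint32_t base) {
    size_t pos = 0, n;
    VmInsn in;
    while ((n = decode_one(code, len, pos, &in)) != 0) {
        if (in.has_arg)
            printf("%x: %02X %08X\n", (unsigned)(base + pos), in.op, (unsigned)in.arg);
        else
            printf("%x: %02X\n", (unsigned)(base + pos), in.op);
        pos += n;
    }
}
```

Calling `dump` on the bytes from 004183FC with `base` set to 0x4183fc should reproduce the listing format shown above.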
Continuing the transformation:
Next we need to work out the exact semantics of each op. A few examples:
1. 6b 411497:
00411497 8B85 00000000 mov eax, dword ptr [ebp]
0041149D 8987 10000000 mov dword ptr [edi+10], eax ; edi+10::VM_EFL
004114A3 81C5 04000000 add ebp, 4
004114A9 8B85 00000000 mov eax, dword ptr [ebp]
004114AF 8987 34000000 mov dword ptr [edi+34], eax ; edi+34::VM_EBP
004114B5 81C5 04000000 add ebp, 4
004114BB 8B85 00000000 mov eax, dword ptr [ebp]
004114C1 8987 28000000 mov dword ptr [edi+28], eax ; edi+28::VM_edi
004114C7 81C5 04000000 add ebp, 4
004114CD 8B85 00000000 mov eax, dword ptr [ebp]
004114D3 8987 08000000 mov dword ptr [edi+8], eax ; edi+8::VM_esi
004114D9 81C5 04000000 add ebp, 4
004114DF 8B85 00000000 mov eax, dword ptr [ebp]
004114E5 8987 0C000000 mov dword ptr [edi+C], eax ; edi+c::VM_edx
004114EB 81C5 04000000 add ebp, 4
004114F1 8B85 00000000 mov eax, dword ptr [ebp]
004114F7 8987 30000000 mov dword ptr [edi+30], eax ; edi+30::VM_ecx
004114FD 81C5 04000000 add ebp, 4
00411503 8B85 00000000 mov eax, dword ptr [ebp]
00411509 8987 1C000000 mov dword ptr [edi+1C], eax ; edi+1c::VM_ebx
0041150F 81C5 04000000 add ebp, 4
00411515 8B85 00000000 mov eax, dword ptr [ebp]
0041151B 8987 14000000 mov dword ptr [edi+14], eax ; edi+14::VM_eax
00411521 81C5 04000000 add ebp, 4
00411527 81C5 04000000 add ebp, 4
0041152D 89AF 2C000000 mov dword ptr [edi+2C], ebp ; edi+2c::VM_esp
00411533 ^ E9 ECFEFFFF jmp 00411424
Define each VM_reg from this correspondence (each VM_reg stores the matching x86 reg).
2. 8a 411447
00411447 8B06 mov eax, dword ptr [esi]
00411449 83C6 04 add esi, 4
0041144C 8B0407 mov eax, dword ptr [edi+eax]
0041144F 50 push eax
00411450 ^ EB D2 jmp short 00411424
Push VM_reg;
Which reg it is depends on the operand arg that 8A carries, i.e. reg=f(arg);
This function f is the correspondence from example 1: f(2c)=VM_ESP
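Spelled out as code, f is just a lookup from context offset to register. The table below is copied from the annotations on handler 6b; the function names `vm_reg_name` and `vm_reg_offset` and the lower-case `vm_` spelling are illustrative choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets into the VM context at edi, as annotated in handler 6b. */
static const struct { uint32_t off; const char *name; } vm_regs[] = {
    {0x08, "vm_esi"}, {0x0C, "vm_edx"}, {0x10, "vm_efl"},
    {0x14, "vm_eax"}, {0x1C, "vm_ebx"}, {0x28, "vm_edi"},
    {0x2C, "vm_esp"}, {0x30, "vm_ecx"}, {0x34, "vm_ebp"},
};

/* f(arg): name of the VM register at context offset `off`, or NULL if unknown. */
static const char *vm_reg_name(uint32_t off) {
    for (size_t i = 0; i < sizeof vm_regs / sizeof vm_regs[0]; i++)
        if (vm_regs[i].off == off)
            return vm_regs[i].name;
    return NULL;
}

/* Inverse of f: context offset for a register name, or -1 if unknown. */
static long vm_reg_offset(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof vm_regs / sizeof vm_regs[0]; i++)
        if (strcmp(vm_regs[i].name, name) == 0)
            return (long)vm_regs[i].off;
    return -1;
}
```

With this table `8A 0000002C` reads as push vm_esp and `AC 00000014` as pop vm_eax.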
3. ca 411452
00411452 8B06 mov eax, dword ptr [esi]
00411454 83C6 04 add esi, 4
00411457 50 push eax
00411458 ^ EB CA jmp short 00411424
Push arg
4. 83 41145a
0041145A BA 00000000 mov edx, 0
0041145F B9 00000000 mov ecx, 0
00411464 8B0424 mov eax, dword ptr [esp]
00411467 85C0 test eax, eax
00411469 0F4D1407 cmovge edx, dword ptr [edi+eax]
0041146D 8B4424 04 mov eax, dword ptr [esp+4]
00411471 85C0 test eax, eax
00411473 0F4D0C07 cmovge ecx, dword ptr [edi+eax]
00411477 0FAF4C24 08 imul ecx, dword ptr [esp+8]
0041147C 034C24 0C add ecx, dword ptr [esp+C]
00411480 03D1 add edx, ecx
00411482 83C4 10 add esp, 10
00411485 52 push edx
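Add esp,10
Push g([esp],[esp+4],[esp+8],[esp+c]) //== f([esp])+f([esp+4])*[esp+8]+[esp+c]; the four slots are read before the add, and f(x)=0 when x is negative (the cmovge does not fire)
For example: push (VM_ESI+VM_ESI*1+0)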
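Read as a formula, handler 83 computes an x86-style effective address, base + index*scale + disp, where a negative register selector (FFFFFFFF in the bytecode) means "no register". A minimal C model follows, assuming the VM context is the 0x200-byte area at edi indexed by the offsets from handler 6b; `ctx_read`, `f` and `vm_lea` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define CTX_SIZE 0x200   /* sub esp,200 ; mov edi,esp */

/* Read the little-endian dword stored at byte offset `off` of the VM context. */
static uint32_t ctx_read(const uint8_t *ctx, uint32_t off) {
    assert(off <= CTX_SIZE - 4);
    return (uint32_t)ctx[off] | (uint32_t)ctx[off + 1] << 8 |
           (uint32_t)ctx[off + 2] << 16 | (uint32_t)ctx[off + 3] << 24;
}

/* f(sel): register value, or 0 when sel is negative (the cmovge is skipped). */
static uint32_t f(const uint8_t *ctx, int32_t sel) {
    return sel >= 0 ? ctx_read(ctx, (uint32_t)sel) : 0;
}

/*
 * g([esp],[esp+4],[esp+8],[esp+C]) with
 *   [esp]   = base selector   (pushed last)
 *   [esp+4] = index selector
 *   [esp+8] = scale
 *   [esp+C] = displacement    (pushed first)
 * Arithmetic wraps modulo 2^32, as it does in the handler.
 */
static uint32_t vm_lea(const uint8_t *ctx, int32_t base, int32_t index,
                       uint32_t scale, uint32_t disp) {
    return f(ctx, base) + f(ctx, index) * scale + disp;
}
```

For the sequence `CA 0040D008, CA 00000001, CA FFFFFFFF, CA FFFFFFFF, 83` both selectors are negative, so the result is simply 0040D008.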
5. AC 411488
00411488 8B06 mov eax, dword ptr [esi]
0041148A 83C6 04 add esi, 4
0041148D 8F0407 pop dword ptr [edi+eax]
00411490 ^ EB 92 jmp short 00411424
Pop VM_reg
6. C1 411492
00411492 83C4 04 add esp, 4
00411495 ^ EB 8D jmp short 00411424
add esp, 4
7. E3 411637
00411637 89AF 2C000000 mov dword ptr [edi+2C], ebp
0041163D ^ E9 E2FDFFFF jmp 00411424
Mov VM_ESP,V_ESP (V_ESP here is ebp)
Ebp and [edi+2c] both point to the top of the VM's stack; each saves and updates the other. 82 is the reverse of E3.
……..
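The rest are omitted for lack of space.
In short: the current stack is used as a scratch area for passing data, and C1 rebalances it. The current stack only plays the role of a go-between and can be eliminated in the end. (A few ops such as 2E::VM_PUSH leave the current stack unbalanced; I don't know whether that is deliberate.)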
41840f: CA 0040D008
418414: CA 00000001
418419: CA FFFFFFFF
41841e: CA FFFFFFFF
418423: 83
Note that op 83 is special: it always appears in the form above, and it is the only op that does, so first fold it into:
41840f: 83 0040D008 00000001 FFFFFFFF FFFFFFFF
Push g(arg1,arg2,arg3,arg4)
Here that is push 0040D008
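For the rewrite we want the operand as text rather than as a value. Here is a sketch of that folding step, reading the four arguments in push order as disp, scale, index, base. `fold_83` and `reg_name` are hypothetical helpers, and the output style imitates the listing that follows in the post ("0" when there is no base register, the index term omitted when there is no index).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Register selector (context offset) to name, per handler 6b; NULL if unknown. */
static const char *reg_name(int32_t sel) {
    switch (sel) {
    case 0x08: return "vm_esi";
    case 0x0C: return "vm_edx";
    case 0x10: return "vm_efl";
    case 0x14: return "vm_eax";
    case 0x1C: return "vm_ebx";
    case 0x28: return "vm_edi";
    case 0x2C: return "vm_esp";
    case 0x30: return "vm_ecx";
    case 0x34: return "vm_ebp";
    default:   return NULL;
    }
}

/* Render base+index*scale+disp into buf; returns 0 on success, -1 on error. */
static int fold_83(char *buf, size_t cap, uint32_t disp, uint32_t scale,
                   int32_t index, int32_t base) {
    const char *b = base >= 0 ? reg_name(base) : "0";
    const char *x = index >= 0 ? reg_name(index) : NULL;
    int n;
    assert(buf != NULL && cap > 0);
    if (b == NULL || (index >= 0 && x == NULL))
        return -1;   /* selector does not name a known register */
    if (x != NULL)
        n = snprintf(buf, cap, "%s+%s*%u+%08X", b, x, (unsigned)scale, (unsigned)disp);
    else
        n = snprintf(buf, cap, "%s+%08X", b, (unsigned)disp);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

With the two argument groups seen so far, this yields `0+0040D008` and `vm_esp+000000D0`.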
That gives the following result:
4183fc: 6B
4183fd: 000000D4
418402: vm_esp
418407: 07
418408: vm_esp
41840d: C1
41840e: 82
41840f: 0+0040D008
418424: vm_eax
418429: 96
41842a: vm_eax
41842f: C1
418430: E3
418431: vm_esp
418436: vm_eax
41843b: 76
41843c: vm_eax
418441: C1
418442: E3
418443: vm_eax
418448: vm_esp+000000D0
41845d: AD
41845e: C1
41845f: C1
418460: E3
………………
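Now both vm_op and VM_arg are exposed; all that remains is to substitute the meaning of each vm_op and eliminate esp, reducing things further to a simple form.
Looking closely, the code sandwiched between E3|82 markers forms one VM assembly instruction:
The VM_args sit above the VM_op; classify each VM_op by whether it takes 2, 1 or 0 VM_arg operands.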
418442: E3
418443: vm_eax //VM_arg
418448: vm_esp+000000D0 //VM_arg
41845d: AD //VM_op
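41845e: C1 //rebalance the current stack
41845f: C1 //rebalance the current stack
418460: E3 //update VM_ESP
which is V_mov [VM_esp+000000D0],VM_eax
……not listing them all
E3|82 form the boundaries, which are also the points where the current stack is balanced: every push is matched by a pop or an add esp,4.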
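Once a group between two boundaries has been isolated, lifting it is a matter of pattern matching on the op and its operands. The sketch below handles only op AD, the one case worked through above. The names `lift_group` and `strip_vm_prefix`, their signatures, and the idea of passing operands as already-folded strings are assumptions made for illustration, not the code from the attachment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Lift one group of the form  arg1, arg2, op, C1.., E3  into assembly text.
 * arg1 is the operand pushed first, arg2 the one pushed second.
 * Returns 0 on success, -1 if the op is not handled or buf is too small.
 */
static int lift_group(char *buf, size_t cap, uint8_t op,
                      const char *arg1, const char *arg2) {
    int n;
    assert(buf != NULL && arg1 != NULL && arg2 != NULL);
    switch (op) {
    case 0xAD:   /* store: value pushed first, address expression second */
        n = snprintf(buf, cap, "mov [%s],%s", arg2, arg1);
        break;
    default:     /* other ops would need their own handler analysis */
        return -1;
    }
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Final mapping step: delete every "vm_" so VM regs read as x86 regs. */
static void strip_vm_prefix(char *s) {
    char *p;
    assert(s != NULL);
    while ((p = strstr(s, "vm_")) != NULL)
        memmove(p, p + 3, strlen(p + 3) + 1);   /* shift the tail left, with NUL */
}
```

For the group at 418443 this gives `mov [vm_esp+000000D0],vm_eax`, and after stripping the prefix, `mov [esp+000000D0],eax`.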
Finally, map everything back and drop the VM_ prefix:
4183fc: enter
4183fd: sub esp,000000D4
41840f: mov eax,[0+0040D008]
418431: xor eax,esp
418443: mov [esp+000000D0],eax
418461: xor eax,eax
418473: push esi
418480: mov esi,[esp+000000DC]
4184a2: push 00000027
4184ab: push eax
4184b8: mov [esp+00000045],eax
4184d6: mov [esp+00000049],eax
4184f4: mov [esp+0000004D],eax
418512: mov [esp+00000051],eax
418530: mov [esp+00000055],eax
41854e: mov [esp+00000059],eax
41856c: mov [esp+0000005D],eax
41858a: mov [esp+00000061],ax
4185a8: mov [esp+00000063],al
4185c6: mov [esp+00000064],al
4185e4: lea eax,[esp+00000065]
418606: push eax
418613: mov [esp+00000048],00000000
418631: push 004153FC exit_to 00401C50
…………………
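The x86 regs are mapped onto the VM's regs, the VM_regs are manipulated by way of the current x86 stack, and finally the VM's regs are mapped back to x86 regs, so this VM's machine code can be restored completely to the corresponding x86 machine code.
Summary: the whole process is really VM machine code -> mixed VM/x86 assembly -> VM assembly -> mapped back to x86
The complete recovery code is in the attachment: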