Analysis of Tencent 2010, Stage 2, Problem 1
Posted: 2010-10-31 12:25
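Here is a brief walkthrough of how I analyzed the virtual machine. I could not determine which registers the context slots starting at offset 0x40 correspond to, so for now I refer to them as Rx.
They need names because they are accessed very frequently later on.
Analysis shows that these are the only opcode types, i.e. the handlers: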
const char * vm_handler[] =
{
"enter/leave",//0
"pushfd",//1
"popfd",//2
"mov",//3
"mov",//4
"mov",//5
"mov",//6
"mov",//7
"mov",//8
"add",//9
"sub",//a
"mul",//b
"mod",//c
"test eflags",//d
"jcc",//e
"add",//f
"sub",//10
"mul",//11
"div",//12
"test",//13
"and",//14
"xor",//15
"or",//16
"not",//17
"shr",//18
"sar",//19
"shl",//1a
"shl",//1b
"nop",//1c
"nop"//1d
};
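When dumping the bytecode it helps to guard the table lookup, since an opcode outside 0..0x1d would index past the end of the array. The following is only a minimal sketch: the name `vm_handler_name` and the "unknown" fallback string are my own, and the table is repeated under a different name only so the snippet compiles on its own.

```c
#include <stddef.h>

/* Same entries as vm_handler above, in opcode order 0x00..0x1d. */
static const char *const vm_handler_names[] = {
    "enter/leave", "pushfd", "popfd",           /* 0x00-0x02 */
    "mov", "mov", "mov", "mov", "mov", "mov",   /* 0x03-0x08 */
    "add", "sub", "mul", "mod",                 /* 0x09-0x0c */
    "test eflags", "jcc",                       /* 0x0d-0x0e */
    "add", "sub", "mul", "div",                 /* 0x0f-0x12 */
    "test", "and", "xor", "or", "not",          /* 0x13-0x17 */
    "shr", "sar", "shl", "shl",                 /* 0x18-0x1b */
    "nop", "nop"                                /* 0x1c-0x1d */
};

/* Hypothetical helper: returns the handler name, or "unknown" when the
   opcode is outside the table. */
const char *vm_handler_name(unsigned opcode)
{
    size_t count = sizeof vm_handler_names / sizeof vm_handler_names[0];
    return opcode < count ? vm_handler_names[opcode] : "unknown";
}
```

Several opcodes share a mnemonic (six "mov" variants, two "add", and so on), so the name alone does not identify the handler; printing the raw opcode next to it keeps the dump unambiguous.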
Each instruction is 16 bytes long, with the following structure:
typedef struct _vm_instruction
{
unsigned short opcode;
union
{
char v1;
unsigned short type;
}type;
int dest_reg;
unsigned int src_reg;
unsigned int unknow;
}vm_instruction,*pvm_instruction;
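As a sanity check on the 16-byte claim, the layout can be mirrored with fixed-width types and decoded field by field. This is a sketch under assumptions: the field offsets (0, 2, 4, 8, 12) are what the struct above implies under natural alignment, the byte order is little-endian because the target is x86, and the names `vm_insn_raw` and `vm_decode` are mine, not taken from the challenge binary.

```c
#include <stdint.h>

/* Fixed-width mirror of vm_instruction; the union is flattened to its
   16-bit member. Field names follow the original struct. */
typedef struct vm_insn_raw {
    uint16_t opcode;
    uint16_t type;
    int32_t  dest_reg;
    uint32_t src_reg;
    uint32_t unknow;
} vm_insn_raw;

_Static_assert(sizeof(vm_insn_raw) == 16, "VM instruction must be 16 bytes");

/* Little-endian readers, independent of host byte order. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 16-byte VM instruction from a raw buffer. */
vm_insn_raw vm_decode(const uint8_t bytes[16])
{
    vm_insn_raw in;
    in.opcode   = rd16(bytes + 0);
    in.type     = rd16(bytes + 2);
    in.dest_reg = (int32_t)rd32(bytes + 4);
    in.src_reg  = rd32(bytes + 8);
    in.unknow   = rd32(bytes + 12);
    return in;
}
```

Reading the fields byte by byte rather than casting the buffer to the struct avoids alignment and aliasing problems when the bytecode is loaded from a dump.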
The register definitions follow:
const char * reg32[] =
{
"eflags",
"edi",
"esi",
"ebp",
"esp",
"ebx",
"edx",
"ecx",
"eax",
};
const char * op_type[] =
{
"dword ptr",
"word ptr",
"byte ptr"
};
const char * vm_reg32[] =
{
"R0",
"R1",
"R2",
"R3",
"R4",
"R5",
"R6",
"R7",
"vm_eflags"
};
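In the end you will find that R0 is actually EDI and R1 is ESI, but both are used only as temporaries; translating them directly to those names would confuse them with the real registers.
Keeping the Rx names separate lets the code be decoded correctly.
For example: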
40cde0: mov R0,esp
40cdf0: mov R1,fffffffch
40ce00: add R0,R1
40ce10: mov esp,R0
40ce20: mov R1,esi
40ce30: mov dword ptr [R0],R1
40ce40: mov R0,edx
40ce50: mov R1,edx
40ce60: xor R0,R1
40ce70: popfd
The lines above are the decoded VM instructions; they can be restored by hand to:
add esp,-4
mov dword ptr [esp],esi
xor edx,edx
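The popfd here is actually there to preserve the flags register.
Each x86 instruction is split into several VM instructions. The rest follows the same pattern and is omitted here; see the attachment for details.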
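The manual restoration shown for the first four lines of the listing (mov R0,esp / mov R1,fffffffch / add R0,R1 / mov esp,R0 becoming add esp,-4) can be sketched as a small pattern matcher. This is only an illustration and not the tool from the attachment: the `lifted` struct and `fold_binop` are hypothetical names, and only the load, load, op, write-back shape is handled. The xor case in the listing has no write-back line and the store goes through [R0], so those would need rules of their own.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical textual form of one decoded VM instruction. */
typedef struct lifted {
    const char *mnem; /* e.g. "mov", "add" */
    const char *dst;  /* destination operand, NULL if none */
    const char *src;  /* source operand, NULL if none */
} lifted;

/* NULL-safe string equality. */
static int same(const char *a, const char *b)
{
    return a && b && strcmp(a, b) == 0;
}

/*
 * Try to fold in[0..3] of the form
 *     mov R0,X / mov R1,Y / op R0,R1 / mov X,R0
 * into the single line "op X,Y". Returns 1 and fills out on a match,
 * 0 otherwise.
 */
int fold_binop(const lifted *in, size_t n, char *out, size_t outsz)
{
    if (n < 4)
        return 0;
    if (!same(in[0].mnem, "mov") || !same(in[0].dst, "R0") || !in[0].src)
        return 0;
    if (!same(in[1].mnem, "mov") || !same(in[1].dst, "R1") ||
        !in[1].src || same(in[1].src, "R0"))
        return 0;
    if (!in[2].mnem || same(in[2].mnem, "mov") ||
        !same(in[2].dst, "R0") || !same(in[2].src, "R1"))
        return 0;
    if (!same(in[3].mnem, "mov") || !same(in[3].dst, in[0].src) ||
        !same(in[3].src, "R0"))
        return 0;
    snprintf(out, outsz, "%s %s,%s", in[2].mnem, in[0].src, in[1].src);
    return 1;
}
```

Fed the first four lines of the listing, this yields the string add esp,fffffffch, which is add esp,-4 when the immediate is read as a signed 32-bit value.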