|
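Are there any good tools for debugging P-CODE?
Is there a general-purpose converter for VB P-code that can turn ordinary VB source code directly into a P-code program? |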
|
Are there any good tools for debugging P-CODE?
With a user base of around five million, Visual Basic (VB) is probably the most widely used programming language in the world. Though not very suitable for 'hard-core' professionals, VB makes programming very simple, which is why it is so popular with beginners. Not surprisingly, we see a lot of worms, backdoors and other kinds of malware written in Visual Basic.

VB versions 5.0 and 6.0 can create two types of executable: p-code and native code (both require the VB runtime DLL to execute). The majority of projects written in VB (including malware) are compiled to native code simply because it is the default option. However, it is very simple to switch to p-code, so there is good reason to study the internals of such executables. Understanding p-code will also help us gain a better understanding of native code.

An article by Andy Nikishin and Mike Pavlyushchik in the January 2002 issue of Virus Bulletin (see VB, January 2002, p.6) presents a good introduction to the internal structures of VB executables, especially those compiled to native code. In this article, executables compiled to p-code are examined more closely.

P-code, or pseudo code, is a set of stack-oriented, CPU-independent instructions, an intermediate step between the high-level statements of a Basic program and the low-level native code instructions executed by a computer's processor. It can either be interpreted by the VB virtual machine (if the project is compiled to p-code) or converted to native code and optimized (if the project is compiled to native code). It has instructions for loading, storing, initializing and object method calling, as well as many instructions for arithmetic and logical operations, control flow and so on.

To avoid any misunderstanding in terminology, it should be noted that the p-code generated by Visual Basic for Applications (VBA) differs from that generated by VB.
While the p-code of VBA is basically pre-processed source code stored in a special ready-to-interpret form, the p-code of VB is the result of true compilation. It is, therefore, comparable to MSIL (Microsoft Intermediate Language) used by Microsoft's .NET Framework.

The internal structures of VB executables are not documented. Fortunately, however, debug information for the VB runtime DLL is available. From it we can learn the names of the p-code instructions, which generally makes the analysis easier.

To see how things really work, we will first examine a simple 'Hello World!' project with one module and the following source code:

    Sub Main()
        MsgBox "Hello World!"
    End Sub

We set 'Compile to P-Code' in the project properties and compile it. After the PE executable is built we find a well-known stub at the entry point:

    00401038 push offset EXEPROJECTINFO
    0040103D call MSVBVM60:ThunRTMain

The ThunRTMain function exported by the VB runtime DLL reads the EXEPROJECTINFO structure and starts the VB project initialization. The structure contains a lot of important information, such as the address of the Main() function. The VB runtime will eventually call this address (if it is defined). Examining the code there we see:

    004011CC mov edx,00040175C
    004011D1 mov ecx,000401032
    004011D6 jmp ecx
    ...
    00401032 jmp MSVBVM60:ProcCallEngine

The code loads the address of a structure into edx and jumps to the p-code interpretation engine (ProcCallEngine) in the VB runtime library. The structure in edx (let us call it ProcDescription) contains all the important information about the Main() function, including the address of the corresponding module description structure, the size of the local variables and the size of all p-code instructions for this function. ProcCallEngine loads various internal data, sets the error handler, allocates sufficient space on the stack and then starts processing the p-code instructions:

    ProcCallEngine:
    ...
    66104abe mov ebx,edx
    ...
    66104b7f movzx esi,word ptr [ebx+08] ; p-code size
    66104b83 neg esi
    66104b85 add esi,ebx
    ...
    66104b8d xor eax,eax
    66104b8f mov al,[esi]
    66104b91 inc esi
    66104b92 jmp dword ptr [tblDispatch+eax*4]

From the code above we see that the p-code instructions directly precede the ProcDescription structure. The first byte of every instruction encodes the instruction type. This allows 256 different encodings, though not all of them are used. Furthermore, opcodes 0xfb - 0xff, named Lead0 - Lead4, have a second opcode byte and thus each constitute a new set of 256 encodings. Consequently, there are hundreds of different p-code instructions. The opcode bytes may be followed by one or more arguments. The majority of instructions have a fixed number of arguments, but there are a few exceptions.

Knowing all that, we can inspect the p-code for our example Main() function itself:

    0000 27FCFE           LitVar_Missing loc_0104 ; Context
    0003 271CFF           LitVar_Missing loc_00E4 ; HelpFile
    0006 273CFF           LitVar_Missing loc_00C4 ; Title
    0009 F500000000       LitI4 00000000 ; Buttons
    000E 3A6CFF0000       LitVarStr loc_0094, "Hello World!"
    0013 4E5CFF           FStVarCopyObj loc_00A4
    0016 045CFF           FLdRfVar loc_00A4 ; Prompt
    0019 0A01001400       ImpAdCallFPR4 rtcMsgBox,14
    001E 3608005CFF3CFF1C FFreeVar loc_00A4,loc_00C4,loc_00E4,loc_0104
    0029 14               ExitProc

Let us explain this code briefly. We have already mentioned that VB p-code instructions are stack-oriented. The operation of an instruction can often be derived from its name, but it is frequently necessary to inspect the actual code in the runtime DLL that interprets it.

The name of the first instruction, LitVar_Missing, can be divided into three parts. 'Lit' states that a value will be pushed onto the top of the stack. 'Var', which stands for 'Variant', not 'variable', tells us that the instruction works with a variable of the 'Variant' type (Variant is a special data type that can contain any standard kind of data).
Finally, 'Missing' specifies that the variable will be loaded with the special DISP_E_PARAMNOTFOUND value. The instruction takes one argument, which is the offset of a local variable from the top of the stack. It fills the variable with the value and pushes its address onto the stack (the Variant type occupies more than four bytes in memory and therefore is not pushed directly onto the stack; instead a reference to it is pushed).

LitI4 takes an I4 (Long) argument and, as expected, pushes it onto the stack as well. The second argument of the LitVarStr instruction is an index into a special table (ReferenceTable) which contains the addresses of various data such as strings, imported functions, COM objects and so on. Every module in a VB project has its own ReferenceTable. In the case of LitVarStr the corresponding address in this table points to a string in BSTR format (a zero-terminated Unicode string preceded by its length). The string is contained in the read-only '.text' section of the PE executable, so a copy of it is created first by FStVarCopyObj. A reference to the copy is then pushed onto the stack by the FLdRfVar instruction as the first (Prompt) parameter of the MsgBox() function.

We did not specify the remaining four parameters (Buttons, Title, HelpFile and Context), so they were generated automatically. The last three are loaded by LitVar_Missing, and the Buttons parameter is vbOKOnly (defined as 0) by default.

The ImpAdCallFPR4 instruction takes two arguments. The first is an index into ReferenceTable, the second is the size of the parameters on the stack. The instruction takes an address from ReferenceTable and calls it. The code at this address follows:

    00401020 jmp MSVBVM60:rtcMsgBox

As can be seen, the code simply jumps to the routine statically imported from the VB virtual machine DLL. After rtcMsgBox returns, the FFreeVar instruction checks all the Variant variables used and frees any allocated resources (in our example the copy of the 'Hello World!' string).
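The operand bytes in the Main() listing can be checked by hand. The Python sketch below is not part of the original analysis. It assumes, based only on the byte/mnemonic pairs shown in the listing, that a local-variable argument is a little-endian signed 16-bit offset and that the `loc_XXXX` label shows its magnitude. The helper names `read_opcode` and `local_name` are hypothetical.

```python
import struct

# Per the article, opcodes 0xFB-0xFF (Lead0-Lead4) are followed by a second opcode byte.
LEAD_FIRST, LEAD_LAST = 0xFB, 0xFF


def read_opcode(code: bytes, pos: int = 0):
    """Return (opcode bytes as a tuple, position of the first argument byte)."""
    first = code[pos]
    if LEAD_FIRST <= first <= LEAD_LAST:
        return (first, code[pos + 1]), pos + 2
    return (first,), pos + 1


def local_name(arg: bytes) -> str:
    """Format a 2-byte local-variable argument the way the listing does.

    Assumption inferred from the listing: the argument is a little-endian
    signed 16-bit offset, negative for locals, displayed as its magnitude.
    """
    (offset,) = struct.unpack("<h", arg)
    return f"loc_{-offset:04X}"


# First instruction of Main(): bytes 27 FC FE, listed as LitVar_Missing loc_0104
code = bytes.fromhex("27FCFE")
opcode, arg_pos = read_opcode(code)
print(opcode, local_name(code[arg_pos:arg_pos + 2]))
```

Under these assumptions the bytes FC FE decode to the offset -0x104, which matches the `loc_0104` label in the listing; the other local arguments in the listing (1C FF, 3C FF, 5C FF, 6C FF) can be checked the same way.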
Finally, ExitProc does all the necessary de-initialization, removes the error handler, restores the stack (removing any arguments) and returns to the caller of ProcCallEngine.

The example above has shown that, from the VB runtime point of view, there is not a big difference between a project compiled to native code and one compiled to p-code. The runtime can call p-code functions just like native code, and the small chunks of native code shown above do all the necessary work to ensure that the p-code is interpreted correctly.

This example was very simple. Most VB projects have no Main() function; instead they have a starting form. Finding the entry points of the starting form's events (for example 'Load()') is more complicated and requires knowledge of several data structures which can be traced from the EXEPROJECTINFO structure.

Applications written in VB are able to call external procedures in DLLs. These can be imported either statically (at load time via the PE import section) or dynamically (at run time). Usually the procedures of the VB runtime library are imported statically. This includes procedures like ThunRTMain or ProcCallEngine, which are referenced automatically by the VB compiler, as well as a special set of functions available to the VB programmer. They can be used in code without explicit declaration. The names of these functions have the 'rtc' prefix (the prefix is added by the compiler, so it does not appear in the source code). Some of them are interesting in terms of generic malware detection: rtcCreateObject, rtcFileCopy and so on. The way these functions are called was illustrated in the previous example.

Of more interest are calls to external API functions declared using the 'Declare' statement. The information about such functions is stored in a special table (ImportTable). Each entry contains the name of the function and the DLL, together with a short piece of native code that invokes the function. The following example illustrates the use of external functions.
Similar code can be found in many backdoors:

    Public Declare Function GetCurrentProcessId _
        Lib "kernel32" () As Long
    Public Declare Function RegisterServiceProcess _
        Lib "kernel32" (ByVal dwProcessID As Long, _
        ByVal dwType As Long) As Long

    Public Sub MakeMeService()
        Dim pid As Long
        Dim regserv As Long
        On Error Resume Next
        pid = GetCurrentProcessId()
        regserv = RegisterServiceProcess(pid, 1)
    End Sub

This source code compiles to:

    0000 0002       LargeBos 0002
    0002 0005       LargeBos 0007
    0004 4BFFFF     OnErrorGoto Resume Next
    0007 0011       LargeBos 0018
    0009 5E00000000 ImpAdCallI4 kernel32:GetCurrentProcessId,00
    000E 7170FF     FStI4 loc_0090
    0011 3C         SetLastSystemError
    0012 6C70FF     ILdRf loc_0090
    0015 7178FF     FStI4 loc_0088
    0018 0019       LargeBos 0031
    001A F501000000 LitI4 00000001
    001F 6C78FF     ILdRf loc_0088
    0022 5E01000800 ImpAdCallI4 kernel32:RegisterServiceProcess,08
    0027 7170FF     FStI4 loc_0090
    002A 3C         SetLastSystemError
    002B 6C70FF     ILdRf loc_0090
    002E 7174FF     FStI4 loc_008C
    0031 0000       LargeBos 0031
    0033 14         ExitProc

Using the 'On Error Resume Next' statement is a little trick. It forces the VB compiler to put a LargeBos instruction in front of the compiled code of every statement, so we can easily see how a particular statement is compiled. Here loc_0088 is the pid variable and loc_008C is regserv.

The way both API functions are called is very similar to the way the rtcMsgBox function was called in the previous example (although both functions now return a Long value, which is reflected in the mnemonic of the ImpAdCallI4 instruction). However, the associated code (remember that its address is in ReferenceTable) is different:

    0040134C mov eax,[004022E4]
    00401351 or eax,eax
    00401353 je 00401357
    00401355 jmp eax
    00401357 push 00401334
    0040135C mov eax,00401020
    00401361 call eax
    00401363 jmp eax
    ...
    00401020 jmp MSVBVM60:DllFunctionCall

First the code checks whether the address of the imported function has been located already.
If it has, the code simply jumps there; otherwise the DllFunctionCall function is invoked with a pointer to the ImportTable entry as an argument. DllFunctionCall reads the names of the imported function and the DLL from there and tries to get the address using the LoadLibraryA and GetProcAddress APIs. If successful, it saves this address (to the variable at 4022E4 in this case) and returns it. If not, it invokes the error handler internally.

One of the reasons why VB is so popular must be the fact that it makes programming with COM objects so simple. Often a program that would take a lot of coding (and perhaps cause a lot of headaches) in languages like C++ can be written in a few lines of VB. Of course we pay a price for the comfort, but sometimes it is very elegant.

In contrast to the VBA and VBScript members of the Visual Basic family, VB supports late binding as well as both types of early binding: vtable and DispID binding. While there is usually no need to use costly late binding in VB, the source code of the infamous Melissa (VBA) and LoveLetter (VBScript) worms is so popular that the majority of VB email worms we see use this kind of binding. Late binding means that no type library information about the COM object used is available during compilation, so it must be collected at run time.

The following example shows how a few lines of VB code are compiled to p-code. They use the MS Outlook COM object to create an email. As we have selected no additional references in the VB project settings, nor have we explicitly defined the type of the object variables, the compiler has no option other than to use late binding. The source code:

    ...
    Set out = CreateObject("Outlook.Application")
    Set mail = out.CreateItem(0)
    mail.Recipients.Add "marko@eset.sk"
    ...

The corresponding p-code:

    ...
    0007 001A             LargeBos 0021
    0009 F500000000       LitI4 00000000
    000E 1B0000           LitStr "Outlook.Application"
    0011 046CFF           FLdRfVar loc_0094
    0014 0A01000C00       ImpAdCallFPR4 rtcCreateObject2,0C
    0019 046CFF           FLdRfVar loc_0094
    001C 045CFF           FLdRfVar loc_00A4
    001F FE4E             SetVarVarFunc
    0021 0018             LargeBos 0039
    0023 284CFF0000       LitVarI2 loc_00B4,0000
    0028 25               PopAdLdVar
    0029 045CFF           FLdRfVar loc_00A4
    002C FF426CFF02000100 VarLateMemCallLdVar loc_0094,CreateItem,1
    0034 042CFF           FLdRfVar loc_00D4
    0037 FE4E             SetVarVarFunc
    0039 001C             LargeBos 0055
    003B 3A4CFF0300       LitVarStr loc_00B4,"marko@eset.sk"
    0040 25               PopAdLdVar
    0041 042CFF           FLdRfVar loc_00D4
    0044 FF3D1CFF0400     VarLateMemLdRfVar loc_00E4,Recipients
    004A FD9F             LdPrVar
    004C FE9805000100     LateMemCall Add,1
    0052 351CFF           FFree1Var loc_00E4
    ...

All communication with the COM object takes place via the IDispatch interface. The rtcCreateObject2 function converts the input ProgID to a CLSID, creates an instance of the specified class and requests a pointer to its IDispatch interface. The methods and properties of this interface are then invoked by 'LateMem' instructions (i.e. instructions whose names contain the 'LateMem' string). All of them take a method/property name as an argument, convert it internally to a DispID using IDispatch::GetIDsOfNames and then invoke the method/property using IDispatch::Invoke.

There are several 'LateMem' instructions. Methods are invoked by the 'LateMemCall' instructions, 'LateMemLd' instructions read properties and 'LateMemSt' instructions write properties. If the instruction name is preceded by 'Var', the instruction takes the pointer to the class instance from the reference at the top of the stack. Otherwise, the pointer is read from a special internal variable set by the LdPrVar instruction (there are actually more 'Pr' instructions). On the other hand, if 'Var' is appended to the name, the instruction stores a result value in the specified variable.
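The naming rules just described can be turned into a small classifier. The following Python sketch is only an illustration of those rules as stated in the text: the function name `describe_latemem` and the wording of the returned fields are hypothetical, and the pattern has only been checked against the three 'LateMem' mnemonics that appear in the listing.

```python
import re

# (Var)? prefix, the LateMem core, the operation kind, and an optional suffix.
LATEMEM = re.compile(r"(Var)?LateMem(Call|Ld|St)(\w*)")
KIND = {"Call": "invoke method", "Ld": "read property", "St": "write property"}


def describe_latemem(mnemonic: str) -> dict:
    """Classify a 'LateMem' mnemonic according to the rules in the text."""
    m = LATEMEM.fullmatch(mnemonic)
    if m is None:
        raise ValueError(f"not a LateMem instruction: {mnemonic}")
    prefix, kind, suffix = m.groups()
    return {
        # A leading 'Var' means the object pointer comes from the stack reference.
        "object_from": "reference on top of stack" if prefix
        else "internal variable (LdPrVar)",
        "operation": KIND[kind],
        # A trailing 'Var' means the result is stored in the specified variable.
        "stores_result": suffix.endswith("Var"),
    }


for name in ("VarLateMemCallLdVar", "VarLateMemLdRfVar", "LateMemCall"):
    print(name, describe_latemem(name))
```

Applied to the listing, this reading agrees with the operands shown: the two instructions ending in 'Var' carry a `loc_` argument for the result, while `LateMemCall Add,1` follows `LdPrVar` and carries none.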
The 'LateMemCall' instructions also receive the number of arguments for the invoked method, while the arguments themselves are prepared on the stack (PopAdLdVar).

In the fastest form of early binding, vtable binding, Visual Basic uses an offset into a virtual function table (vtable). It needs to know the layout of the vtable, the IIDs of the interfaces used and so on, so appropriate references need to be selected in the project properties. The following code can be found in many worms, backdoors and other malware:

    PathName = App.Path & "\" & App.EXEName & ".exe"

While the source code is pretty simple, the compiled code is a little more complicated and lengthy:

    0000 0474FF     FLdRfVar loc_008C
    0003 0478FF     FLdRfVar loc_0088
    0006 050000     ImpAdLdRf Global:VBGlobal
    0009 240100     NewIfNullPr Global:VBGlobal
    000C 0D14000200 VCallHresult VBGlobal:App
    0011 0878FF     FLdPr loc_0088
    0014 0D50000300 VCallHresult _App:Path
    0019 6C74FF     ILdRf loc_008C
    001C 1B0400     LitStr "\"
    001F 2A         ConcatStr
    0020 2368FF     FStStrNoPop loc_0098
    0023 046CFF     FLdRfVar loc_0094
    0026 0470FF     FLdRfVar loc_0090
    0029 050000     ImpAdLdRf Global:VBGlobal
    002C 240100     NewIfNullPr Global:VBGlobal
    002F 0D14000200 VCallHresult VBGlobal:App
    0034 0870FF     FLdPr loc_0090
    0037 0D58000300 VCallHresult _App:EXEName
    003C 6C6CFF     ILdRf loc_0094
    003F 2A         ConcatStr
    0040 2364FF     FStStrNoPop loc_009C
    0043 1B0500     LitStr ".exe"
    0046 2A         ConcatStr
    0047 4644FF     CVarStr
    004A FCF654FF   FStVar
    ...

In vtable binding, methods and properties are invoked by the 'VCall' instructions. All these instructions take an offset into the vtable as the first argument. They take the pointer to the class instance from the internal variable (see the previous section) and invoke the method/property at the specified offset in the vtable. Since the return value is usually an HRESULT, the VCallHresult instruction is used most often. This is lucky indeed, because this particular instruction takes a second argument. It is an index into ReferenceTable and points to the corresponding interface IID.
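The two VCallHresult arguments can be read straight from the instruction bytes in the listing. The Python sketch below assumes, based only on those bytes, that the opcode 0x0D is followed by two little-endian 16-bit values (vtable offset, then ReferenceTable index). Converting the offset to a slot number additionally assumes 4-byte vtable entries, as on 32-bit x86. The function name `decode_vcallhresult` is hypothetical.

```python
import struct

VCALLHRESULT = 0x0D  # opcode byte shown for VCallHresult in the listing


def decode_vcallhresult(raw: bytes):
    """Return (vtable offset, vtable slot, ReferenceTable index).

    Assumed layout, inferred from the listing: 1 opcode byte followed by
    two little-endian unsigned 16-bit arguments (5 bytes in total).
    """
    opcode, vtable_offset, ref_index = struct.unpack("<BHH", raw)
    if opcode != VCALLHRESULT:
        raise ValueError("not a VCallHresult instruction")
    slot = vtable_offset // 4  # assumption: 4-byte pointers (32-bit x86)
    return vtable_offset, slot, ref_index


# VBGlobal:App, _App:Path and _App:EXEName from the listing
for hexbytes in ("0D14000200", "0D50000300", "0D58000300"):
    print(hexbytes, decode_vcallhresult(bytes.fromhex(hexbytes)))
```

With this reading, both `_App` calls share ReferenceTable index 3 while the `VBGlobal` call uses index 2, which is consistent with the index identifying the interface rather than the individual method.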
The instruction verifies the returned HRESULT status internally and, if the call failed, invokes an error handler. The IID is supplied as an argument to the error handler. With both the vtable offset and the interface IID, it is relatively easy to look up a method/property name.

The slower form of early binding, DispID binding, is more similar to late binding and is seldom used in VB applications. Automation COM components usually support dual interfaces (i.e. they support both vtable and DispID binding), and the VB compiler uses vtable binding whenever possible. There is a set of 'LateId' instructions which are analogous to the 'LateMem' instructions, but they take a DispID argument instead of a method/property name. There is probably no easy way to force VB to use DispID binding, but it is possible. For example, we modified the MS Outlook type library, changing some of the dual interfaces to pure dispinterfaces. However, since this is clearly a 'laboratory' result, we will not discuss it here.

The study of the internals of Visual Basic executables is a rather complex issue. It is also a time-consuming endeavour, since the analysis requires extensive work with debuggers, disassemblers, hex browsers and similar tools. Visual Basic, being one of the most popular programming languages in the world, may well justify all the effort. The outline of our approach indicates that time-consuming and detailed analysis may, in fact, provide important clues to a VB application's interaction with the outside environment and prove instrumental in the development of efficient heuristic rules to identify malicious software generically. The future will show how successful we will be. |
|
[Original] Detailed analysis of the **** management system algorithm
Originally posted by 无奈无赖: That's not just explosive, it's practically TNT. |
|
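[Help] Problem with the aspack2.12 packer that I just can't figure out; someone told me it's a self-check?
100% a self-check. The aspackdie unpacker is still pretty good. |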
|
Which packer is 连连看2005 V5.05 protected with?
Just use a LOADER; VB programs are actually quite easy to unpack too. |
|
Help: errors after unpacking
There is a lot of software like this; it's a self-check. |
|
|
|
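Question: ran into a problem unpacking Armadillo, just one step away!!!
There is one more pointer that needs to be fixed; just watch out for that. This program seems to be tied into the ARM registration mechanism, so once the packer is removed it should already be the registered version, with no cracking needed. |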
|
[Help] Has anyone here cracked 3DS MAX script files?
PHPX#263.net |
|
Repair problems after unpacking ASProtect 2.0x
I just used a LOADER; there's no need to unpack every packer. |
|
Super memory editing tool
A super memory editing tool is naturally not just a simple editor; otherwise why put 'super' in the name?
|
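[Download] After reinstalling my system I found SICE really hard to track down; moderators, please don't rush to delete this ~ tools for newcomers ~
Originally posted by 天意2001: Are you a debugger? |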
|
|
|
About unpacking 木马分析专家 (Trojan Analysis Expert) member edition V6.35
Just post it here and I'll take a look for you. |
|
About Armadillo4.0053
It's because the KEY has to be entered in reverse.
|
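fly, could you walk through the whole process of how you unpacked this packer?
If it's a dual-process packer, read these two articles together: 'A small LOADPE plugin for dumping Armadillo COPYMEMEII' and 'Revised unpacking write-up with video: Arm 3.x CopyMem-ll +Debug'. If you still can't follow them, there's nothing more I can do. |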
|
[Repost] A 460M unpacking video tutorial: squadra handprot
A 464M avi video is definitely not in MPEG4 format; otherwise how could the compression ratio be this high? |
|
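Can everyone take a look at what dongle this is?
SENTINEL.VXD. Emulation software exists for it; a cracked version is rather hard to find. |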
|
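[Original] A virtual desktop I wrote in assembly; does it look passable?
I have a program from 1998 that can hide anything you can see, including the task tray; it's pretty impressive.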