[Original] How much longer can cracking survive? Starting from code obfuscation.
Posted: 2005-4-3 09:27 13274
The Fatal Impact of Code Obfuscation on Cracking
Author: 冲出宇宙
Date: 2005.4.3
Location: China

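    This is a software debugging forum, but I still want to sound the alarm. Of course, I'll approach it from the technical side and talk about the fatal impact code obfuscation has on us crackers. Much of what follows comes from foreign papers, so I can't claim it is entirely my own original work.
    Code obfuscation (also called code mess-up, or obfuscation executive, OE) has no settled Chinese translation: some render it 代码混乱, others 代码混淆. Following a paper published in 《计算机研究与发展》, I use 代码迷惑. As the name suggests, code obfuscation exists specifically to work against reverse engineering. Broadly, it can be applied at three levels:
    1. Disassembly: turning machine code into assembly code. Obfuscation can be applied at the assembly-language level.
    2. Decompilation: turning assembly code into high-level language code. Obfuscation can equally be applied at the high-level language level.
    3. Design intent: obfuscation can even be applied while the software is being designed.
    I won't go into too much technique here; let's just look at what obfuscation at the assembly level does to us. First, the algorithms currently used to disassemble machine code: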
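    1. Linear Sweep:
    Find the program's entry point, then decode every byte between the entry point and the end of the code sequentially, one instruction after another. The problem is obvious: when the code contains junk instructions (花指令, i.e. junk code that disturbs the disassembly algorithm), linear sweep can no longer correctly decode the machine code that follows. It also copes poorly with data embedded in code. Popular tools such as OllyDbg, IDA and W32Dasm do not hold up well against junk-instruction protection.
    2. Recursive Traversal:
    Disassemble the code in the order in which it may execute, scanning every possible path. This avoids the disassembly problems caused by junk instructions and by data embedded in code. It has its own weakness, though: we cannot determine every possible execution path precisely once indirect jumps are involved, for example a jump to an address (such as one held in a variable) that is only known at run time.
    Let me stress again: in static analysis, IDA has no real answer to code obfuscation! I'll explain in detail below.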
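    First, obfuscation aimed at linear sweep. Only one technique is in common use here, and its most basic form is:
    1  Split the code into many small blocks, chosen so that two adjacent blocks cannot execute sequentially (control never falls through from one block into the next).
    2  Insert an obfuscating instruction in front of each block. An obfuscating instruction is only part of an assembly instruction, not a complete one.
    3  Fix up jump targets and so on to clean up afterwards.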
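The effect of a partial instruction on the two algorithms can be sketched on a toy machine. Everything below is an assumption made for illustration: the instruction set, its opcodes and the six-byte program are invented, and real x86 decoding is far more involved. The byte 0x10 at address 2 plays the role of the obfuscating instruction: it is the first byte of a three-byte OR and is never executed.

```python
# Hypothetical toy instruction set: opcode -> (mnemonic, total length in bytes).
ISA = {0x01: ("NOP", 1), 0x02: ("LOAD", 2), 0x03: ("JMP", 2),
       0x05: ("RET", 1), 0x10: ("OR", 3)}


def decode(code, pc):
    """Decode one instruction at pc; return (text, length, opcode)."""
    op = code[pc]
    name, length = ISA.get(op, ("DB", 1))
    operands = code[pc + 1:pc + length]
    text = name + (" " + ",".join(str(b) for b in operands) if operands else "")
    return text, length, op


def linear_sweep(code, entry=0):
    """Decode byte after byte from the entry point to the end of the code."""
    out, pc = [], entry
    while pc < len(code):
        text, length, _ = decode(code, pc)
        out.append((pc, text))
        pc += length
    return out


def recursive_traversal(code, entry=0):
    """Decode only addresses reachable by following control flow."""
    out, work = {}, [entry]
    while work:
        pc = work.pop()
        while pc < len(code) and pc not in out:
            text, length, op = decode(code, pc)
            out[pc] = text
            if op == 0x03:            # unconditional jump: follow the target only
                work.append(code[pc + 1])
                break
            if op == 0x05:            # return: this path ends here
                break
            pc += length
    return sorted(out.items())


# JMP 3 | junk byte 0x10 | LOAD 7 | RET
PROGRAM = bytes([0x03, 0x03, 0x10, 0x02, 0x07, 0x05])

print(linear_sweep(PROGRAM))         # the junk byte swallows LOAD 7
print(recursive_traversal(PROGRAM))  # LOAD 7 is recovered
```

Linear sweep decodes the junk byte as an OR and loses the real LOAD, while recursive traversal follows the jump and never looks at address 2. The toy jump uses an absolute, statically known target; that is exactly the assumption the techniques against recursive traversal attack.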
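    Next, obfuscation aimed at recursive traversal. Three techniques are in common use:
    1  Branch functions: the function does not return to the instruction following the call, but to somewhere else. For IDA this is fatal.
    2  Opaque predicates: this technique aims to increase the complexity of the analysis. For example, an unconditional jump is turned into a conditional jump whose condition is in fact always true.
    3  Jump table spoofing: an ordinary unconditional jump is disguised as a switch, tricking the disassembler into believing there are a great many branches.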
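As a small illustration of an opaque predicate, here is a sketch in Python rather than assembly. The predicate chosen (the product of two consecutive integers is always even) and the function names are my own example, not taken from the research discussed in this post.

```python
def opaque_true(x: int) -> bool:
    # x * (x + 1) is a product of two consecutive integers, so it is always even.
    # The result is therefore always True, but a tool has to prove that to know it.
    return (x * (x + 1)) % 2 == 0


def original(x: int) -> int:
    # The unobfuscated code: one straight path.
    return x + 1


def obfuscated(x: int) -> int:
    # The same computation behind a conditional branch that is always taken.
    if opaque_true(x):
        return x + 1
    # Dead path: never executed, but a static analyzer still has to consider it,
    # which is where junk bytes can be planted.
    return x * 31337
```

Both functions return the same value for every input; the only difference is the extra branch the analysis has to reason about.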
   
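   Having said all that, let's look at the experimental results from a research group:
   Obfuscation algorithm:
   1  Insert the first 3 bytes of an Or instruction in front of each block (obviously an incomplete Or instruction).
   2  Apply branch functions, opaque predicates and jump table spoofing, and insert the ordinary obfuscating instruction above on every path that can never be executed.
   Results:
   1  Linear sweep: about 70% of the code is disassembled incorrectly.
   2  Recursive traversal: about 40% of the code is disassembled incorrectly.
   3  IDA Pro: more than 90% of the code is disassembled incorrectly.
   Why does IDA do so badly? Because IDA only disassembles what it is sure is code. That has two consequences:
   1  Places that cannot be reached by a direct jump are treated as non-code and disassembled as data.
   2  It has no defense against branch-function obfuscation.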

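   This has gotten too long, so I should wrap up, even though I still have a great deal to say. Above I went through some obfuscation techniques that resist static analysis. You may still be able to use dynamic analysis to follow a program's flow. But after reading this, are you still that confident? Do you still think you're an expert? I hope the software authors haven't seen any of this. Heh.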

Latest replies (44)
2
Showing my support.
2005-4-3 09:35
3
These techniques... have been in use for a long time already, haven't they?
2005-4-3 10:00
4
Crackers aren't fools. What IDA can't see, can't a human see?
IDA lets you manually mark any range to be analyzed as code or as data, and that is enough.
2005-4-3 10:03
5
Learned a lot, thanks to the original poster!
2005-4-3 10:22
6
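More than a year of hanging around 看雪 (Kanxue) has taught me that anti-cracking versus reversing tools is the same game as viruses versus antivirus software. Every anti-cracking measure is bought at the cost of the program's runtime efficiency.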
2005-4-3 10:44
7
2005-4-3 11:01
8
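None of the techniques above are new; all of them are already in use.

Even if they were new, there would be nothing to fear. Remember:
vx=protect
av=crack

Just as there can be no virus that cannot be dealt with, there can be no protection that cannot be cracked.

btw, the protection discussed here is the protection of executable files; data encryption is another matter. Protecting an executable has a theoretical flaw: no matter how it is encrypted, in the end the data still has to be restored.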
2005-4-3 11:44
9
I think the research below, done by one single person, is better than what that research group of yours did.

Vol. 2, No. 1 (2005)
http://www.CodeBreakers-Journal.com
Copyright 2005 by the author and published by the CodeBreakers-Journal. Single print or electronic copies for personal use only are
permitted. Reproduction and distribution without permission is prohibited. This article can be found at
http://www.CodeBreakers-Journal.com.
Anti Reverse Engineering Uncovered
Nicolas Brulez*
E-Mail: 0x90@Rstack.org
* Corresponding Author
Received: 07. Mar. 2005, Accepted: 12. Mar. 2005, Published: 13. Mar. 2005
This work has been previously published at the Honeynet Project, Scan of the Month 33.
Abstract
Rather than doing another complete analysis of the binary, I will present the techniques I used in the
challenge and how I implemented them. The Scan of the Month 33 was released by the Honeynet Project in
November 2004. I invite everyone to read the excellent submissions we received this month once they have read
my paper. I am presenting the binary from the protection author's point of view, while they presented it from the
analyst's point of view. You will learn the methods and techniques used to protect / unprotect a binary with this
month's challenge. A lot of weaknesses were left in this binary on purpose, and they will be presented here.
Keywords: Software Protection; Reverse Code Engineering; Linux; Anti-Debugging; Anti-Anti-Debugging
1. Introduction
This month's challenge is to analyze an unknown binary, in an effort to reinforce the value of reverse
engineering, and improve (by learning from the security community) the methods, tools and
procedures used to do it. This challenge is similar to SotM 32. However, this binary has mechanisms
implemented to make the binary much harder to analyze, to protect against reverse engineering.
Skill Level: Advanced/Expert
All we are going to tell you about the binary is that it was 'found' on a WinXP system and has now been
sent to you for analysis. You will have to analyse it in depth and get as much information as possible
about its inner workings and about the goal of the binary. The main goal of this challenge is to teach
people how to analyse heavily armored binaries. Such techniques could be used in the future, and it's
time to get used to them.
2. Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered
Many techniques have been used in order to slow down analysis and break reverse engineers' tools:
• PE Header Modifications
Many fields of the PE header were modified in order to disturb analysing tools, and thus, the
Reverse Engineer. I will quickly cover the most important changes:
->Optional Header
Magic: 0x010B (HDR32_MAGIC)
MajorLinkerVersion: 0x02
MinorLinkerVersion: 0x19 -> 2.25
SizeOfCode: 0x00000200
SizeOfInitializedData: 0x00045400
SizeOfUninitializedData: 0x00000000
AddressOfEntryPoint: 0x00002000
BaseOfCode: 0x00001000
BaseOfData: 0x00002000
ImageBase: 0x00DE0000 <--- "Non Standard" ImageBase
SectionAlignment: 0x00001000
FileAlignment: 0x00001000
MajorOperatingSystemVersion: 0x0001
MinorOperatingSystemVersion: 0x0000 -> 1.00
MajorImageVersion: 0x0000
MinorImageVersion: 0x0000 -> 0.00
MajorSubsystemVersion: 0x0004
MinorSubsystemVersion: 0x0000 -> 4.00
Win32VersionValue: 0x00000000
SizeOfImage: 0x00049000
SizeOfHeaders: 0x00001000
CheckSum: 0x00000000
Subsystem: 0x0003 (WINDOWS_CUI)
DllCharacteristics: 0x0000
SizeOfStackReserve: 0x00100000
SizeOfStackCommit: 0x00002000
SizeOfHeapReserve: 0x00100000
SizeOfHeapCommit: 0x00001000
LoaderFlags: 0xABDBFFDE <--- Bogus Value
NumberOfRvaAndSizes: 0xDFFFDDDE <--- Bogus Value
The "standard" ImageBase usually is 400000 for Win32 applications and Reverse Engineers
are used to analyse programs with such an ImageBase. While it isn't a protection by itself, this
simple modification will confuse some Reverse Engineers, because they aren't used to such
memory addresses.
"Anti" OllyDbg:
LoaderFlags and NumberOfRvaAndSizes were modified. I reverse engineered
OllyDbg and Soft ICE to find a few tricks that could slow down the analysis of a binary.
With those two modifications, Olly will claim that the binary isn't a good image and will
eventually run the application without breaking at its entry point. This could be a bad thing if
you wanted to debug malware on your computer, because you would get infected.
Anti Soft ICE : Blue Screen of Death and no Chocolate:
The NumberOfRvaAndSizes field has been modified in order to reboot any computer
running a recent version of Soft ICE. While disassembling the PE loader of Soft ICE, I found
a very critical vulnerability that allows a binary to crash any computer running
Soft ICE without any code execution. This vulnerability (bug) has been reported to
Compuware and should be fixed in the next version. Apparently the crash didn't happen for some of
the authors of the submissions, for some reason. Oh well.
Here is the disassembly of the Soft ICE PE loader, showing why it reboots your computer:
.text:000A79FE
.text:000A79FE loc_A79FE: ; CODE XREF: sub_A79B9+31j
.text:000A79FE ; sub_A79B9+3Cj
.text:000A79FE ; DATA XREF: .text:00012F9Bo
.text:000A79FE sti
.text:000A79FF mov esi, ecx
.text:000A7A01 mov ax, [esi]
.text:000A7A04 cmp ax, 'ZM'
.text:000A7A08 jnz not_PE_file
.text:000A7A08
.text:000A7A0E mov edi, [esi+_IMAGE_DOS_HEADER.e_lfanew]
.text:000A7A11 add edi, esi
.text:000A7A13 mov ax, [edi]
.text:000A7A16 cmp ax, 'EP'
.text:000A7A1A jnz not_PE_file
.text:000A7A1A
.text:000A7A20 movzx ecx, [edi+IMAGE_NT_HEADERS.FileHeader.NumberOfSections]
.text:000A7A24 or ecx, ecx
.text:000A7A26 jz not_PE_file
.text:000A7A26
.text:000A7A2C mov eax, [edi+IMAGE_NT_HEADERS.OptionalHeader.NumberOfRvaAndSizes]
.text:000A7A2F lea edi, [edi+eax*8+IMAGE_NT_HEADERS.OptionalHeader.DataDirectory]
.text:000A7A33 mov eax, ecx
.text:000A7A35 imul eax, 28h
.text:000A7A38 mov al, [eax+edi] ; CRITICAL BUG! One can force EAX+EDI to be equal to zero.
.text:000A7A38 ; Reading at [0] in ring 0 isn't nice eh ;-)
.text:000A7A3B
.text:000A7A3B loc_A7A3B: ; DATA XREF: .text:00012FA5o
.text:000A7A3B cli
.text:000A7A3C call sub_15C08
.text:000A7A3C
.text:000A7A41 mov byte_FA259, 0
.text:000A7A48 push eax ; Save EAX
.text:000A7A49 mov eax, dword_16B56F ; EAX is modified by a saved dword
.text:000A7A4E mov dr7, eax ; Debug Register 7 take the value in EAX
.text:000A7A51 pop eax ; EAX is restored
.text:000A7A52 mov dword_FC6CC, esp
.text:000A7A58 mov esp, offset unk_FBABC
.text:000A7A5D and esp, 0FFFFFFFCh
.text:000A7A60 xor al, al ; AL is zeroed? Why this mov al, [eax+edi] then ?
.text:000A7A60 ; I don't see the point. old code?
.text:000A7A62 call sub_4D2EB
.text:000A7A62
.text:000A7A67 call sub_36AC1
.text:000A7A67
.text:000A7A6C xor edx, edx
.text:000A7A6E
.text:000A7A6E loc_A7A6E: ; CODE XREF: sub_A79B9+124j
.text:000A7A6E call sub_74916
.text:000A7A6E
As you can see from the code above, we can force Soft ICE to read at memory location [0] or
something similar using a special value inside the PE header. For this binary I didn't bother
calculating the exact value needed to read at address [0], which may explain why it didn't crash for
some people. I won't explain how to calculate this special value because it is trivial and I don't
want Darklords to use that trick without a little brainstorming.
To fix this problem, one needs to patch the value in the PE header. The standard value for
NumberOfRvaAndSizes is 0x10. Just patch this value in the PE header and the Soft ICE
wrecking will be gone. So will the OllyDbg problem, because it depends on BOTH fields
being modified. You can also zero the other field if you want.
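The header fix described above can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the paper: the function names are mine, the offsets 88 and 92 are the positions of LoaderFlags and NumberOfRvaAndSizes inside a PE32 optional header, and the demo header is a synthetic buffer that carries only the fields the sketch touches.

```python
import struct

OPT_LOADER_FLAGS = 88    # offset of LoaderFlags inside the PE32 optional header
OPT_NUM_RVA_SIZES = 92   # offset of NumberOfRvaAndSizes


def patch_pe_header(image: bytearray) -> bool:
    """Reset the two bogus fields in place; return True if anything changed."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    opt = e_lfanew + 4 + 20          # skip the signature and IMAGE_FILE_HEADER
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic != 0x10B:
        raise ValueError("this sketch only handles PE32 optional headers")
    changed = False
    (count,) = struct.unpack_from("<I", image, opt + OPT_NUM_RVA_SIZES)
    if count > 16:                   # 0x10 is the standard value
        struct.pack_into("<I", image, opt + OPT_NUM_RVA_SIZES, 16)
        changed = True
    (flags,) = struct.unpack_from("<I", image, opt + OPT_LOADER_FLAGS)
    if flags != 0:
        struct.pack_into("<I", image, opt + OPT_LOADER_FLAGS, 0)
        changed = True
    return changed


def make_demo_header() -> bytearray:
    """Synthetic buffer holding the bogus values listed in the optional header dump."""
    image = bytearray(0x200)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x80)
    image[0x80:0x84] = b"PE\0\0"
    opt = 0x80 + 24
    struct.pack_into("<H", image, opt, 0x10B)
    struct.pack_into("<I", image, opt + OPT_LOADER_FLAGS, 0xABDBFFDE)
    struct.pack_into("<I", image, opt + OPT_NUM_RVA_SIZES, 0xDFFFDDDE)
    return image
```

On a real file one would read the bytes into a `bytearray`, call `patch_pe_header`, and write a copy back out, keeping the original untouched.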
• Section Modification: Or how to kill many tools.
->Section Header Table
1. item:
Name: CODE
VirtualSize: 0x00001000
VirtualAddress: 0x00001000
SizeOfRawData: 0x00001000
PointerToRawData: 0x00001000
PointerToRelocations: 0x00000000
PointerToLinenumbers: 0x00000000
NumberOfRelocations: 0x0000
NumberOfLinenumbers: 0x0000
Characteristics: 0xE0000020
(CODE, EXECUTE, READ, WRITE)
2. item:
Name: DATA
VirtualSize: 0x00045000
VirtualAddress: 0x00002000
SizeOfRawData: 0x00045000
PointerToRawData: 0x00002000
PointerToRelocations: 0x00000000
PointerToLinenumbers: 0x00000000
NumberOfRelocations: 0x0000
NumberOfLinenumbers: 0x0000
Characteristics: 0xC0000040
(INITIALIZED_DATA, READ, WRITE)
3. item:
Name: NicolasB
VirtualSize: 0x00001000
VirtualAddress: 0x00047000
SizeOfRawData: 0xEFEFADFF <--- BIG Size of section on the disk.
PointerToRawData: 0x00047000
PointerToRelocations: 0x00000000
PointerToLinenumbers: 0x00000000
NumberOfRelocations: 0x0000
NumberOfLinenumbers: 0x0000
Characteristics: 0xC0000040
(INITIALIZED_DATA, READ, WRITE)
4. item:
Name: .idata
VirtualSize: 0x00001000
VirtualAddress: 0x00048000
SizeOfRawData: 0x00001000
PointerToRawData: 0x00047000
PointerToRelocations: 0x00000000
PointerToLinenumbers: 0x00000000
NumberOfRelocations: 0x0000
NumberOfLinenumbers: 0x0000
Characteristics: 0xC0000040
(INITIALIZED_DATA, READ, WRITE)
From this information, we can conclude a few things. First, the binary doesn't seem to be
compressed, because the Virtual Address and Size match the Raw Offset and Size, with one
exception: the NicolasB section. This section has an extremely big size of raw data, which will
crash a few tools and make a few others very, very slow.
IDA will try to allocate a LOT of memory because it thinks that the section is THAT big,
turning your computer into a very slow turtle ;-). Eventually it will load the file or run out of
memory, depending on the computer you are using to do the analysis.
This modification will also create havoc with many tools such as Objdump, PE editors, some
memory dumpers etc. It is very easy to fix this problem: you need to correct the Raw Size. If
you look at the section following this special one, you will find that it starts at the very same
Raw Offset. This means that the special section actually occupies no space on the disk. You can therefore
safely replace the big value by zero.
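The same reasoning can be written down as a small check. This is a hedged sketch: the `Section` class and `fix_raw_sizes` are hypothetical names, the offsets and sizes are the ones from the section table above, and the file size of 0x48000 is an assumption (last raw offset plus last raw size), since the paper does not state it.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    raw_offset: int   # PointerToRawData
    raw_size: int     # SizeOfRawData


def fix_raw_sizes(sections, file_size):
    """Clamp each SizeOfRawData to the room actually available on disk."""
    # sorted() is stable, so sections sharing a raw offset keep their table order.
    ordered = sorted(sections, key=lambda s: s.raw_offset)
    for i, sec in enumerate(ordered):
        # Raw data can extend at most to the next section's raw offset, or to EOF.
        limit = ordered[i + 1].raw_offset if i + 1 < len(ordered) else file_size
        room = max(limit - sec.raw_offset, 0)
        if sec.raw_size > room:
            sec.raw_size = room
    return sections


SECTIONS = [
    Section("CODE", 0x1000, 0x1000),
    Section("DATA", 0x2000, 0x45000),
    Section("NicolasB", 0x47000, 0xEFEFADFF),
    Section(".idata", 0x47000, 0x1000),
]
ASSUMED_FILE_SIZE = 0x48000   # assumption, not stated in the paper

fix_raw_sizes(SECTIONS, ASSUMED_FILE_SIZE)
print([(s.name, hex(s.raw_size)) for s in SECTIONS])
```

NicolasB ends up with a raw size of zero because .idata starts at the same raw offset, which matches the manual fix described above.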
Protection Weakness:
While writing this binary, I knew people were going to patch the PE header, but I deliberately
did no integrity checks. Originally I wanted to use the values in the PE header as keys to
decrypt a few layers of the protection; the result would have been a non-working binary if
the header had been changed.
I have also changed a few other things in the PE header, but nothing of real interest here. (who
said Cosmetic?)
• Junk Code
All through the binary, I have added junk code between real instructions in order to make the
analysis a little harder. The junk code consists of long blocks of code that do nothing but fancy
operations to disturb the analyst, especially when he chooses to do a static analysis of the
binary. Each block of junk code is different and has been generated by a personal tool, a
Thrash generator which creates macros to be inserted in the source code around real
instructions.
Here is how it looks inside a disassembler:
The junk code starts with a pushad (save all register states onto the stack) and finishes with a
popad (restore register states). Here is the end of a block of junk:
Protection Weakness:
The thrash generator isn't perfect (at least with the options I have used here ;) and it is easy to
find the start and the end of a block of junk code: the junk code is bounded by pushad/popad.
When I wrote this binary I was aware of this problem, but it is a perfect real-life example of a
protection weakness. It allows reverse engineers to practice IDA/OllyDbg scripting. Very
interesting scripts were found in the submissions. I invite you to have a look at them if you
don't know how to write one. When I wrote the binary, I already had a better version of my
Thrash generator that doesn't put any pushad/popad around the blocks of useless code, but we
will keep it for another challenge, if any.
• SEH - Structured Exception Handling
Windows SEH was used extensively in this binary. It allows one to access the context
structure of the current application and therefore privileged registers such as the Debug
Registers. Those registers are used by hardware breakpoints (BPM). If you can access them,
you can also erase the hardware breakpoints.
• Timing Detection Through SEH
Here is a little detection I invented to detect debuggers. If we combine SEH (and its access to the
context structure) with the well-known timing detection technique, we can detect a lot of Ring 3
debuggers and tracers. The idea is to read the Time Stamp Counter using RDTSC (basically the number of
cycles executed by the CPU) and then to generate an exception.
In the exception handler, we can access the EAX register (previously set by RDTSC) in
the context structure, which holds the low 32 bits of the TSC. In the exception handler, we use RDTSC one
more time to get the current TSC value. Now we can compare both TSC values to see whether the
program has been debugged/traced or not. If so, the difference in
cycles will be huge, which triggers the payload. In this binary, I just modified EIP through
the context structure. The application resumes at a different location, skipping mandatory
instructions, and eventually crashes. It seems that on some versions of Windows it
doesn't work as expected because of the use of the CPUID instruction, which modifies
the ECX register.
The detection became less stealthy because of this "bug", but it would still have been a matter
of time until someone discovered it anyway. Many people wondered why I used CPUID in the
program before RDTSC. The reason is that recent CPUs such as the P4 have a feature called
out-of-order execution. CPUID is a serializing instruction: it makes the CPU finish the
preceding instructions before continuing, which avoids false positives in the debugger detection. If you
don't serialize, you don't know in which order the CPU is
going to execute your code. It can be different from your source code. Sometimes that will
create a false positive and your program will crash for no reason.
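The shape of this check can be mimicked in Python. This is only an analogy under stated assumptions: `time.perf_counter_ns` stands in for RDTSC, a Python exception stands in for the SEH dispatch, the threshold reuses the paper's E0000h figure as nanoseconds rather than cycles, and the names `Probe` and `traced` are made up.

```python
import time

THRESHOLD = 0xE0000   # the paper's limit is in TSC cycles; used here as nanoseconds


class Probe(Exception):
    """Carries the first timestamp into the handler, like EAX in the context structure."""

    def __init__(self, stamp: int):
        super().__init__(stamp)
        self.stamp = stamp


def traced(clock=time.perf_counter_ns, threshold=THRESHOLD) -> bool:
    """Return True when the gap between raise and handler looks like single-stepping."""
    try:
        raise Probe(clock())           # first reading, taken just before the exception
    except Probe as exc:
        elapsed = clock() - exc.stamp  # second reading, taken inside the handler
        return elapsed > threshold


if __name__ == "__main__":
    print("debugger suspected" if traced() else "looks clean")
```

Passing the clock as a parameter is a design choice for the sketch: it lets the check be exercised with a fake clock, which also shows why intercepting the time source, as the weakness section for this check notes, defeats it.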
Here is the code of this detection:
E0000h is the maximum cycles difference accepted by this detection. If the number is bigger,
then a debugger is most likely running and debugging our application.
Protection Weakness:
I have used a fixed value for the number of cycles: E0000h. I could have used a random value
rather than a constant (my layer generator can actually do it), which would make
scanning for this constant useless. I could also have used different instructions for each SEH to
make the creation of a generic pattern difficult. The biggest weakness of this detection is the
constant and the use of the same instructions for every check. It is also possible to write a
kernel-mode driver that catches every execution of RDTSC (see the Intel documentation for further
information) and returns very similar values, thus bypassing the detection completely.
• BPX Detection:
As we are going to use API functions, we have to protect them from being BPX'ed by an
attacker. Rather than using GetProcAddress to get the API address and then checking for an
int 3 opcode (0xCC) in the API function code, I used a different method: I directly access
the Import Table, more precisely the Import Address Table, to read the API function address
and then start searching for breakpoints.
The int 3 opcode is 0xCC and is well known to reverse engineers. To make the check a little less
obvious, I obfuscated it using a "SHR" (Shift Right) instruction:
0x660 shr 3 = 0xCC ;-). The program then checks four bytes at the API function entry point,
looking for a breakpoint. If a breakpoint is found, I use a funny way to crash the
application: I use RDTSC to generate a pseudo-random number and put this number onto
the stack. To modify EIP, I simply use the RET instruction, which transfers us to a random
memory address, crashing the application. Each time a detection occurs, the address is
different and thus hard to monitor. The crash occurs far from the detection code, and Soft ICE's
FAULT ON won't catch it either.
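A minimal Python sketch of the four-byte check follows. The function name and the sample byte strings are assumptions made for illustration; only the 0x660 shr 3 trick and the four-byte window come from the text.

```python
# 0x660 >> 3 evaluates to 0xCC, so the int 3 opcode never appears as a literal.
INT3 = 0x660 >> 3


def has_breakpoint(entry_bytes: bytes, window: int = 4) -> bool:
    """Report whether any of the first `window` bytes equals the int 3 opcode."""
    return INT3 in entry_bytes[:window]


# push ebp / mov ebp, esp / sub esp, 10h : an ordinary function prologue
clean = bytes.fromhex("558BEC83EC10")
# The same prologue with a debugger's int 3 written over the first byte
patched = bytes.fromhex("CC8BEC83EC10")

print(hex(INT3), has_breakpoint(clean), has_breakpoint(patched))
```

The sketch returns a flag instead of crashing; in the binary the positive case leads to the RDTSC/RET payload described above.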
Protection Weakness:
First, the imports aren't protected, so anyone can read the imported functions from the
binary. From the Import Table we can see that printf, GetCommandLineA and ExitProcess are
used. This is a weakness. A reverse engineer can put breakpoints on those functions, or at
least guess that they are going to be used at some point. In the case of our binary, one can guess
that the application is waiting for a special command line. A solution would be to load the
Import Table manually.
For this we could use a home-made GetProcAddress function to browse the Export Table of
the DLLs we want to import functions from, and get the address of the API function from
there. A Kernel32 address is always on the stack when a binary is started, so we could have
used this value to get the DLL's ImageBase (or use the PEB, SEH chaining etc.). We would
then have everything needed to get the address of LoadLibrary, which allows us to load ANY DLL
and thus to get the address of ANY API function. With this method, we don't need any Import
Table at all.
Well actually, this isn't true. There is a mandatory thing to do to keep compatibility with all
versions of Windows. We have to create a very small Import Table, with at least ONE import
from Kernel32, else the binary won't run on Windows 2000. The Windows 2000 PE Loader is
different from the one in Windows XP. XP doesn't care whether there is any import table or
not.
The small Import Table is just for compatibility issue, the real import table is encrypted and
will be decrypted at runtime by the protection.
Then it is just a matter of loading the imports, mimicking the operating system. We need to put
the API addresses in the Import Address Table (of the decrypted Import Table) manually. The
reverse engineer has no clue about the API functions used by the binary until he gets to the
part of the code that decrypts and loads the imports.
The BPX protection has a few weaknesses. I only check four bytes at the API entry point,
which can easily be bypassed if the API has many instructions: one could put a breakpoint on
the first instruction after the 4-byte boundary. A better check would use a Length
Disassembler Engine (LDE), which tells us the size of each instruction. With this, we can safely
scan a lot of instructions without triggering any false positive.
A genuine instruction can contain the byte 0xCC without being a breakpoint, e.g. mov
eax, 0x4010CC. The detection would trigger a false positive on this instruction because of the
0xCC inside it. An LDE, on the other hand, would tell us the size of this instruction (5 bytes).
An int 3 (breakpoint) is either one or two bytes (0xCC or 0xCD 0x03). We would therefore
skip the current instruction and check the following one.
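To make the difference concrete, here is a toy comparison of the naive byte scan with a length-aware scan. The length table is a deliberately tiny assumption covering a handful of one-byte-opcode x86 instructions; a real LDE handles prefixes, ModR/M bytes and the full opcode map. All function names are hypothetical.

```python
def insn_length(code: bytes, pos: int) -> int:
    """Length of the instruction at pos, for a very small subset of x86."""
    op = code[pos]
    if 0x50 <= op <= 0x5F:          # push r32 / pop r32
        return 1
    if 0xB8 <= op <= 0xBF:          # mov r32, imm32
        return 5
    if op in (0x90, 0xC3, 0xCC):    # nop, ret, int 3
        return 1
    if op == 0xCD:                  # int imm8
        return 2
    raise ValueError(f"opcode {op:#04x} is outside this toy table")


def naive_scan(code: bytes, window: int = 4) -> bool:
    """The check used in the binary: any 0xCC among the first bytes."""
    return 0xCC in code[:window]


def lde_scan(code: bytes, max_insns: int = 8) -> bool:
    """Look for int 3 only at instruction boundaries."""
    pos = 0
    for _ in range(max_insns):
        if pos >= len(code):
            break
        if code[pos] == 0xCC or code[pos:pos + 2] == b"\xCD\x03":
            return True
        if code[pos] == 0xC3:       # stop at ret
            break
        pos += insn_length(code, pos)
    return False


# mov eax, 0x4010CC / ret : no breakpoint, but a 0xCC byte inside the operand
genuine = bytes.fromhex("B8CC104000C3")
# push ebp / mov eax, 0 / int 3 / ret : breakpoint placed past the first four bytes
late_bpx = bytes.fromhex("55B800000000CCC3")

print(naive_scan(genuine), lde_scan(genuine))     # false positive vs. correct
print(naive_scan(late_bpx), lde_scan(late_bpx))   # missed vs. detected
```

The two samples cover both failure modes of the four-byte check: a false positive on an operand byte, and a breakpoint that sits just past the scanned window.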
Also, the BPX check is only done once per API, at a given location in the binary. Once we have
stepped over those checks, we can put a breakpoint on any API function without triggering
any error. This weakness was left unfixed on purpose because it is a common error in
protection systems.
There is another kind of BPX detection that will be described in the next section.
• The Crazy Layers
Here is a slightly more challenging protection. In order to protect the binary from being
disassembled, I have written an encryption layer generator that generates as many
layers as I want. For this binary, I used 175 layers. The layer generator has many options. Here
are the options from the config file (0 means disabled):
SEH=1
RANDOM_LAYER_SIZE=0
RANDOM_REGISTERS=1
RANDOM_ENCRYPTION=0
ENCRYPTED_RETURN_ADDRESS=1
TIMING_DETECTION=1
RANDOM_CONSTANT=0
JUNKS=0
PUSHAD_POPAD=1
RANDOM_ORDER=0
USE_DIFFERENT_LOOP_CODE=0
RANDOM_FIRST_BLOCK=0
NUMBER=175
I will comment on each option below:
SEH:
This tells my layer generator whether or not to use SEH inside the layers.
RANDOM_LAYER_SIZE:
This tells my layer generator to use a different size for each layer. This option wasn't
enabled, to simplify the analysis.
RANDOM_REGISTERS:
If this option is enabled, the layers all use different registers. Some kind of
"polymorphism". This option was enabled.
RANDOM_ENCRYPTION:
When this option is enabled, each layer has a different encryption algorithm. I didn't
enable this option, so all the layers have the same static encryption code. (Default layer)
ENCRYPTED_RETURN_ADDRESS:
This option encrypts the return address inside the layer. It prevents a simple patch from skipping the
SEH. This option was enabled.
TIMING_DETECTION:
Tells whether the layers must use timing detection or not. I enabled this option.
RANDOM_CONSTANT:
This option tells whether the timing detection uses a random value instead of a static one.
It wasn't enabled: all layers use the default value, E0000h. Enabling
this option also modifies the code that checks the difference between both TSC values.
JUNKS:
Enable or disable junk in the layers. I disabled this option because the layers are WAY
bigger when it is enabled. The resulting binary is too huge and slow if you use a big number
of layers.
PUSHAD_POPAD:
This option tells the layer generator whether or not to use pushad/popad around the junk code. The
layer generator directly uses the Thrash generator (external tool) I have programmed. I was
using pushad/popad in the junk code, which is why it is enabled. This option does nothing if the
JUNKS option is disabled.
RANDOM_ORDER:
Each layer uses a table to access parts of its code. If this option is enabled, each layer has a
random order of execution. I deliberately didn't enable this one.
USE_DIFFERENT_LOOP_CODE:
Each layer loops a given number of times. With this option, one can use different code to test
for the end of the loop. It makes it harder for the reverse engineer to find a removal pattern. This
option wasn't enabled; the default checking code was used.
RANDOM_FIRST_BLOCK:
This option allows one to use random values in the first elements of the layers' tables. You
will see in some submissions that the static values were used to bypass the layers. I didn't
enable this option, to see whether someone was going to use them or not.
NUMBER:
This is the number of layers the generator must produce. I used 175 layers in this challenge. I
can generate 65000 layers in a few seconds because the generator engine is programmed in
assembly language.
Presentation of the encryption layers:
Layer Selector
xor esi,esi ; ESI = 0
mad_loop175_1: ; Loop label
inc esi ; ESI++
mov edi,dword ptr [ebp+(esi*4)+EIPtable175_1] ; Grab block address
mov ebx,dword ptr [ebp+(esi*4)+RETable175_1] ; Grab "Encrypted"
; Return address
Add ebx, [ebp+_startloader] ; Add Base.
push ebx ; Save Return Address
; from the stack
Call tricky_call175_1 ; Fake call
db 0EBh,01,0E8h ; Some junk crap
fake_ret175_1: ; fake return address label.
Add edi, [ebp+_startloader] ; Add EDI Base. EDI now
; contains address of a block
; inside the layer.
jmp edi ; Execute that block.
return_addy175_1:
cmp esi, 4 ; When we get back from the
;block, we check whether we
;have done every blocks.
jnz mad_loop175_1 ; if we didn't, loop!
bpxcheck175_1: ; Label used for BPX check.
jmp @layer175_1
tricky_call175_1:
pop ebx ; Ret address is in EBX
jmp fake_ret175_1 ; Jmp to fake return address.
@layer175_1: ; end of the layer.
This is the main part of a layer. It loops through the layer's blocks in a somewhat
obfuscated way: it prepares the stack with return addresses and fakes a call. If you step over
this call with your debugger, the binary won't break and will simply run. If you were debugging
malware, you would get infected. And if you were analysing the binary, you would need to
restart from scratch (unless you have dumped your position regularly).
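The return-address trick can be reduced to 32-bit arithmetic. The sketch below is an interpretation, not code from the paper: each table entry stores the real return offset minus a per-block constant, and each block adds its constant back before its `ret`. The three keys are the constants that appear in the layer blocks listing; the return offset 0x1234 and the function names are hypothetical.

```python
MASK = 0xFFFFFFFF   # registers are 32 bits wide, so arithmetic wraps around

# Per-block constants as they appear in the layer blocks (inst175_1_1, _2_1, _3_1).
KEYS = (30560857, 41952561, 13007360)


def build_ret_table(return_offset: int, keys=KEYS):
    """What the generator stores: the real return offset minus each block's key."""
    return [(return_offset - key) & MASK for key in keys]


def run_block(stored: int, key: int) -> int:
    """What a block does before returning: add dword ptr [esp], key / ret."""
    return (stored + key) & MASK


RETURN_OFFSET = 0x1234   # hypothetical offset of return_addy relative to startloader
table = build_ret_table(RETURN_OFFSET)
print([hex(v) for v in table])                              # nothing resembles 0x1234
print([hex(run_block(v, k)) for v, k in zip(table, KEYS)])  # all blocks land on 0x1234
```

No stored value looks like a valid address, so reading the table statically does not reveal where a block returns to.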
Layers Blocks
include obfuscation/junk198.inc ; I have added a few junks macro
; manually in order to add a
; little fun :)
dec_loader175_1: ; Decrypt label
xor byte ptr [edx],cl ; Default options were used.
; Very simple encryption.
inc edx ; Code to decrypt++
dec ecx ; Loop index--
test ecx, ecx ; is ECX = 0 ?
jnz dec_loader175_1 ; no :( therefore we continue
; to decrypt.
; This encryption can be different for each layer if you enable the option in
; the layer Generator.
lea edx , [ebp+bpxcheck175_1] ; Grab address of BPX check.
cmp byte ptr [edx],0CCh ; Any break point ?
jnz return175_1 ; no. Good boy.
rdtsc ; Ah.. he did put a bpx..
; EAX = random value
push eax ; push eax on stack
ret ; Return to it :) Crash the
; poor guy.
return175_1: ; on return block
include obfuscation/junk199.inc ; a few junk
include obfuscation/junk19A.inc ; ditto.
SEHBLOCK 66137317 28513829 ; SEH block macro with
; keys in parameters.
ret ; return
inst175_2_1: ; another block of code
add dword ptr [esp], 41952561 ; fix return address and return.
ret
inst175_3_1: ; Another block.
mov ecx, (offset _end174_1- @layer175_1) ; Get Size of layer
add dword ptr [esp], 13007360 ; Fix return address and return
ret
inst175_1_1: ; Another block
lea edx, [ebp+@layer175_1] ; Get Layer address
add dword ptr [esp], 30560857 ; Fix the return address
; and return.
ret
EIPtable175_1 dd 000DEADh, (offset inst175_1_1 - offset startloader), (offset
inst175_2_1 - offset startloader), (offset inst175_3_1 - offset startloader),
(offset _end174_1 - offset startloader)
; This is a table of offsets used to redirect the code.
RETable175_1 dd 0031000h, (offset return_addy175_1 - offset startloader -
30560857) , (offset return_addy175_1 - offset startloader - 41952561),(offset
return_addy175_1 - offset startloader - 13007360),(offset return_addy175_1 -
offset startloader - 37623488)
; This is a table of return addresses with a little "encryption".
; Note the first member of each table: DEADh and 31000h. Those
; values are constants and can be randomized using the RANDOM_FIRST_BLOCK
; option in the layer generator.
The layer presented above was generated by the little layer generator engine I have
programmed. I have added comments for the readers.
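The decryption loop at the top of the layer is simple enough to model outside a debugger. The following Python sketch is only an illustration, not the author's tooling. It assumes that ECX starts at the layer size (as loaded by the `inst175_3_1` block) and that only CL, the low byte of the counter, serves as the XOR key, as in the listing. The function name and the sample bytes are made up.

```python
def decrypt_layer(encrypted: bytes) -> bytes:
    """Model of: xor byte ptr [edx],cl / inc edx / dec ecx / jnz."""
    out = bytearray(encrypted)
    ecx = len(encrypted)            # assumed initial ECX = layer size
    for i in range(len(out)):
        out[i] ^= ecx & 0xFF        # CL is the low byte of the loop counter
        ecx -= 1                    # dec ecx
    return bytes(out)


# XOR is its own inverse, so the same routine also "encrypts" test data.
sample = b"next layer code"
assert decrypt_layer(decrypt_layer(sample)) == sample
```

Because the key stream depends only on the layer size, a script that knows the size can decrypt a layer without running it.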
Protection Weakness:
Those layers have a few weaknesses. You can use a BPM (hardware breakpoint) on the next layer once you have passed the SEH that clears the debug registers. Another weakness is the static size of the layers. Using this information, one can pass the layers rather quickly, with a few SoftICE macros for instance. I deliberately left the random size option off to allow such attacks.
Those layers always use the same encryption algorithm, which allows one to write scripts to decrypt the binary. As you can read in a few submissions, some people did exactly that. I left this weakness in on purpose as well. In a previous challenge I used random encryption for each layer; this time I chose not to. It is also possible to bypass the 175 layers in a few seconds using a live approach. As we know which API functions are going to be used, we can set a breakpoint after the BPX checks have occurred. Another possibility is to create a little utility that patches the system DLLs in memory (each application has its own copy of the DLL) and redirects them to a place that you control. This way you can put breakpoints without triggering any detection code.
Speaking of patching the Windows DLL files, it is possible to patch ntdll to block access to the debug registers in the context structure, by hooking the exception handling mechanism of Windows. This allows one to put hardware breakpoints anywhere without ever having problems such as seeing one's debug breakpoints erased. The cool thing is that you don't even need a kernel-mode driver to do that. I leave this as an exercise for interested people.
• Virtual Machine
The final protection of the binary is a complete Virtual Machine I wrote for the challenge. I
have designed a virtual CPU that interprets my own assembly language. The Virtual
Machine is fairly simple to understand.
Virtual Machines seem to be a new trend in protection systems, so I thought it would be a good
thing to write one for such a challenge. The instruction encoding is trivial and could have
been a lot harder to understand. The first version I had in mind was a lot more complex. I
wanted not only to have a pseudo language, but also to program the instruction handlers
emulating real x86 instructions. Each handler would have been a few hundred instructions long and a
lot harder to analyse.
A small program was written in this Virtual Machine assembly language and used to
authenticate the user running the binary.
Read the next part for further information.
3. Something uncommon has been used to protect the
code from being reverse engineered, can you
identify what it is and how it works?
Even though a few protection systems use some kind of Virtual Machine, those aren't very
common, especially in malware and exploits.
Virtual CPU description and Inner working:
Registers:
REGISTERS STRUC
R0_ dd ? ; 000
R1_ dd ? ; 001
R2_ dd ? ; 002
COUNTER_ dd ? ; 003
EIP_ dd ? ; 004 -> reserved
STATE_ dd ? ; 005
REGISTERS ENDS
This is the original structure from my source code. Every register is a DWORD. Some registers
weren't used because they are reserved for a future version of the Virtual Machine. One can read "EIP_":
I planned to add another piece of information per instruction, but I didn't, because I didn't want it to be too
complex. I will add the ability to change the instruction pointer from any instruction. The result will be a
completely mad code flow: the instruction order in the file will have nothing to do with the real
execution flow.
The STATE register is a kind of mini EFLAGS. This register changes depending on other
instructions.
The COUNTER register is used for loop instructions, similar to the ECX register when we use the
LOOP instruction.
regs REGISTERS <>
R0 equ 000b
R1 equ 001b
R2 equ 010b
COUNTER equ 011b
EIP equ 100b
STATE equ 101b
Here are a few other definitions used in my program. I represented the registers in binary
because I wanted to do complex opcode decoding. I will do that in another version ;)
Registers Initialisation:
mov dword ptr [regs.R0_],"livE" ; Registers are initialized with a
; Slayer Song Title.
..
mov dword ptr [regs.R1_],"saH " ; Evil Has No Boundaries!
..
mov dword ptr [regs.R2_]," oN "
..
mov dword ptr [regs.COUNTER_],"nuoB"
..
mov dword ptr [regs.EIP_],"irad"
..
mov dword ptr [regs.STATE_],"! se"
At the start of the VM, I first initialise my own registers with the title of a song by a thrash metal
band. This title was selected because I planned to do really evil things with the Virtual Machine. It isn't
as hard as the initial version, but still evil enough to keep that funny string ;-).
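The constants in the listing look reversed for a reason. On little-endian x86, an assembler that puts the first character of a four-character constant in the most significant byte stores the characters in memory in reverse order. MASM behaves this way; that the author's assembler does the same is an assumption here. Reading the six registers back as raw bytes then gives the song title. A small Python check, with the constants copied from the listing:

```python
import struct

# Four-character constants from the listing, in REGISTERS order.
constants = ["livE", "saH ", " oN ", "nuoB", "irad", "! se"]


def dword_value(text: str) -> int:
    # Assumption: the first character lands in the most significant byte.
    return int.from_bytes(text.encode("ascii"), "big")


# Each DWORD is stored little-endian, so its bytes come out reversed.
memory = b"".join(struct.pack("<I", dword_value(c)) for c in constants)
print(memory.decode("ascii"))   # prints: Evil Has No Boundaries !
```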
There are about 34 instructions in the Virtual Machine (I count instructions with different uses
as unique).
I will present a few instruction handlers to explain the inner workings of the Virtual Machine, but not
every instruction will be presented here.
Pcode Fetcher:
The first thing the Virtual Machine does after the register initialisation is to fetch the Pcode entry point and
jump to the first Pcode handler.
movzx eax, byte ptr [esi] ; ESI is Pcode Entry Point. This code
; gets the first instruction Prefix.
mov edi, dword ptr [eax*4+poffset] ; It uses it with the offset table to
; find the Pcode family it has to
; execute.
movzx eax, byte ptr [esi+1] ; get second byte, use it as an
; index into last table.
; The VM now knows what instruction it
; has to emulate and goes to it.
JMPNEXT ; Emulate a jmp dword ptr [eax*4+edi]
; with Exception Handling and
; Context Manipulation.
; Jmp to the next Pcode
; instruction handler
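Stripped of the SEH and context manipulation hidden behind JMPNEXT, the fetcher is a two-level table lookup: the first byte selects a family table and the second byte selects a handler inside it. The Python sketch below models only that lookup. The handler functions are hypothetical placeholders; the opcode bytes are the ones used by the STOPVM, Load and VMXOR macros shown later in the article.

```python
from typing import Callable, Dict

Handler = Callable[[], str]


def load_handler() -> str:       # hypothetical placeholder
    return "LOAD"


def vmxor_handler() -> str:      # hypothetical placeholder
    return "VMXOR"


def stopvm_handler() -> str:     # hypothetical placeholder
    return "STOPVM"


# Plays the role of poffset: first byte -> family table -> handler.
POFFSET: Dict[int, Dict[int, Handler]] = {
    0x00: {0x00: load_handler},
    0x01: {0x03: vmxor_handler},
    0x02: {0x00: stopvm_handler},
}


def fetch(pcode: bytes, esi: int) -> Handler:
    family = POFFSET[pcode[esi]]       # movzx eax, byte ptr [esi] / mov edi, [eax*4+poffset]
    return family[pcode[esi + 1]]      # movzx eax, byte ptr [esi+1] / jmp [eax*4+edi]


print(fetch(bytes([0x01, 0x03, 2, 0, 0]), 0)())
```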
Examples of Instructions implemented inside the Virtual Machine:
Before I start with those examples, I would like to say that a few instructions present in the Virtual
Machine weren't used and were left as decoys. Three of them use self-modifying code. People have
reported that they don't work, but they should. The off-by-one difference is because the opcode is
called from other instruction handlers. Two instructions modify one instruction on the
fly as they need to execute a particular piece of code. They then restore the instruction state. I am too
lazy to check whether those instructions are really bugged or whether people didn't use the right parameters.
One of the _unused_ instructions HAS a bug, and I am glad some people noticed it. The instruction isn't
used; it is just a decoy. It is supposed to be a virtual BSWAP, but it
doesn't save the result of the swap. Another unused instruction is the INT 3. This instruction allows
one to put a breakpoint in a Pcode program and trace with a debugger from that instruction. I left this
instruction in the final Virtual Machine and I'm glad some people found it and abused it!
STOPVM
The first instruction I will present here is a very simple one. It tells the Virtual Machine to stop, and
the program returns to normal x86 assembly code.
@STOPVM:
pop dword ptr fs:[0] ; Im using SEH to jmp from handlers to
;handlers in the VM.
add esp,4 ; Therefore i need to remove the handler
; installed before i do anything.
popad ; i restore the registers..
..
push dword ptr [Pret] ; Put the Return Address (to get out of
; the VM) on the stack.
..
xor [esp],'HAX0 '; Decrypt it with a funny string:
;HAXO(R)
..
ret ; Get out of the VM.
This instruction does not use the bytecode fetcher because it doesn't need to jump to another handler. I
will now present a real instruction: a virtual PUSH.
LOAD
@Load:
pop dword ptr fs:[0]
add esp,4
popad
; Same as every handler, remove SEH and restore registers.
mov eax,dword ptr [esi+2] ; Get into EAX the first operand
; of the instruction.
xor eax,37195411h ; Decrypt it.
push eax ; Push it onto the stack.
mov eax,0FFFFFF3Fh ; EAX = FFFFFF3Fh
not eax ; EAX = not(EAX) = C0h
shr eax,5 ; EAX = EAX shr 5 = 6 :
; This is the instruction length
lea esi, [esi+eax] ; ESI = Instruction Pointer.
; Move the instruction pointer
; 6 bytes further.
movzx eax, byte ptr [esi] ; ESI now points to the new
; instruction to be executed.
mov edi, dword ptr [eax*4+poffset] ; It uses it with the offset
; table to find the Pcode family
;it has to execute.
movzx eax, byte ptr [esi+1] ; get second byte, use it as an
; index into last table.
; The VM now knows what instruction it
; has to emulate and goes to it.
JMPNEXT ; Emulate a jmp dword ptr [eax*4+edi]
; with Exception Handling and
; Context Manipulation.
; Jmp to the next Pcode
; instruction handler
As you can see from this little handler, the instruction is 6 bytes long. It takes only one parameter,
placed 2 bytes after the start of the instruction (ESI+2). The parameter is encrypted with
37195411h. The decrypted parameter is pushed onto the stack and then the Virtual Machine dispatches the
next instruction.
From this, we can say that this instruction is a push. Since PUSH is already an x86 instruction, I named
my virtual push LOAD.
One can use it like this: "LOAD number"
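The two small tricks in this handler, the obfuscated instruction length and the XOR-encrypted operand, can be checked with a few lines of Python. This is only a sketch for verification: the function names are made up, and the opcode bytes 00,00 are taken from the Load macro shown later in the article.

```python
import struct

LOAD_KEY = 0x37195411


def obfuscated_length() -> int:
    eax = 0xFFFFFF3F
    eax = ~eax & 0xFFFFFFFF      # not eax  -> 0xC0
    return eax >> 5              # shr eax,5 -> 6


def decode_load_operand(instr: bytes) -> int:
    (raw,) = struct.unpack_from("<I", instr, 2)   # dword at ESI+2
    return raw ^ LOAD_KEY                          # xor eax,37195411h


# Encode "LOAD 1234h" by hand, then decode it the way the handler does.
instr = bytes([0x00, 0x00]) + struct.pack("<I", 0x1234 ^ LOAD_KEY)
print(obfuscated_length(), hex(decode_load_operand(instr)))
```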
VMXOR
@VMXORDISPATCHER:
pop dword ptr fs:[0]
add esp,4
popad
; Same as every handler, remove SEH and restore registers.
movzx eax, byte ptr [esi+2] ; Get the register index used to access
; the Virtual CPU registers.
mov eax, dword ptr [regs+eax*4] ; Read the selected register
; (R0, R1, R2):
; EAX = value used by the XOR.
movzx ecx, byte ptr [esi+3] ; ECX = type of XOR.
; Byte ptr ? Word Ptr ? or
; Dword Ptr..
jmp dword ptr [xortable+ecx*4] ; Jmp to the good handler
; accordingly.
@VMXORBPTR:
movzx ecx, byte ptr [esi+4] ; Get the register index for the
; destination.
mov ecx, dword ptr [regs+ecx*4] ; ECX = value of the selected
; register (R0, R1, R2), used as
; the destination pointer.
xor byte ptr [ecx],al ; XOR BYTE PTR
add esi,5 ; Instruction Length is 5
movzx eax, byte ptr [esi] ; ESI now points to the new
; instruction to be executed.
mov edi, dword ptr [eax*4+poffset] ; It uses it with the offset
; table to find the Pcode family
; it has to execute.
movzx eax, byte ptr [esi+1] ; get second byte, use it as an
; index into last table.
; The VM now knows what
; instruction it has to emulate
; and goes to it.
JMPNEXT ; Emulate a jmp dword ptr
; [eax*4+edi] with Exception
; Handling and Context
; Manipulation.
; Jmp to the next Pcode
; instruction handler
@VMXORWPTR:
movzx ecx, byte ptr [esi+4] ; Get the register index for the
; destination.
mov ecx, dword ptr [regs+ecx*4] ; ECX = value of the selected
; register (R0, R1, R2), used as
; the destination pointer.
xor word ptr [ecx],ax ; XOR WORD PTR
add esi,5 ; Instruction Length is 5
movzx eax, byte ptr [esi] ; ESI now points to the new
; instruction to be executed.
mov edi, dword ptr [eax*4+poffset] ; It uses it with the offset
; table to find the Pcode family
; it has to execute.
movzx eax, byte ptr [esi+1] ; get second byte, use it as an
; index into last table.
; The VM now knows what
; instruction it has to emulate
; and goes to it.
JMPNEXT ; Emulate a jmp dword ptr
; [eax*4+edi] with Exception
; Handling and Context
; Manipulation.
; Jmp to the next Pcode
; instruction handler
@VMXORDPTR:
movzx ecx, byte ptr [esi+4] ; Get the register index for the
; destination.
mov ecx, dword ptr [regs+ecx*4] ; ECX = value of the selected
; register (R0, R1, R2), used as
; the destination pointer.
xor dword ptr [ecx],eax ; XOR DWORD PTR
add esi,5 ; Instruction Length is 5
movzx eax, byte ptr [esi] ; ESI now points to the new
; instruction to be executed.
mov edi, dword ptr [eax*4+poffset] ; It uses it with the offset
; table to find the Pcode family
; it has to execute.
movzx eax, byte ptr [esi+1] ; get second byte, use it as an
; index into last table.
; The VM now knows what
; instruction it has to emulate
; and goes to it.
JMPNEXT ; Emulate a jmp dword ptr
; [eax*4+edi] with Exception
; Handling and Context
; Manipulation.
; Jmp to the next Pcode
; instruction handler
From this piece of code we can learn many things. The XOR instructions are encoded in 5 bytes and
take two parameters: one register holds the value used for the XOR and one register holds a pointer to
the location to be XORed. There is also a byte saying whether it is an XOR BYTE PTR, an XOR WORD
PTR or an XOR DWORD PTR.
This instruction handler therefore implements the virtual XOR instruction.
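To make the operand layout concrete, here is a minimal Python emulation of one VMXOR under stated assumptions. The values of the "kind" byte (0, 1, 2) are hypothetical, since the article does not list the xortable order, and memory is modelled as a flat bytearray indexed by the register value, whereas the real VM uses process addresses.

```python
from typing import List

# Hypothetical values for the "kind" byte; the article does not list them.
BPTR, WPTR, DPTR = 0, 1, 2
WIDTH = {BPTR: 1, WPTR: 2, DPTR: 4}


def vmxor(instr: bytes, regs: List[int], memory: bytearray) -> int:
    """Apply one 5-byte VMXOR and return the instruction length."""
    src, kind, dst = instr[2], instr[3], instr[4]
    width = WIDTH[kind]
    value = regs[src] & ((1 << (8 * width)) - 1)    # AL, AX or EAX
    addr = regs[dst]                                 # register used as a pointer
    old = int.from_bytes(memory[addr:addr + width], "little")
    memory[addr:addr + width] = (old ^ value).to_bytes(width, "little")
    return 5


# "VMXOR R2 BPTR R0": XOR the byte pointed to by R0 with the low byte of R2.
regs = [0x10, 0, ord("S")]
memory = bytearray(32)
memory[0x10] = 0x12
vmxor(bytes([0x01, 0x03, 2, BPTR, 0]), regs, memory)
print(hex(memory[0x10]))
```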
How were the virtual instructions used to create a program?
I will now show how I created virtual instructions, because the x86 assembler doesn't know them
and will never assemble a LOAD or a VMXOR. To do so, I used a very simple approach: macros. For each
instruction, a corresponding macro was created and is used to encode the instruction for me. This
way I can write a program with my assembly mnemonics without caring about the opcode
representation. Here are the 3 macros for the example virtual instructions described
above:
STOPVM macro
db 02,00
endm
This is the macro for the STOPVM instruction. Usage: STOPVM
Load macro x
db 00,00
dd x xor 37195411h
endm
This is the macro for the LOAD instruction. Usage: LOAD x
VMXOR macro reg0,kind,reg1
db 01,03,reg0,kind,reg1
endm
This is the macro for the VMXOR instruction. Usage: VMXOR Rx xPTR Rx
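The same three encodings can be mirrored in a short Python sketch, which is handy for producing test bytecode. This is an illustration only, not the author's tool: the register indices come from the `equ` definitions earlier in the article, while the numeric value chosen for BPTR is an assumption.

```python
import struct

LOAD_KEY = 0x37195411


def stopvm() -> bytes:
    return bytes([0x02, 0x00])                       # db 02,00


def load(x: int) -> bytes:
    # db 00,00 followed by dd x xor 37195411h (little-endian dword)
    return bytes([0x00, 0x00]) + struct.pack("<I", (x ^ LOAD_KEY) & 0xFFFFFFFF)


def vmxor(reg0: int, kind: int, reg1: int) -> bytes:
    return bytes([0x01, 0x03, reg0, kind, reg1])     # db 01,03,reg0,kind,reg1


R0, R2 = 0b000, 0b010     # register indices from the article
BPTR = 0                  # hypothetical value for the kind byte

program = load(0) + vmxor(R2, BPTR, R0) + stopvm()
print(program.hex(" "))   # prints: 00 00 11 54 19 37 01 03 02 00 00 02 00
```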
P-code Program used in this challenge:
The P-code is my own assembly language, thus IDA doesn't know anything about it. Here is how it
looks under a disassembler:
OK, it doesn't look so good. So now, here is the complete program (copied from my source) that I
have written in my OWN assembly language. Cool, isn't it? :-)
CODE@:
Pcode1:
MOVE pcrypt COUNTER
LOADPTR startpcodecrypted
RestoreREG R0
MOVE 'S' R2
decryptpcode:
VMXOR R2 BPTR R0
INCR R0
DECR COUNTER
BNZ decryptpcode
startpcodecrypted:
MOVE 05CC80E31h R1
APICALL GetCommandLineA
SCANB " " 255h
BZ youwishdude
DLOAD R0 R2
ADDREG R2 01D9BDC45h
ADDREG R1 74519745h
SUBREG R2 0AD45DFE2h
ADDREG R1 0DEADBEEFh
ADDREG R2 "hell"
SUBREG R1 17854165h
SUBREG R2 "Awai"
ADDREG R1 "show"
ADDREG R2 "its "
SUBREG R1 " no "
ADDREG R2 "driv"
ADDREG R1 "merc"
SUBREG R2 "nuts"
SUBREG R1 "y!!!"
SUBREG R2 "eh?!"
ANDREG R2 0DFFFFFFFh
LOADREG R2
LOADREG R1
CMPQ firstcheckdone
CLEAR COUNTER
BZ youwishdude
firstcheckdone:
INCR R0
ADDREG R0 2
INCR R0
WLOAD R0 R1
LOADREG R1
RESTOREREG R2
LOADREG R0
LOADPTR tricky-98547h
RESTOREREG R0
ADDREG R0 98548h
DECR R0
VMXOR R2 WPTR R0
RESTOREREG R0
BLOAD R0 R2
ADDREG R0 2
BLOAD R0 R1
RADD R2 R1
VMCALL sub_check_routine
tricky:
ENCRYPTEDCLEAR COUNTER ; This one get patched at run time!
cracked:
LOADREG COUNTER
LOADPTR congrats
APICALL printf
CleanStack 8
BR outout
youwishdude:
LOAD 0
LOADPTR notgood
APICALL printf
CleanStack 8
outout:
STOPVM
sub_check_routine:
MOVE 'L' R1
INCR R1
INCR R1
ADDREG R1 5
DECR R1
SUBREG R1 4
SUBREG R2 'Z'
CMPREG R1 R2
BNZ youwishdude
DECR R0
BLOAD R0 R2
ADDREG R0 2
BLOAD R0 R1
RADD R2 R1
INCR R2
SUBREG R2 4Eh
LOADPTR retdecrypt-0DEADh ; push ptr to patch
RESTOREREG R0
ADDREG R0 0DEACh
INCR R0
VMXOR R2 BPTR R0
MOVE msgcrypt COUNTER
LOADPTR goodboy
RestoreREG R0
INCR R2
decryptmsg:
VMXOR R2 BPTR R0
INCR R0
DECR COUNTER
BNZ decryptmsg
retdecrypt:
VMRETCRYPTED
DATA@:
notgood db "Please Authenticate!",10,13,0
goodboy:
congrats db "Welcome...",10,13
db "Exploit for it doesn't matter 1.x Courtesy of Nicolas Brulez",0
goodboyend:
This little routine is the password protection used in the binary. For more information about it, I will
let you read the submissions. As you can see, the password protection is VERY SHORT. I could have
written a very complex algorithm with hundreds of lines to make it harder to analyse. This code is also free
of junk; I could have placed P-code junk instructions inside the program. The password check
was very simple and sadly, some people concentrated on the password rather than on the Virtual
Machine. Next time I will make it a lot more complex so people have no choice but to analyse the
Virtual Machine and its instruction set.
You can compare the original P-code program here with the ones inside the submissions to see that
they have done a very good job.
A Few notes regarding the Password Protection
The password is checked using a very simple algorithm, but it is also used to decrypt yet another part
of the pcode program. There is a little weakness allowing one to find the correct value without any
brute forcing or analysis of the Opcodes:
Here is the encrypted String :
0x14, 0x26, 0x2F, 0x20, 0x2C, 0x2E, 0x26, 0x6D,
0x6D, 0x6D, 0x49, 0x4E, 0x06, 0x3B, 0x33, 0x2F,
0x2C, 0x2A, 0x37, 0x63, 0x25, 0x2C, 0x31, 0x63,
0x2A, 0x37, 0x63, 0x27, 0x2C, 0x26, 0x30, 0x2D,
0x64, 0x37, 0x63, 0x2E, 0x22, 0x37, 0x37, 0x26,
0x31, 0x63, 0x72, 0x6D, 0x3B, 0x63, 0x00, 0x2C,
0x36, 0x31, 0x37, 0x26, 0x30, 0x3A, 0x63, 0x2C,
0x25, 0x63, 0x0D, 0x2A, 0x20, 0x2C, 0x2F, 0x22,
0x30, 0x63, 0x01, 0x31, 0x36, 0x2F, 0x26, 0x39,
0x43
Everyone knows that a C string ends with a null byte. Therefore, the value used to encrypt this string
is 0x43: the key is the last byte of the encrypted string, since X xor 0 = X. :-)
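The observation is easy to verify. The Python snippet below copies the encrypted bytes from the table above, takes the last byte as the key and XORs every byte with it. The variable names are made up for the illustration.

```python
encrypted = bytes([
    0x14, 0x26, 0x2F, 0x20, 0x2C, 0x2E, 0x26, 0x6D,
    0x6D, 0x6D, 0x49, 0x4E, 0x06, 0x3B, 0x33, 0x2F,
    0x2C, 0x2A, 0x37, 0x63, 0x25, 0x2C, 0x31, 0x63,
    0x2A, 0x37, 0x63, 0x27, 0x2C, 0x26, 0x30, 0x2D,
    0x64, 0x37, 0x63, 0x2E, 0x22, 0x37, 0x37, 0x26,
    0x31, 0x63, 0x72, 0x6D, 0x3B, 0x63, 0x00, 0x2C,
    0x36, 0x31, 0x37, 0x26, 0x30, 0x3A, 0x63, 0x2C,
    0x25, 0x63, 0x0D, 0x2A, 0x20, 0x2C, 0x2F, 0x22,
    0x30, 0x63, 0x01, 0x31, 0x36, 0x2F, 0x26, 0x39,
    0x43,
])

key = encrypted[-1]                          # encrypted terminator: 0 xor key = key
plain = bytes(b ^ key for b in encrypted)    # single-byte XOR over the whole string

print(hex(key))
print(plain[:-1].decode("ascii"))
```

The decrypted text matches the `congrats` string in the DATA@ section, including its 10,13 line ending and the terminating null byte.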
The other possible way to find the right value was to look at the code structure. We were
calling a routine, therefore there must be an instruction that performs a RET. This instruction is the virtual RET
implemented in the Virtual Machine. From this, we just had to find the opcode of this instruction to
compute the key.
4. Provide a means to "quickly" analyse this uncommon
feature.
With this question, I was expecting a disassembler for my Virtual Machine. A few people sent me
fully working disassemblers, so I didn't write yet another one. I invite you once again to have a look at
their submissions. One of the authors emailed me after the deadline with a working IDA processor
module, source code included. This processor module wasn't used in my judgement because it
was sent after the deadline, but it is well worth studying. It will be uploaded to the Honeynet web
site shortly after the publication of the results.
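To give an idea of what such a disassembler involves, here is a deliberately tiny Python sketch that knows only the three instructions documented in this article (LOAD, VMXOR, STOPVM). It is not one of the submitted tools. The names chosen for the VMXOR kind byte are an assumption, and truncated input is not handled.

```python
import struct
from typing import Iterator, Tuple

LOAD_KEY = 0x37195411
REG_NAMES = {0: "R0", 1: "R1", 2: "R2", 3: "COUNTER", 4: "EIP", 5: "STATE"}
# Hypothetical mapping for the VMXOR kind byte; the real values are not given.
KIND_NAMES = {0: "BPTR", 1: "WPTR", 2: "DPTR"}


def disassemble(code: bytes) -> Iterator[Tuple[int, str]]:
    ip = 0
    while ip < len(code):
        opcode = tuple(code[ip:ip + 2])              # two-byte opcode
        if opcode == (0x00, 0x00):                   # LOAD, 6 bytes
            (raw,) = struct.unpack_from("<I", code, ip + 2)
            yield ip, f"LOAD {raw ^ LOAD_KEY:#x}"
            ip += 6
        elif opcode == (0x01, 0x03):                 # VMXOR, 5 bytes
            src, kind, dst = code[ip + 2:ip + 5]
            yield ip, f"VMXOR {REG_NAMES[src]} {KIND_NAMES[kind]} {REG_NAMES[dst]}"
            ip += 5
        elif opcode == (0x02, 0x00):                 # STOPVM, 2 bytes
            yield ip, "STOPVM"
            ip += 2
        else:                                        # anything else: raw byte
            yield ip, f"db {code[ip]:#04x}"
            ip += 1


sample = bytes.fromhex("00 00 11 54 19 37 01 03 02 00 00 02 00")
for offset, text in disassemble(sample):
    print(f"{offset:04x}  {text}")
```

A full tool would add the remaining handlers (about 34 instructions according to the article) to the same dispatch.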
5. Which tools are the most suited for analysing such
binaries, and why?
In my opinion the best tools to analyse such binaries are interactive disassemblers and CPU emulators.
The disassembler can be used to analyse the code statically, remove the obfuscations, decrypt
binaries and so on. If it lets you write processor modules, you can even write a disassembler for
the Virtual Machine and thus do a full static analysis of the whole thing. A CPU emulator can be used
to quickly decrypt the code, the layers and so on. If it can be scripted not to show the obfuscations, you have a
perfect weapon. I don't like debuggers because they aren't reliable. I could easily have written a driver
that hooks the debug interrupts to decrypt the binary, for instance. Debuggers would then have been useless and
would have rebooted the computer if used.
6. Identify the purpose (fictitious or not) of the binary.
This binary is waiting for a user to authenticate with a password passed to the application
on the command line. Once the user has been identified, the binary prints a little message. It
looks like a fake exploit. In the real world, it could have been a real exploit protected from prying
eyes.
7. What is the binary waiting from the user? Please detail
how you found it.
The binary is waiting for a password on the command line. The password is used to access the
real program. To find this password, you have to reverse engineer the binary: decrypt every layer to
reach the Virtual Machine. This Virtual Machine runs a virtual program that checks the password
entered. One has to reverse engineer the Virtual Machine (or trace it blindly) in order to understand
its instruction set. After that, it is just a very simple algorithm using a few operations that are easy to reverse. I invite you
to read the submissions for details about the algorithm itself.
8. Bonus Question - What techniques or methods can you
think of that would make the binary harder to reverse
engineer?
This binary has a lot of security flaws that were left in on purpose, and many things could be
improved:
• Junk code without pushad/popad.
• P-code junk code to really drive people tracing the protection nuts: 80% of useless
instructions would definitely drive anyone mad.
• The encryption has a few weaknesses that were presented in this document, mainly the static
encryption algorithm and the static size of every layer.
• Random constants in the first value of the address tables used by the layers.
• Better BPX detection; it could be greatly improved.
• The SEH could be used to initialise/decrypt part of the code in order to make sure the SEH blocks won't
be nopped out.
• Random constants and code for the timing detection would make it harder to bypass with
scripts.
• The protection was generated by my own tools and is a bit repetitive. More variation
would make automatic removal harder.
• The Virtual Machine handlers are fairly simple. More code obfuscation (code flow and logic)
could be used.
• More instructions in the Virtual Machine would have made it longer to analyse.
• Complex opcode encoding would make it quite challenging to reverse engineer.
• Use of cryptography rather than a simple algorithm to check the serial number.
• More layers, and different ones. There are a lot of ways to stop ring 3 debuggers that could
have been used to stop anyone trying to debug it.
• Import protection to make sure no one knows which API functions are used until he meets them in
the code.
• Emulation macros to emulate simple x86 instructions. With those macros replacing simple
instructions, the code would be a lot harder to analyse; one instruction would become an equivalent
block of 20 instructions, for instance.
9. Conclusion
Anti-reverse-engineering techniques can be used to really slow down the analysis of a binary.
Malware could be using such techniques in the near future and it is time to get used to it. Even though
most malware is programmed by clueless idiots without any programming skill, there is a
minority able to write complex code. In the future we could find exploit binaries on compromised
systems that are protected against reverse engineering to hide the vulnerability exploited.
Spyware could also use such techniques to hide its activity. This binary had a lot of vulnerabilities,
yet it was really challenging, even with a trivial password protection algorithm. The protection was
written within a week (a few hours per day), so with a little more effort it could be a LOT
harder.
Finally, I would like to point out that reverse engineering isn't a pirate technique and that it is
used by the security community on a daily basis. Some people in France don't seem to agree,
though..
Acknowledgements
I would like to thank the following people:
• The Honeynet Project and Lance Spitzner who allowed me to create the challenge.
• The authors of the submissions for taking the time to look at my binary and write a complete
report. Thank you.
• People at Datarescue for their Excellent tool IDA Pro used extensively while writing this
binary.
• You for reading this document
About the Author
Nicolas Brulez
Chief of Security for Digital River working on the SoftwarePassport/Armadillo protection system, Nicolas
specializes in anti-reverse engineering techniques to defend against software attacks. He has been active in
researching viral threats and sharing that research with various anti-virus companies. He regularly writes for the
French security magazine MISC and has authored a number of papers on reverse engineering. He currently
teaches assembly programming and reverse engineering in French engineering schools.
The author has more than 7 years of Reverse Engineering Experience on Windows Operating Systems and is
currently doing Research on Pocket PC devices. He plans to write a Protection system for those devices.
2005-4-3 12:03
10
Ugh, it's all in English. I can't read it.
2005-4-3 12:24
11
I can't read English.
2005-4-3 12:37
12
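Heh, this English article looks pretty good.

I'm still reading it.

The techniques I wrote about above combine two papers published with the ACM, one in 2003 and one in 2004.

This English article is from 2005, so let me see whether it has any weaknesses. Heh.

Thanks.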
2005-4-3 13:03
13
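The weaknesses are all covered in it. Every anti-debugging trick has at least one countermeasure, so you can stop saying things like "how much longer can cracking survive".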
2005-4-3 13:16
14
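Heh, I've finished reading the code obfuscation section.

First, about the code obfuscation technique discussed above.
"Junk code": this is the only technique he discusses.
Look at how he does the obfuscation:
The junk code are long blocks of code that does nothing but fancy operations to disturb the analyst , especially when he choose to do a static analysis of the binary. The junk code starts with a pushad (save all registers states onto the stack) and finish with a popad (restore register states).
  Heh, how funny! This hardly even counts as a code obfuscation technique. Junk-instruction insertion means adding an incomplete instruction in more than 70% of the code, not inserting a large block of code! It is just the first few bytes of a single instruction!
  The purpose of adding them is to confuse static disassembly, so that you simply cannot disassemble the code correctly. I haven't covered dynamic disassembly here because I haven't studied it that deeply yet.
  In 2004 someone even wrote a PhD thesis on this junk instruction problem! Heh.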
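The difference between the two kinds of junk is easier to see on a toy example. The Python sketch below uses a made-up four-opcode instruction set (an assumption for illustration, not real x86 decoding) and places a single junk byte, the first byte of an incomplete 5-byte "or", right after a jump that skips it. A linear sweep swallows the real code that follows; a simplified traversal that follows the jump does not. Only unconditional jumps are followed here, whereas a real recursive traversal also explores both sides of conditional branches.

```python
# Toy instruction set (illustrative assumption, not real x86):
#   0x90 nop (1 byte), 0xC3 ret (1 byte),
#   0xEB rel8 jmp (2 bytes), 0x81 "or" with 4 operand bytes (5 bytes)
LENGTHS = {0x90: 1, 0xC3: 1, 0xEB: 2, 0x81: 5}
NAMES = {0x90: "nop", 0xC3: "ret", 0xEB: "jmp", 0x81: "or"}


def linear_sweep(code: bytes) -> list:
    out, ip = [], 0
    while ip < len(code):
        op = code[ip]
        out.append((ip, NAMES.get(op, "db")))
        ip += LENGTHS.get(op, 1)         # decode whatever comes next, blindly
    return out


def follow_flow(code: bytes, entry: int = 0) -> list:
    out, ip, seen = [], entry, set()
    while ip < len(code) and ip not in seen:
        seen.add(ip)
        op = code[ip]
        out.append((ip, NAMES.get(op, "db")))
        if op == 0xC3:                   # ret ends this path
            break
        if op == 0xEB:                   # follow the jump target
            rel = int.from_bytes(code[ip + 1:ip + 2], "little", signed=True)
            ip = ip + 2 + rel
        else:
            ip += LENGTHS.get(op, 1)
    return out


# jmp over one junk byte (0x81), then the real code: nop, nop, ret.
code = bytes([0xEB, 0x01, 0x81, 0x90, 0x90, 0xC3, 0x90, 0x90])
print(linear_sweep(code))
print(follow_flow(code))
```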

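I'm willing to run defence-and-attack experiments based on junk instructions with everyone.
However, it may take me a month to write a decent junk-instruction inserter. Please come and support it then! Thanks.
I have to finish my graduation thesis. Thanks again.

Besides junk instructions, the author of the English article also mentions virtual machines. These two approaches are, in theory, ways to resist reverse engineering.

To restate:
I didn't write this to pour cold water on everyone (I'm a supporter and practitioner of cracking myself). My goal is just for everyone to stay alert and keep improving their skills! Protection and attack have always advanced together!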
2005-4-3 14:01
15
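What the original poster's article describes is nothing new,
and its last sentence, "are you still that confident?", feels a bit like a challenge to others.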
2005-4-3 14:01
16
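Heh, my goal is to write my graduation thesis.
So I really need people to come and refute my views and give evidence; that's how I can improve quickly!

Also, code obfuscation is indeed nothing new. The most-studied area right now is decompilation-level obfuscation for Java and .NET. The first level (disassembly), being closely tied to practice, is still at a low level of research!

Let me first report on the state of progress.

1997: Collberg proposed the concept of OE.
1998: Collberg gave an obfuscation algorithm based on splitting functions into blocks.
2000: Chenxi Wang (haha, so happy, a Chinese researcher! I don't know the Chinese name, though) proposed a new function-block-based obfuscation algorithm in a PhD thesis.
2002: Gregory analysed and improved all the then-current code obfuscation techniques in a PhD thesis.
2003: Linn combined and improved the various published obfuscation algorithms against static analysis, and also ran experiments; I have already told you the results.
2004: Someone gave a formal description of code obfuscation in his PhD thesis and improved the obfuscation algorithms for Java bytecode. I'll omit his name and school; it's one of those supposedly very impressive schools anyway.
2004: Java people proposed some principles for resisting high-level-language decompilation.
2004: Andrew Roach (when I first saw the name I thought it was Andrew C. Yao! Scared me to death.) surveyed and analysed the known obfuscation algorithms.

My goal now is how to resist pure static analysis, how to make dynamic analysis an NP-hard problem, and how to make recovering the original code a fantasy. Heh, solving just one of them is enough to graduate!

Your advice is welcome. Thanks.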
2005-4-3 17:35
17
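Originally posted by lotusroots
Heh, I've finished reading the code obfuscation section.
...
I'm willing to run defence-and-attack experiments based on junk instructions with everyone.
However, it may take me a month to write a decent junk-instruction inserter. Please come and support it then! Thanks........


What is really effective is still dynamic tracing, so the focus of our protection should be on defending against dynamic tracing, not static analysis. Static instruction interference can only ever be a minor nuisance, an auxiliary measure.

The massive insertion of junk instructions throughout the code that you mention is effective against static analysis, but at the cost of larger code size and even lower efficiency.

If you only insert junk instructions around the critical code to hinder tracing, you have to be even more careful: done badly, it tips off the cracker that "the critical code is about to execute here". Wouldn't that do more harm than good?

So I think the focus of protection should ultimately be on countering dynamic tracing.

General protection tips:
1) Check whether critical code takes too long to execute: this can detect breakpoints or execution inside a virtual machine. But don't reveal the result casually; keep a straight face.
2) Against SoftICE breakpoints set with the hardware debug registers: use dynamic, random addresses for sensitive data, so that each run allocates different addresses; move the data several times; exchange data through shared areas to add noise; add useless accesses to force breakpoints to fire frequently.
3) Protect the way the fingerprint is read. For example, a parallel-port dongle must guard against interception such as bpio 378. Using DMA transfers was a trick of my own (I no longer use it).
4) Don't let system calls leak secrets: the operating system generally prevents user-level code from doing very low-level operations, so hardware or file access usually goes through OS APIs. These are easily intercepted, revealing which resources you read. Try to avoid this, for instance by using privileged code; otherwise you can only compensate in the steps that follow the read.
5) The golden rule: never readily reveal that you have detected cracking activity.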
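Tip 1, combined with tip 5, can be sketched in a few lines of Python. This is a generic illustration of a timing check, not code from any protector: the function name, the time budget and the injectable clock are all assumptions, and real protections typically read a cycle counter such as RDTSC rather than a wall clock.

```python
import time
from typing import Callable


def ran_too_slowly(critical: Callable[[], None],
                   budget_seconds: float,
                   clock: Callable[[], float] = time.perf_counter) -> bool:
    """Return True if the critical code exceeded its time budget."""
    start = clock()
    critical()
    return (clock() - start) > budget_seconds


# Following tip 5, the result is only recorded; nothing visible happens here.
suspicious = ran_too_slowly(lambda: sum(range(1000)), budget_seconds=1.0)
```

Making the clock a parameter keeps the check testable with a fake clock.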
2005-4-3 19:10
18
I don't care about obfuscation or polymorphism; none of it has any effect on my patch-cracking approach. ;)
2005-4-3 19:51
19
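Heh, software protection is about protecting algorithms and code, not about your patching.

With a single serial number, there's no need to crack anything; everyone can just use it.

The purpose of code protection is to keep code from being modified, to keep the secrets inside (including algorithms) from leaking, and to keep the generated data from being maliciously altered.

I already have a few ideas.
Hopefully I can make progress.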
2005-4-3 20:52
20
Good stuff, thanks! I support this!
2005-4-3 21:00
21
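Originally posted by lotusroots
Heh, software protection is about protecting algorithms and code, not about your patching.

With a single serial number, there's no need to crack anything; everyone can just use it.


........


Just passing by. I've recently been looking at .NET obfuscators, which apply quite a few of the scary-sounding obfuscation methods you mention, the kind one can show off in a paper. In the end, though, protection is fragile against dynamic analysis, just as freeeman said earlier. The important thing is still to put effort into preventing dynamic analysis. Static obfuscation does provide some protection, but for cracking, more often we only need to find that one critical jump (broadly speaking) :). For keeping certain key techniques from being stolen, that is, for preventing reverse engineering, it is still of some use.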
2005-4-3 21:56
22
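Originally posted by 菩提!


Just passing by. I've recently been looking at .NET obfuscators, which apply quite a few of the scary-sounding obfuscation methods you mention, the kind one can show off in a paper. In the end, though, protection is fragile against dynamic analysis, just as freeeman said earlier. The important thing is still to put effort into preventing dynamic analysis. Static obfuscation does provide some protection, but for cracking, more often we only need to find that one critical jump (broadly speaking) :). For keeping certain key techniques from being stolen, that is, for preventing reverse engineering, it is still of some use.


I don't agree, because I have seen deobfuscators. They can make unreadable code even more readable than the source code.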
2005-4-4 20:36
23
Cracking? It's only a matter of time.
2005-4-4 23:55
24
A real eye-opener. Obstacles are what drive progress, after all.
2005-4-5 00:28
25
That certainly is long!
2005-4-5 08:31