The latest available version can be downloaded here. Don't expect too much; use v1.10 for serious work. Don't send me requests for features already available in v1.10. Please report only serious bugs and crashes. And, of course, new ideas are always welcome!
May 24, 2008
Internal emulation of simple commands (Options|Run trace|Allow fast command emulation) has made run trace and hit trace about 15 (fifteen!) times faster. On my Athlon 4000+, standard run trace executes 35000 commands per second. With emulation on, OllyDbg traces 500000 commands per second! For simple programs, this may be close to real-time execution - in step-by-step mode, with full logging.
Emulation covers only a small subset of 80x86 commands - moves, PUSH/POP, arithmetic and boolean operations, comparisons, shifts, jumps, calls, returns and LEAs. No multiplications, prefixes, loops or string operations, no FPU or MMX; still, OllyDbg passes less than two percent of commands to the application.
Run trace is frequently used together with a run trace condition, like: "stop trace when EAX==0x123456". Up to now, the interpreter parsed the conditional expression on each step. However, this was too slow for the accelerated trace. Now I compile expressions to simple pseudocode and use a very quick interpreter to evaluate the condition. As a result, the above comparison is processed in only 130 nanoseconds. Not bad!
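The compile-once, evaluate-on-every-step idea can be sketched roughly as follows. This is only an illustration, not OllyDbg's actual pseudocode: the opcode names, the pcode_t structure and the functions compile_condition() and eval_condition() are all hypothetical, and the toy compiler accepts only conditions of the form REG==CONST or REG!=CONST.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pseudocode opcodes (an assumption made for this sketch). */
enum { OP_PUSHREG, OP_PUSHCONST, OP_EQ, OP_NE, OP_END };

typedef struct { int op; uint32_t arg; } pcode_t;

static const char *const regnames[8] =
    { "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI" };

/* Compile "REG==CONST" or "REG!=CONST" into 4 pseudocode cells.
   Called once, before the trace starts. Returns 0 on success, -1 on error. */
int compile_condition(const char *text, pcode_t out[4]) {
    int r, cmp;
    char *end;
    unsigned long c;
    for (r = 0; r < 8; r++)
        if (strncmp(text, regnames[r], 3) == 0) break;
    if (r == 8) return -1;                 /* unknown register name */
    text += 3;
    if (strncmp(text, "==", 2) == 0) cmp = OP_EQ;
    else if (strncmp(text, "!=", 2) == 0) cmp = OP_NE;
    else return -1;                        /* unsupported operator */
    text += 2;
    c = strtoul(text, &end, 0);            /* base 0 accepts the 0x prefix */
    if (end == text || *end != '\0') return -1;
    out[0] = (pcode_t){ OP_PUSHREG, (uint32_t)r };
    out[1] = (pcode_t){ OP_PUSHCONST, (uint32_t)c };
    out[2] = (pcode_t){ cmp, 0 };
    out[3] = (pcode_t){ OP_END, 0 };
    return 0;
}

/* Evaluate compiled pseudocode against the current register values.
   Called on every traced step; no text parsing happens here. */
int eval_condition(const pcode_t *pc, const uint32_t regs[8]) {
    uint32_t stack[8];
    int sp = 0;
    for (;; pc++) {
        switch (pc->op) {
        case OP_PUSHREG:
            assert(sp < 8 && pc->arg < 8);
            stack[sp++] = regs[pc->arg];
            break;
        case OP_PUSHCONST:
            assert(sp < 8);
            stack[sp++] = pc->arg;
            break;
        case OP_EQ:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] = (stack[sp - 1] == stack[sp]);
            break;
        case OP_NE:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] = (stack[sp - 1] != stack[sp]);
            break;
        case OP_END:
            assert(sp == 1);
            return stack[0] != 0;
        default:
            return 0;                      /* malformed program */
        }
    }
}
```

In this sketch the trace loop would call compile_condition() once and then eval_condition() after each command, so the per-step cost is a few array accesses and comparisons.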
Oh yes, and command help now includes the string commands, too.
May 8, 2008
Improved and bug-fixed debugging engine. Help on all 8086 commands, except for string manipulations.
April 19, 2008
Pre-alpha 5: hit trace! Maybe you have tried to use hit trace in version 1.10, only to discover that it doesn't work with non-trivial programs. Hit trace in version 2 is different: instead of replacing all recognized commands with INT3, I set breakpoints dynamically on all non-processed branches. It seems that 20-30 thousand breakpoints are not a problem for the new debugger. Also in this release: just-in-time debugging, command line, several bugfixes. Command help is ready for all non-SSE/non-FPU commands up to LEA.
March 11, 2008
Pre-alpha 4: name lists, search for text strings, floating-point constants and intermodular calls, run trace conditions, syntax highlighting (but default colours are not yet set), pause on thread, names of the arguments on the stack etc. The analysis of large modules is much faster now.
December 25, 2007
Pre-alpha 3: many different features like attaching to running process, detaching, run trace (as yet without fast stepping), real-time stack analysis, recognition of TLS callbacks, guarded memory, intermodular calls etc. Look at the comment column in the list of calls - you will enjoy it!
October 20, 2007
Removed 5 bugs; strongly improved functionality of existing windows; reduced number of false switches
Shame on me! In only one day, I have received 15 bug reports related to the v2.0 pre-alpha code! Most of them concentrate around the protection violation at address 477AC3 (a more or less obvious bug), but there are also other crashes reported. What should I say? Thank you! Without your steady help, OllyDbg 1.10 would never have reached its present quality. Hopefully, in some time the second version will reach at least the same standards... Anyway, in a couple of weeks there will be an update here. And - thank you again! Please keep it this way!
The child is big enough to show it to the public, so download this and have a look. Is this version functional? Yes. Is it better than 1.10? Definitely not. Is it better than v1.00? In some aspects - maybe, but in general - no. Can you use it for debugging? Yes, but you will miss many, many features... So please don't be too critical and send me no emails - this version is not even a full-featured alpha, and will change dramatically in the next several weeks or months. But if OllyDbg crashes and generates errorlog.txt, be kind and do send me this file - I will need it for debugging. And now - enjoy!
Now OllyDbg 2 can save analysis data to the .udd files. Compared to the previous version, they are very big - two to three times larger, mainly due to the register predictions. For almost every command I keep ESP and EBP relative to the entry point. Many modern compilers don't use standard stack frames; instead, they address all arguments and local data over ESP. Predictions make it possible to decode the meanings of ESP-related offsets. They are also very helpful when tracing the call stack.
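To illustrate how an ESP prediction turns an ESP-relative offset into something meaningful, here is a minimal sketch. It is not the analyser's real code: the command kinds, the cmd_t structure and the functions predict_esp() and arg_index() are hypothetical, and only four stack-changing commands are modelled. It assumes a 32-bit function where, at the entry point, [ESP] holds the return address and [ESP+4] the first argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified command model (an assumption of this sketch). */
typedef enum { CMD_PUSH, CMD_POP, CMD_SUB_ESP, CMD_ADD_ESP, CMD_OTHER } kind_t;

typedef struct {
    kind_t kind;
    int32_t imm;    /* immediate for SUB ESP,imm and ADD ESP,imm; otherwise unused */
} cmd_t;

/* For each command, store the value of ESP before the command executes,
   expressed relative to ESP at the entry point of the function. */
void predict_esp(const cmd_t *cmd, size_t n, int32_t *esp_rel) {
    int32_t esp = 0;
    size_t i;
    assert(cmd != NULL && esp_rel != NULL);
    for (i = 0; i < n; i++) {
        esp_rel[i] = esp;
        switch (cmd[i].kind) {
        case CMD_PUSH:    esp -= 4;          break;
        case CMD_POP:     esp += 4;          break;
        case CMD_SUB_ESP: esp -= cmd[i].imm; break;
        case CMD_ADD_ESP: esp += cmd[i].imm; break;
        default:                             break;  /* ESP unchanged */
        }
    }
}

/* Translate an operand [ESP+disp] into a 1-based argument index.
   Returns 0 if the location is the return address, local data or misaligned. */
int arg_index(int32_t esp_rel, int32_t disp) {
    int32_t off = esp_rel + disp;   /* offset relative to ESP at entry */
    if (off <= 0 || off % 4 != 0) return 0;
    return (int)(off / 4);
}
```

For example, after SUB ESP,10 and one PUSH the predicted ESP is 20 bytes below its entry value, so in this model an access to [ESP+18] refers to the first argument.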
It takes significant time to load such huge amounts of data. The first version took between 0.1 and 0.7 seconds per module. With full analysis of all modules requested (and this will be the default option), startup took several seconds on my Athlon 4000+. Now, after several days of deep optimizations, this time has become three times shorter.
The progress in the last two weeks is enormous. The List of Extremely Important Things To Do got five items shorter. But news of this kind is roughly equivalent to the summer headlines in the newspapers; now I want to tell you something different.
While testing MinGW compiler, I wrote a small console application:
/* MessageBox() and MB_OK are declared in windows.h */
#include <windows.h>

int main() {
  MessageBox(NULL,"I'm a little, little code in a big, big world... Hello, world!",
    "Hello, world",MB_OK);
  return 0;
}
The highly optimized release version of this code looks like this:
MinGW reserves space on the stack and moves arguments instead of pushing. But note the following: The order of arguments for MessageBox() is hOwner, Text, Caption, Type. MinGW has changed this order; still, OllyDbg 2 was able to recognize the arguments.
MinGW (in fact, GNU) is an excellent compiler; its only weakness is that many exotic APIs are not yet in the headers.
What is the birthday for a program? The day when it becomes useful for the first time. Today it happened. I have finished the debugging engine.
Well, not really. There are no memory or hardware breakpoints yet. OllyDbg can only set single-step traps and INT3 breakpoints, and run trace is not yet implemented. But all this is unimportant. I can step in and over, set conditions and log results; in brief - OllyDbg 2.0 has become a DEBUGGER. On such a happy occasion, everybody wants to have a look at the newborn - here is a full-size picture:
The baby is almost new and can't take a walk, and it is very weak. This means you must wait a bit longer until the doctors allow you to hold it. In my ToDo list, there are more than twenty items of priority AAA+ - Things To Be Done Before One Is Allowed To Even Wonder About Alpha Release, like:
precompiled table of known functions as resource;
recognition of functions that play with return address on the stack (like allocation of huge local data) - important, because a lot of sensible analysis depends on it;
comment operands of assembler commands - currently it's just a stub without intellect;
save data to .udd file;
on-line analysis of stack data;
copy modifications to executable file;
and so on, and so forth. So be patient, as ever :)
Finally I have finished the command search module. Basically, you supply a pattern, like XOR EAX,EAX, and OllyDbg locates all such commands in the memory block. Version 1.xx already featured this, but in a very limited form. For every supplied pattern, the old program created a set of code/mask pairs and compared them with the binary code. This approach is simple and quick but has several drawbacks that strongly limited its usefulness. For example, if a command is expected to have several prefixes, one must create models for every combination. But the main problem was that the code/mask approach was unable to handle memory addresses. The x86 addressing model is extremely complex and inhomogeneous, with many exceptions from the regular pattern. Let's take, for example, MOV EAX,[EBX]. There are 16 (sixteen) different binary encodings:
8B03 - the simplest form
8B43 00 - form without SIB with 1-byte zero displacement
8B83 00000000 - form without SIB with 4-byte displacement
8B0423 - form with SIB byte without scaled index
8B0463 - same
8B04A3 - same
8B04E3 - same
8B4423 00 - SIB byte, 1-byte displacement, no index
8B4463 00 - same
8B44A3 00 - same
8B44E3 00 - same
8B8423 00000000 - SIB byte, 4-byte displacement, no index
8B8463 00000000 - same
8B84A3 00000000 - same
8B84E3 00000000 - same
8B041D 00000000 - SIB byte, 4-byte displacement, scale 1, no base
Amazing, no?.. All attempts to reuse the old concept in the new OllyDbg version were in vain, so I was forced to throw it away. The new model consists of the opcode, a list of prefixes and a packed description of operands. The search routine disassembles executable code and compares the result with the model. Due to the very fast disassembler, this approach is almost as fast as the old one, but unbelievably flexible!
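The decode-then-compare idea can be sketched for the MOV EAX,[EBX] example above. This is only an illustration under stated assumptions, not OllyDbg's search routine: the memop_t structure and the functions decode_modrm() and is_mov_eax_ebx() are hypothetical, prefixes are ignored, and only opcode 8B with 32-bit addressing is handled. The decoder reduces every ModRM/SIB/displacement combination to one normalized operand, so a single comparison covers all sixteen encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical normalized form of a 32-bit memory operand. */
typedef struct {
    int base;       /* register number 0..7, or -1 if absent */
    int index;      /* register number 0..7, or -1 if absent */
    int scale;      /* 1, 2, 4 or 8 */
    int32_t disp;   /* displacement, 0 if absent */
} memop_t;

/* Decode ModRM, optional SIB and displacement at p (n bytes available).
   Returns the number of bytes consumed, or 0 for a register operand
   or a truncated command. */
static size_t decode_modrm(const uint8_t *p, size_t n, int *reg, memop_t *m) {
    size_t i = 0;
    int mod, rm;
    if (n < 1) return 0;
    mod = p[i] >> 6;
    rm = p[i] & 7;
    *reg = (p[i] >> 3) & 7;
    i++;
    if (mod == 3) return 0;                  /* register form, no memory operand */
    m->base = rm; m->index = -1; m->scale = 1; m->disp = 0;
    if (rm == 4) {                           /* SIB byte follows */
        if (i >= n) return 0;
        m->scale = 1 << (p[i] >> 6);
        m->index = (p[i] >> 3) & 7;
        m->base = p[i] & 7;
        i++;
        if (m->index == 4) { m->index = -1; m->scale = 1; }   /* no index */
        if (m->base == 5 && mod == 0) { m->base = -1; mod = 2; } /* no base, disp32 */
    } else if (rm == 5 && mod == 0) {
        m->base = -1; mod = 2;               /* absolute 32-bit address */
    }
    if (mod == 1) {                          /* sign-extended 8-bit displacement */
        if (i + 1 > n) return 0;
        m->disp = (int8_t)p[i];
        i += 1;
    } else if (mod == 2) {                   /* 32-bit little-endian displacement */
        uint32_t d;
        if (i + 4 > n) return 0;
        d = (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8) |
            ((uint32_t)p[i + 2] << 16) | ((uint32_t)p[i + 3] << 24);
        m->disp = (int32_t)d;
        i += 4;
    }
    /* [index*1] without base addresses the same memory as [base] */
    if (m->base < 0 && m->index >= 0 && m->scale == 1) {
        m->base = m->index;
        m->index = -1;
    }
    return i;
}

/* Returns 1 if the bytes at code encode MOV EAX,[EBX] in any form. */
int is_mov_eax_ebx(const uint8_t *code, size_t n) {
    int reg;
    memop_t m;
    assert(code != NULL);
    if (n < 2 || code[0] != 0x8B) return 0;  /* 8B /r = MOV r32,r/m32 */
    if (decode_modrm(code + 1, n - 1, &reg, &m) == 0) return 0;
    return reg == 0 && m.base == 3 && m.index < 0 && m.disp == 0;
}
```

A code/mask table would need a separate entry for each encoding, whereas here the model is a single (opcode, register, operand) triple.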
New search supports more pseudoelements than in the previous version:
R8 - any 8-bit register
R16 - any 16-bit register
R32 - any 32-bit register
REG - any general register (size is not important, assumed R32 in address)
RA,RB - semi-defined 32-bit registers
SEG - any segment register
FPUREG - any floating-point register
MMXREG - any MMX register
SSEREG - any SSE register
CRREG - any CR register
DRREG - any DR register
CONST - any constant
ANY - any operand or memory address (size is not important)
MOV ANY,ANY, for example, matches any MOV command:
MOV [ANY],ANY - all writes to the memory, regardless of size:
Note that the 16-bit address is included in the list. As you probably know, Windows reserves the first 64 K of the process's memory as a trap for NULL pointers, so flat-mode 16-bit access has no chance, with one important exception. Selector FS points to the thread's data block (TDB), which keeps thread-dependent information available to the application. The 16-bit form of an FS-relative access is 1 byte shorter than its 32-bit counterpart (but may take longer to execute due to the additional prefix). By the way, the first doubleword in the TDB is the pointer to the Structured Exception Handling chain that implements try-catch constructs. It's easy to find all SEH chain changes with a single search for MOV [FS:ANY],ANY:
Search for XOR RA,RA finds all commands that zero some register by XORing:
whereas XOR RA,RB - cases where XOR just manipulates bits:
JMP [R32*4+CONST] will find table jumps, LEA RA,[RA*5] - fast multiplications of 32-bit register by 5 (of course, in reality this means [RA*4+RA]), and so on.
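The behaviour of the semi-defined registers RA and RB in the examples above can be sketched as a small binding check. This is a guess at the semantics, not OllyDbg's code: the PAT_* constants and match_regs() are hypothetical, and the sketch assumes that every RA in a pattern must stand for the same register, every RB likewise, and that RA and RB must be different registers.

```c
#include <assert.h>
#include <stddef.h>

/* Pattern elements: 0..7 mean one concrete 32-bit register (EAX=0 ... EDI=7).
   The values below are hypothetical names invented for this sketch. */
enum { PAT_R32 = 8, PAT_RA, PAT_RB };

/* Compare the registers used by a decoded command (regs, in operand order)
   with a pattern of the same length. Returns 1 on match, 0 otherwise. */
int match_regs(const int *pattern, const int *regs, size_t n) {
    int ra = -1, rb = -1;   /* registers bound to RA and RB so far */
    size_t i;
    for (i = 0; i < n; i++) {
        int p = pattern[i], r = regs[i];
        assert(r >= 0 && r < 8);
        if (p >= 0 && p < 8) {             /* concrete register */
            if (p != r) return 0;
        } else if (p == PAT_R32) {         /* any 32-bit register */
            continue;
        } else if (p == PAT_RA) {          /* bind on first use, then compare */
            if (ra < 0) ra = r;
            else if (ra != r) return 0;
        } else if (p == PAT_RB) {
            if (rb < 0) rb = r;
            else if (rb != r) return 0;
        } else {
            return 0;                      /* unknown pattern element */
        }
    }
    /* Assumption: RA and RB denote two different registers. */
    if (ra >= 0 && rb >= 0 && ra == rb) return 0;
    return 1;
}
```

Under these assumptions the pattern {PAT_RA, PAT_RA} accepts XOR EAX,EAX but rejects XOR EAX,EBX, while {PAT_RA, PAT_RB} does the opposite; a three-element pattern of PAT_RA covers the registers of LEA RA,[RA*4+RA].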
Oh, and I'm curious, how useful will you find this feature:
The development of version 2.0 goes steadily forward. In the last three months I have written more than 350 K of debugged code. Backup, search, jumps, history, conditional expressions, watches, Assembler - all the stuff necessary for productive work. And - for the first time, 2.0 has paused on the breakpoint!
Yes, this is a big step. This means that the infrastructure is ... well, not yet completed, but is already so stable that it can support complex high-level functions. When I browse through the sources, I'm full of pride that the code is so well-structured, logical and clear. Unfortunately, this was not the case with 1.10. The initial design had several flaws - in 2000, I had no experience and was unable to foresee the requirements of the final version. Every small modification required significant effort and lengthy testing. So finally I've decided to close the project and rewrite it almost from scratch.
The first steps of any redesign are very hard psychologically. Maybe you've experienced similar problems - you write loads, heaps, piles of code, but your project is almost dead. All it can do is some primitive stuff, like it was in my case - disassembling of several hardcoded binary sequences, dumps of memory blocks at fixed addresses, provisional code and debugging outputs everywhere, and next to this garbage there is your old version, five years of successful development, maybe also full of trash inside but at least functional and with a shiny storefront...
Anyway, I'm past this stage. OllyDbg 2.0 lives, and it is plenty of fun to develop again. You've waited for so long - so be patient, please, and sooner or later I'll introduce you to my promising younger son :)
Almost two years have gone by since the last update of this page. But you haven't forgotten me. The counter has crossed the magic limit of 1,000,000 impressions. So I feel a bit ashamed and will now try to make up for your patience. Starting from now, every two or three weeks I will inform you here about the current state of my work.
I'm frequently asked: "What happened to OllyDbg 2.0? Why is it not here?" Well, it is mostly my inherent laziness and, to a lesser extent, lots of other tasks and projects that have stopped the development of the second version. Nevertheless, it is not dead. In the last month I wrote more than 100 K of code, and now want to show you some highlights of the future version, mainly its new powerful analyser.
Despite highly complex features, like full code prediction, new version is significantly faster than its predecessor. But speed does not influence the quality of recognition. See, for example, how many calls were decoded by old OllyDbg in a large 3-MB application:
and by new:
Impressive, isn't it? Note that list of known functions in v2.0 currently includes only three system DLLs.
New version has strongly improved prediction of registers (especially ESP) and stack contents:
is able to recognize and decode register variables:
functions with variable number of arguments, like formats:
and cases when parameters are copied, rather than pushed, to the stack:
It determines loop variables, i.e. registers or memory items that change by the same amount on each loop iteration:
To help the user, it can even rename and change the decoding of arguments in some argument-dependent cases:
The new Analyser also distinguishes more reliably between code and data. All in all, when OllyDbg is ready, it will make debugging easier and more understandable... I hope.