A Complete Breakdown of PDA Registers
2004-10-29 19:02 6614


nbw
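This article covers the structure and usage of the chips found in PDAs, as well as how to reverse engineer code for them. I have no use for it myself, but it reads very well.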

Disassembling and the Analysis of ARM Processors
  

RISC processors are used in many small devices such as PDAs, mobile phones and smart coffee machines. There are many RISC instruction sets, each with its own assembly language, but the most widespread one today is ARM.

I will focus on ARM7, since that is the family I have worked with.

  

Let's begin with the ARM architecture. An ARM processor has a total of 37 registers: 31 general-purpose 32-bit registers and 6 status registers. Which of them are visible at any moment depends on the processor state (and on the operating mode). In ARM state the processor executes 32-bit instructions; in Thumb state, 16-bit ones.

  

In ARM state 18 registers are available: the directly accessible R0-R15, the CPSR (Current Program Status Register) and the SPSR (Saved Program Status Register). Three of the directly accessible registers have special roles:

  

(R13) SP - stack pointer

(R14) LR - link register, a special register that holds the return address when a procedure is called. In other words, the call itself does not push the return address onto the stack; it simply sits in the register.

(R15) PC - program counter, which points into the code being executed. It can be written with an ordinary MOV, which changes the address of the next instruction to be executed.

  

In Thumb state 13 registers are available: R0-R7, R13-R15, CPSR, SPSR.

  

Switching between the states does not change the contents of the registers.

  

Entry into Thumb state can be achieved by executing a BX instruction with the state bit (bit 0) set in the operand register. Entry into ARM state can be achieved by executing a BX instruction with the state bit (bit 0) clear in the operand register.
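
The state-bit rule can be modelled in a few lines. This is only an illustrative sketch in Python, not code from the article: the function name bx_target is made up, and for ARM-state targets the address is assumed to be word aligned already rather than checked.

```python
from typing import Tuple


def bx_target(operand: int) -> Tuple[str, int]:
    """Return (state, address) that a BX to `operand` would continue at."""
    if operand & 1:
        # Bit 0 set: continue in Thumb state; bit 0 is not part of the address.
        return "Thumb", operand & ~1
    # Bit 0 clear: continue in ARM state (word alignment assumed, not checked).
    return "ARM", operand


for value in (0x8001, 0x8000):
    state, address = bx_target(value)
    print(hex(value), "->", state, hex(address))
```

When reading a disassembly, this is why an odd value loaded into a register just before BX is a strong hint that the target is Thumb code.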

  

The instruction sets of the two states differ, though many instructions are similar. Thumb instructions are 2 bytes long, ARM instructions 4 bytes. A description of the Thumb and ARM instruction sets is available here: http://www.atmel.com/dyn/resources/prod_documents/doc0673.pdf

  

What is especially interesting is that many instructions work with several registers at once. For example:

  

ADD     R3, SP, #4

  

That maps to

  

R3:=SP+4

  

Or, for example, an instruction that stores registers on the stack:

  

PUSH {R2-R4, R7, LR}

  

This is not an analogue of pushad in x86 assembly: in ARM assembly you choose exactly which registers to push by listing them in the instruction.
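
As a rough model of what such a PUSH does, the Python sketch below stores the listed registers in a dictionary standing in for memory. The helper name push, the starting SP and the register values are all invented for illustration; the sketch relies on the ARM rule that the lowest-numbered register ends up at the lowest address.

```python
REG_NUMBER = {"R0": 0, "R1": 1, "R2": 2, "R3": 3,
              "R4": 4, "R5": 5, "R6": 6, "R7": 7, "LR": 14}


def push(regs, reg_list, sp, memory):
    """Store the listed registers below `sp` and return the new stack pointer."""
    ordered = sorted(reg_list, key=REG_NUMBER.get)  # lowest register number first
    sp -= 4 * len(ordered)                          # full descending stack
    for i, name in enumerate(ordered):
        memory[sp + 4 * i] = regs[name]
    return sp


# Hypothetical register contents, chosen only to make the layout visible.
regs = {"R2": 0x22, "R3": 0x33, "R4": 0x44, "R7": 0x77, "LR": 0x1F4EC}
memory = {}
new_sp = push(regs, ["R2", "R3", "R4", "R7", "LR"], 0x1000, memory)
print(hex(new_sp), {hex(a): hex(v) for a, v in sorted(memory.items())})
```

Five registers take 20 bytes, so SP moves down by 0x14 and LR lands in the highest slot.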

  

Data in memory can be either little-endian (as on Intel) or big-endian (as on Motorola), so when analyzing code you first have to determine the byte order in use.
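
A quick way to see the difference is to decode the same four bytes both ways. The bytes below are hypothetical (they are not taken from a real dump); they were picked so that the little-endian reading gives 0x974170, an address that appears later in the article.

```python
import struct

raw = bytes([0x70, 0x41, 0x97, 0x00])  # hypothetical bytes from an image

little = struct.unpack("<I", raw)[0]   # interpret as a little-endian word
big = struct.unpack(">I", raw)[0]      # interpret as a big-endian word

print(hex(little))
print(hex(big))
```

Only one of the two readings will look like a plausible address or constant for the device, which is usually enough to decide the byte order.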

  

There are plenty of compilers and tools for developing ARM programs:

http://heanet.dl.sourceforge.net/sourceforge/gnude/gnude-arm-win.exe - the GNU compiler, with all that implies: everything is done from the command line, and debugging goes through gdb.

http://www.goldroad.co.uk/grARM.html - a simple, no-frills ARM assembler.

http://www.arm.com/support/downloads/index.html - the official ARM development tools. These can only be purchased.

http://www.iar.com/ - an alternative to IDA for ARM. A 30-day trial version is offered.

  

Features of the ARM assembly generated by C++ compilers

  

Naturally, when analyzing firmware you are dealing not with hand-written assembly but with code generated by a C++ compiler for ARM, which comes as a surprise to anyone used to x86 assembly.

  

Function calls

  

There is no choice of calling convention (cdecl, stdcall and so on) at all! Every function uses a single convention similar to Borland's fastcall: arguments go into registers first (R0-R3), and once those run out the remaining ones are passed on the stack.

  

For example:

  

ROM:0001F4E2 MOV R0, SP

ROM:0001F4E4 MOV R2, #6

ROM:0001F4E6 ADD R1, R4, #0

ROM:0001F4E8 BL memcmp

  

The order of the parameters follows the register numbers: R0 is the first, R1 the second, R2 the third. That is, for

  

int memcmp (

   const void *buf1,

   const void *buf2,

   size_t count

);

  

buf1 = R0

buf2 = R1

count = R2

  

The value returned by the function comes back in R0:

  

ROM:0001F4E2 MOV R0, SP

ROM:0001F4E4 MOV R2, #6

ROM:0001F4E6 ADD R1, R4, #0

ROM:0001F4E8 BL memcmp

ROM:0001F4EC CMP R0, #0

ROM:0001F4EE BNE loc_1F4F4

  

Here is a call that also passes an argument on the stack:

  

ROM:000BCDEC MOV R2, #0

ROM:000BCDEE STR R2, [SP]

ROM:000BCDF0 MOV R2, #128

ROM:000BCDF2 MOV R3, #128

ROM:000BCDF4 MOV R1, #14

ROM:000BCDF6 MOV R0, #0

ROM:000BCDF8 BL FillBoxColor

  

So R0-R3 hold the coordinates, and the fifth parameter (the color) is stored on the stack.
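
The placement rule can be written down as a small Python sketch. It is a simplification under a stated assumption: every argument fits in one 32-bit word (wider types and structures are ignored), and the helper name place_arguments is made up for this example.

```python
ARG_REGISTERS = ("R0", "R1", "R2", "R3")


def place_arguments(args):
    """Split word-sized arguments into register slots and stack slots."""
    in_registers = dict(zip(ARG_REGISTERS, args))   # first four arguments
    on_stack = list(args[len(ARG_REGISTERS):])      # the rest, starting at [SP]
    return in_registers, on_stack


# Argument values as set up before the FillBoxColor call above.
registers, stack = place_arguments([0, 14, 128, 128, 0])
print(registers)
print(stack)
```

For the FillBoxColor call this gives R0=0, R1=14, R2=128, R3=128 and a single stack slot holding 0, which matches the STR R2, [SP] in the listing.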

  

The number of arguments can only be determined by analysis, that is, by examining both the call sites and the function's prologue. Part of the picture comes from which registers the function saves on the stack at entry. For example, in Thumb state the processor works mainly with R0-R7 and the special-purpose registers. So, on seeing a function that begins with

  

ROM:00059ADA getTextBounds                          

ROM:00059ADA PUSH {R4-R7, LR}

  

you can assume that it receives its arguments in R0, R1, R2, R3 and on the stack, since R4-R7 are saved for the function's own use. Looking further at a call site:

  

ROM:0005924E ADD R0, SP, #0x14

ROM:00059250 ADD R1, SP, #0x6C

ROM:00059252 ADD R2, SP, #0x68

ROM:00059254 ADD R3, SP, #0x64

ROM:00059256 BL getTextBounds

  

we see that only R0-R3 are set up. That means 4 parameters are being passed.

  

Jumps

  

As usual, jumps can be conditional or unconditional, and they can be relative or through a register. Register jumps are often used to switch between Thumb and ARM state. Short unconditional jumps are implemented with the B (branch) instruction, long ones with the register jump BX (branch with exchange). Function calls are made with BL (branch with link), that is, a jump that stores the return address in LR. It is also possible to change the execution address by writing to the PC register:

  

ADD PC, #0x64

  

But C compilers usually do not generate such code. They write to PC only for multi-way branches (switch statements).

  

Branches

  

Also known as switch statements. They are implemented in a rather unusual way:

  

ROM:0027806E CMP R2, #0x4D ; 'M'

ROM:00278070 BCS loc_27807A

ROM:00278072 ADR R3, word_27807C

ROM:00278074 ADD R3, R3, R2

ROM:00278076 LDRH R3, [R3, R2]

ROM:00278078 ADD PC, R3

ROM:0027807A

ROM:0027807A loc_27807A                              

ROM:0027807A B loc_278766

ROM:0027807C word_27807C DCW 0xAA, 0xBE, 0xC6, 0x180, 0x186; 0

ROM:0027807C DCW 0x190, 0x1A0, 0x1A8, 0x1DE, 0x1E4; 5

ROM:0027807C DCW 0x1B0, 0x212, 0x276, 0x1FE, 0x294; 10

  

First the case number is checked: it must be less than 0x4D. If it is 0x4D or higher, execution goes to the default case, that is, to loc_27807A.

  

Next, the address of the jump table word_27807C is taken. The table holds offsets, not target addresses! The offset for the case index is then read from the table and added to PC. So for case 0 the jump goes to the address

  

0x278078 (address of the ADD PC, R3 instruction) + 0xAA (offset from the table) + 0x4 (!!!) = 0x278126.

  

We have to add 4 because of how ARM processors expose PC: in Thumb state an instruction that reads PC sees the address of the current instruction plus 4 (in ARM state, plus 8), a consequence of instruction prefetch.
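
The target calculation can be reproduced with a short Python sketch. The offsets are the fifteen entries shown in the listing; the real table has 0x4D entries, so case numbers beyond the ones listed here raise an IndexError. The function and constant names are invented for this example.

```python
# First 15 entries of word_27807C as shown in the listing (the table is longer).
OFFSETS = [0xAA, 0xBE, 0xC6, 0x180, 0x186,
           0x190, 0x1A0, 0x1A8, 0x1DE, 0x1E4,
           0x1B0, 0x212, 0x276, 0x1FE, 0x294]

ADD_PC_ADDRESS = 0x278078   # address of "ADD PC, R3"
DEFAULT_TARGET = 0x278766   # where loc_27807A branches to
CASE_LIMIT = 0x4D           # value compared against R2


def switch_target(case):
    """Return the address reached for a given case number."""
    if case >= CASE_LIMIT:
        return DEFAULT_TARGET                 # BCS taken: default case
    pc_value = ADD_PC_ADDRESS + 4             # PC reads 4 bytes ahead in Thumb state
    return pc_value + OFFSETS[case]


print(hex(switch_target(0)))
print(hex(switch_target(0x4D)))
```

Note that ADD_PC_ADDRESS + 4 equals 0x27807C, the address of the table itself, so each entry is effectively an offset from the start of the table.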

  

Memory access

  

In Thumb state an instruction cannot hold a full 32-bit address; it can only reach memory a short distance from PC (the PC-relative LDR encodes an 8-bit word offset, up to 1020 bytes forward). So memory is accessed not directly but through a register: you cannot address 0x974170 directly, but you can load that address into a register first. For example:

  

ROM:00277FF6 LDR R0, =unk_974170

ROM:00277FF8 LDR R0, [R0]

  

We now have the value stored at address 0x974170. But that is not the whole story! The address of the variable (0x974170) is itself stored nearby, within reach of the load instruction:

  

ROM:00278044 off_278044 DCD unk_974170

  

In other words, the opcode of the first LDR does not contain the address itself but the offset, relative to the current address, of the word that holds it.
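
To make this concrete, here is a Python sketch that decodes a Thumb PC-relative LDR and computes where its literal lives. The opcode value 0x4813 does not appear in the article; it was worked out for this illustration so that the result lands on off_278044, and the function name is hypothetical. The sketch assumes the classic Thumb encoding (top five bits 01001, then a 3-bit register and an 8-bit word offset).

```python
def thumb_ldr_literal(instr_address, opcode):
    """Decode 'LDR Rd, [PC, #imm]' and return (Rd, address of the literal)."""
    if opcode >> 11 != 0b01001:
        raise ValueError("not a Thumb PC-relative LDR")
    rd = (opcode >> 8) & 0x7
    imm8 = opcode & 0xFF
    base = (instr_address + 4) & ~3   # PC reads 4 ahead, rounded down to a word
    return rd, base + imm8 * 4        # the offset is counted in words


rd, literal = thumb_ldr_literal(0x277FF6, 0x4813)
print("R%d <- [%s]" % (rd, hex(literal)))
```

A disassembler does the same calculation and then shows the word found at that address as the =unk_974170 operand.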

  

There is a neat optimization trick: if an address can be derived from another one already used in the current function, it may be obtained by arithmetic or indirect access. For example, if a function uses one variable at address 0x100000 and another at 0x100150, the compiler can either load two separate addresses or generate the following code:

  

LDR R0, =0x100000

ADD R0, #0xFF

ADD R0, #0x51

LDR R0, [R0]

  

In x86 this would be read as a reference to a substructure inside another structure, but here it is an ordinary optimization. What for? To minimize memory accesses: arithmetic is faster than loading data. In fact, ARM assembly code is full of register calculations of all kinds. The 16 registers exist precisely for this, to touch memory and the stack less often. For the same reason, stack variables appear only in very large functions. Working with the stack is otherwise no different from x86.
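
The two ADD instructions in the 0x100150 example follow from the 8-bit immediate of the Thumb ADD: a distance of 0x150 does not fit in one instruction. The Python sketch below shows one greedy way to split such a distance; it is only an illustration with a made-up function name, and a real compiler may choose a different split.

```python
def split_immediate(offset, max_immediate=0xFF):
    """Break a positive offset into chunks that each fit an 8-bit immediate."""
    parts = []
    while offset > 0:
        step = min(offset, max_immediate)
        parts.append(step)
        offset -= step
    return parts


parts = split_immediate(0x100150 - 0x100000)
print([hex(p) for p in parts])
```

For 0x150 this yields 0xFF and 0x51, the same pair of immediates as in the listing.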

  

Code analysis in IDA

ARM binary images have to be loaded as plain binary files, since they have no unified structure. When loading, you must specify the processor type. If the processor the code was written for is not in the list of processor modules, load the image with the generic type ARM (little-endian) or ARMB (big-endian). Next, create the ROM and RAM segments. There is no unified approach here; it depends on the image and on the architecture of the particular ARM processor. For example, for ARM7 the memory map looks roughly like this:

  

0x0 - 0x8000 - internal RAM of the processor

0x8000 - 0x1000000 - ROM

0x1000000 - 0x..... - SRAM (the upper bound depends on how much of it the device has)
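
When sorting out cross-references by hand it can help to classify addresses against this map. The Python sketch below does that under two assumptions that are not in the article: each range is treated as half-open (the start belongs to the region, the end does not), and since the SRAM size is device specific, everything from 0x1000000 upward is simply labelled SRAM. The function name is hypothetical.

```python
def segment_for(address):
    """Name the region of the example ARM7 memory map an address falls in."""
    if address < 0:
        raise ValueError("address must be non-negative")
    if address < 0x8000:
        return "RAM"
    if address < 0x1000000:
        return "ROM"
    return "SRAM"   # upper bound unknown: depends on the device


for address in (0x0, 0x8000, 0x1F4E2, 0x1000000):
    print(hex(address), segment_for(address))
```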

  

Now we can start analyzing the code. In many devices (mobile phones in particular) the entry point of the firmware is 0x8000. The processor starts in ARM state, so the code at address 0x8000 is ARM code. IDA's processor module is rather primitive, and when it tries to analyze such state switches it very often converts large stretches of Thumb code to ARM (and vice versa). To switch the state of the code manually, press ALT-G and enter 0 in the Value field for ARM state or 1 for Thumb.

Latest replies (6)
linhanshi 2004-11-13 14:27
Support!!!
曾半仙 2004-11-15 14:03
Dropping by to show my support.
Um... maybe change the title, though.
Otherwise people who see it may get a poor impression of Kanxue.
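nbw 2004-11-15 23:50
Originally posted by 曾半仙:
Dropping by to show my support.
Um... maybe change the title, though.
Otherwise people who see it may get a poor impression of Kanxue.

Why change the title? Does PDA have some other meaning?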
evanjht 2004-12-2 14:03
Not bad at all, showing my support :D
DamnYa 2004-12-18 17:01
Why not just call it "ARM chips explained"? "PDA registers" just sounds awkward.
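xIkUg 2004-12-18 17:03
Originally posted by DamnYa:
Why not just call it "ARM chips explained"? "PDA registers" just sounds awkward.

Agreed... PDAs use all sorts of different chips.

nbw, translate it when you have time.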