[Translation] Introduction to ARM Assembly (2): ARM Instruction Set
2018-6-15 17:17

This post is the third part of the ARM assembly basics tutorial series: the ARM instruction set. Original article: https://azeria-labs.com/arm-instruction-set-part-3/

ARM & THUMB


ARM processors have two main states they can operate in (let's not count Jazelle here): ARM and Thumb. These states have nothing to do with privilege levels. For example, code running in SVC mode can be either ARM or Thumb. The main difference between the two states is the instruction set: instructions in ARM state are always 32-bit, while instructions in Thumb state are 16-bit (but can be 32-bit). Knowing when and how to use Thumb is especially important for our ARM exploit development purposes. When writing ARM shellcode, we need to get rid of NULL bytes, and using 16-bit Thumb instructions instead of 32-bit ARM instructions reduces the chance of having them.
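To make the NULL-byte point concrete, here is a minimal Python sketch that compares one ARM encoding with one Thumb encoding of "load 1 into R0". The two encodings (0xE3A00001 for ARM `MOV R0, #1` and 0x2001 for Thumb `MOVS R0, #1`) and the little-endian byte order are assumptions chosen for illustration; the helper names are hypothetical and not part of any toolchain.

```python
import struct

def encode_le(value: int, width_bytes: int) -> bytes:
    """Return one instruction encoding as little-endian bytes (2 or 4 bytes)."""
    fmt = {2: "<H", 4: "<I"}[width_bytes]
    return struct.pack(fmt, value)

def has_null_byte(code: bytes) -> bool:
    """True if the byte sequence contains a 0x00 byte."""
    return b"\x00" in code

# ARM state: MOV R0, #1 (32-bit encoding, assumed 0xE3A00001)
arm_mov = encode_le(0xE3A00001, 4)
# Thumb state: MOVS R0, #1 (16-bit encoding, assumed 0x2001)
thumb_movs = encode_le(0x2001, 2)

print(arm_mov.hex(), has_null_byte(arm_mov))
print(thumb_movs.hex(), has_null_byte(thumb_movs))
```

The 32-bit encoding spends two of its four bytes on zero fields here, while the 16-bit encoding has no room for them. This only illustrates the tendency; Thumb code can still contain NULL bytes.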

The naming conventions of ARM versions are more than confusing, and not all ARM versions support the same Thumb instruction sets. At some point, ARM introduced an enhanced Thumb instruction set (pseudo name: Thumbv2) which allows 32-bit Thumb instructions and even conditional execution, which was not possible in the versions prior to that. In order to use conditional execution in Thumb state, the "it" instruction was introduced. However, this instruction was then removed in a later version and exchanged for something that was supposed to make things less complicated, but achieved the opposite. I don't know all the different variations of ARM/Thumb instruction sets across all the different ARM versions, and I honestly don't care. Neither should you. The only thing that you need to know is the ARM version of your target device and its specific Thumb support so that you can adjust your code. The ARM Infocenter should help you figure out the specifics of your ARM version (http://infocenter.arm.com/help/index.jsp).

As mentioned before, there are different Thumb versions. The different names only serve to tell them apart (the processor itself will always refer to it as Thumb).
1. Thumb-1 (16-bit instructions): used in ARMv6 and earlier architectures.
2. Thumb-2 (16-bit and 32-bit instructions): extends Thumb-1 by adding more instructions and allowing them to be either 16-bit or 32-bit wide (ARMv6T2, ARMv7).
3. ThumbEE: includes some changes and additions aimed at dynamically generated code (code compiled on the device either shortly before or during execution).

Differences between ARM and Thumb:
1. Conditional execution: all instructions in ARM state support conditional execution. Some ARM processor versions allow conditional execution in Thumb by using the IT instruction. Conditional execution leads to higher code density because it reduces the number of instructions to be executed and the number of expensive branch instructions.
2. 32-bit ARM and Thumb instructions: 32-bit Thumb instructions have a .w suffix.
3. The barrel shifter is another feature unique to ARM mode. It can be used to shrink multiple instructions into one. For example, instead of using two instructions for a multiply (multiplying a register by 2 and using MOV to store the result in another register), you can fold the multiply into a MOV instruction by shifting left by 1:
MOV  R1, R0, LSL #1      ; R1 = R0 * 2
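As a rough illustration of why a left shift by 1 acts as a multiply by 2, the following Python sketch models the MOV-with-LSL line above on a 32-bit register. The function name `mov_lsl` is hypothetical, and the model ignores flags entirely.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide

def mov_lsl(r0: int, shift: int) -> int:
    """Model MOV R1, R0, LSL #shift and return the new value of R1."""
    if not 0 <= shift <= 31:
        raise ValueError("LSL immediate must be in the range 0..31")
    # Bits shifted past bit 31 are discarded, so the result wraps at 2**32.
    return (r0 << shift) & MASK32

print(mov_lsl(2, 1))                 # R0 = 2, shifted left by 1
print(hex(mov_lsl(0x80000001, 1)))   # top bit falls off the end
```

The second call shows the limit of the shortcut: once the top bit is shifted out, the result is R0 * 2 modulo 2^32 rather than the full product.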

To switch the state in which the processor executes, one of two conditions has to be met:
1. We can use the branch instruction BX (branch and exchange) or BLX (branch, link, and exchange) and set the destination register's least significant bit to 1. This can be achieved by adding 1 to an offset, like 0x5530 + 1. You might think that this would cause alignment issues, since instructions are either 2- or 4-byte aligned. This is not a problem because the processor ignores the least significant bit. More details in Part 6: Conditional Execution and Branching.
2. We know that we are in Thumb mode if the T bit in the current program status register is set.
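The two points above can be sketched in Python as follows. This is a simplified model, not processor-accurate behaviour: `bx` and `in_thumb_state` are hypothetical helper names, and the sketch assumes the T bit sits at bit 5 of the CPSR.

```python
from typing import Tuple

T_BIT = 1 << 5  # assumed position of the T (Thumb) bit in the CPSR

def bx(target: int) -> Tuple[int, str]:
    """Model BX: bit 0 of the target selects the state and is then ignored."""
    state = "Thumb" if target & 1 else "ARM"
    address = target & ~1  # the processor drops the least significant bit
    return address, state

def in_thumb_state(cpsr: int) -> bool:
    """True if the T bit is set in the given CPSR value."""
    return bool(cpsr & T_BIT)

address, state = bx(0x5530 + 1)
print(hex(address), state)
print(in_thumb_state(0x00000030), in_thumb_state(0x00000010))
```

Adding 1 to the offset therefore changes the selected state but not the address that is actually fetched, which is why no alignment problem arises.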

INTRODUCTION TO ARM INSTRUCTIONS

The purpose of this part is to briefly introduce ARM's instruction set and its general use. It is crucial for us to understand how the smallest pieces of the Assembly language operate, how they connect to each other, and what can be achieved by combining them.

As mentioned earlier, Assembly language is composed of instructions, which are the main building blocks. ARM instructions are usually followed by one or two operands and generally use the following template:
MNEMONIC{S}{condition} {Rd}, Operand1, Operand2

Due to the flexibility of the ARM instruction set, not all instructions use all of the fields provided in the template. Nevertheless, the purpose of each field in the template is described as follows:
MNEMONIC     - Short name (mnemonic) of the instruction
{S}          - An optional suffix. If S is specified, the condition flags are updated on the result of the operation
{condition}  - Condition that needs to be met in order for the instruction to be executed
{Rd}         - Register (destination) for storing the result of the instruction
Operand1     - First operand. Either a register or an immediate value
Operand2     - Second (flexible) operand. Can be an immediate value (number) or a register with an optional shift
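To show how the MNEMONIC{S}{condition} part of the template reads in practice, here is a small Python sketch that splits a full mnemonic into its three pieces. It is an illustration only: `split_mnemonic`, the `Parsed` type, and the short `BASES` list are hypothetical, and a real assembler recognises far more mnemonics than this subset.

```python
from typing import NamedTuple

# Standard ARM condition codes; AL (always) is the default when none is given.
CONDITIONS = {"EQ", "NE", "CS", "HS", "CC", "LO", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL"}
# Small illustrative subset of base mnemonics (an assumption, not exhaustive).
BASES = ("ADD", "SUB", "MOV", "AND", "ORR", "EOR")

class Parsed(NamedTuple):
    base: str
    sets_flags: bool
    condition: str

def split_mnemonic(text: str) -> Parsed:
    """Split e.g. 'ADDSEQ' into base 'ADD', S suffix True, condition 'EQ'."""
    text = text.upper()
    for base in BASES:
        if not text.startswith(base):
            continue
        rest = text[len(base):]
        sets_flags = rest.startswith("S")  # no condition code starts with S
        if sets_flags:
            rest = rest[1:]
        if rest == "":
            return Parsed(base, sets_flags, "AL")
        if rest in CONDITIONS:
            return Parsed(base, sets_flags, rest)
    raise ValueError(f"unrecognised mnemonic: {text}")

print(split_mnemonic("MOVLE"))
print(split_mnemonic("ADDSEQ"))
```

The order of the checks mirrors the template: first the base mnemonic, then the optional S, then the optional condition.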

While the MNEMONIC, S, Rd and Operand1 fields are straightforward, the condition and Operand2 fields require a bit more clarification. The condition field is closely tied to the CPSR register's value, or to be precise, to the values of specific bits within the register. Operand2 is called a flexible operand because we can use it in various forms: as an immediate value (with a limited set of values), a register, or a register with a shift. For example, we can use these expressions as Operand2:
#123                    - Immediate value (with limited set of values; in ARM state, an 8-bit value rotated right by an even number of bits)
Rx                      - Register x (like R1, R2, R3 ...)
Rx, ASR n               - Register x with arithmetic shift right by n bits (1 <= n <= 32)
Rx, LSL n               - Register x with logical shift left by n bits (0 <= n <= 31)
Rx, LSR n               - Register x with logical shift right by n bits (1 <= n <= 32)
Rx, ROR n               - Register x with rotate right by n bits (1 <= n <= 31)
Rx, RRX                 - Register x with rotate right by one bit, with extend (the carry flag is shifted into bit 31)
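The shift forms listed above differ mainly in what fills the vacated bits. The following Python sketch models each one on a 32-bit value; the function names are hypothetical and the model leaves out the carry-out that the real shifter produces for every form except where RRX returns it explicitly.

```python
MASK32 = 0xFFFFFFFF

def lsl(x: int, n: int) -> int:
    """Logical shift left (0 <= n <= 31): zeros enter from the right."""
    return (x << n) & MASK32

def lsr(x: int, n: int) -> int:
    """Logical shift right (1 <= n <= 32): zeros enter from the left."""
    return (x & MASK32) >> n

def asr(x: int, n: int) -> int:
    """Arithmetic shift right (1 <= n <= 32): the sign bit is replicated."""
    x &= MASK32
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> n) & MASK32

def ror(x: int, n: int) -> int:
    """Rotate right (1 <= n <= 31): bits leaving bit 0 re-enter at bit 31."""
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rrx(x: int, carry_in: int):
    """Rotate right by one with extend: carry enters bit 31, bit 0 leaves as carry."""
    x &= MASK32
    return ((carry_in & 1) << 31) | (x >> 1), x & 1

print(hex(lsr(0x80000000, 1)), hex(asr(0x80000000, 1)))
print(hex(ror(0x00000001, 1)), rrx(0x00000001, 0))
```

Comparing LSR and ASR on the same input shows why both exist: LSR treats the value as unsigned, while ASR preserves the sign of a two's complement number.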

As a quick example of what different kinds of instructions look like, let's take a look at the following list.
ADD   R0, R1, R2      - Adds the contents of R1 (Operand1) and R2 (Operand2 in the form of a register) and stores the result in R0 (Rd)
ADD   R0, R1, #2      - Adds the contents of R1 (Operand1) and the value 2 (Operand2 in the form of an immediate value) and stores the result in R0 (Rd)
MOVLE R0, #5          - Moves the number 5 (Operand2; MOV takes no Operand1) to R0 (Rd) ONLY if the condition LE (Less Than or Equal) is satisfied
MOV   R0, R1, LSL #1  - Moves the contents of R1 (Operand2 in the form of a register with logical shift left) shifted left by one bit to R0 (Rd). So if R1 had the value 2, it gets shifted left by one bit and becomes 4. 4 is then moved to R0.
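The four examples above can be traced with a small Python model. This is a sketch under stated assumptions: the register values and flag settings are made up for illustration, the helper names are hypothetical, and the LE test uses the usual rule that LE passes when Z is set or N differs from V.

```python
MASK32 = 0xFFFFFFFF

def add(op1: int, op2: int) -> int:
    """ADD Rd, Operand1, Operand2 (no S suffix, so flags are left alone)."""
    return (op1 + op2) & MASK32

def condition_le(n: bool, z: bool, v: bool) -> bool:
    """LE (signed less than or equal): Z set, or N not equal to V."""
    return z or (n != v)

def movle(rd_old: int, imm: int, n: bool, z: bool, v: bool) -> int:
    """MOVLE Rd, #imm: write imm only if LE holds, else keep the old Rd."""
    return imm if condition_le(n, z, v) else rd_old

r1, r2 = 2, 3  # example register contents (assumed)

print(add(r1, r2))                               # ADD   R0, R1, R2
print(add(r1, 2))                                # ADD   R0, R1, #2
print(movle(0, 5, n=False, z=True, v=False))     # MOVLE R0, #5 with Z set
print(movle(0, 5, n=False, z=False, v=False))    # MOVLE R0, #5, condition fails
print((r1 << 1) & MASK32)                        # MOV   R0, R1, LSL #1
```

The two MOVLE calls differ only in the flags, which is the whole point of the condition field: the same instruction either writes R0 or leaves it untouched depending on the CPSR.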

As a quick summary, let's take a look at the most common instructions which we will use in future examples.



Last edited by r0Cat on 2018-6-15 17:31
Latest replies (7)
junkboy 2018-6-15 17:27
The last image is broken.
r0Cat 2018-6-15 17:32
junkboy: The last image is broken.
I'll fix it right away. Thanks for the support!
葫芦娃 2018-6-15 22:09
"32-bit ARM and Thumb instructions: 32-bit Thumb instructions have a .w suffix."
I happened to be reading about this today, so one addition: the .w is really meant for the assembler. It forces the instruction to be assembled as Thumb-2 (that is, 32-bit Thumb). There is no corresponding flag bit in the instruction bytes themselves.

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204ic/ch04s11s01.html
Last edited by 葫芦娃 on 2018-6-15 22:10
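r0Cat 2018-6-16 08:54
葫芦娃: "32-bit ARM and Thumb instructions: 32-bit Thumb instructions have a .w suffix." I happened to be reading about this today, so one addition: the .w is really meant for the assembler. It forces the instruction to be assembled as Thumb-2 (that is, 32-bit Thumb), In ...
The .w suffix forces a 32-bit Thumb encoding and the .n suffix forces a 16-bit one; the suffix goes right after the mnemonic. Thanks for sharing, I learned something.
Last edited by r0Cat on 2018-6-16 08:57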
zcshou 2018-6-18 10:27
I took a look around the original site and it really is good! By the way, the figures the author made definitely deserve praise!
r0Cat 2018-6-18 11:47
Judging by the layout and those figures, this was not made by just anyone. The writing is concise and clearly organized; it looks like the work of a master.
Last edited by r0Cat on 2018-6-18 13:19
dryzh 2018-6-19 11:11
amzilun 葫芦娃: "32-bit ARM and Thumb instructions: 32-bit Thumb instructions have a .w suffix." I happened to be reading about this today, so one addition: the .w is really meant for the assembler. It forces the use of Thum ...
Master 葫芦娃 knows the ARM user manual by heart.