Make sure you compile and run all the examples in this tutorial in an ARM environment.
Before you start writing your shellcode, make sure you are aware of some basic principles, such as:
1. You want your shellcode to be compact and free of null-bytes.
Reason: We are writing shellcode that we will use to exploit memory corruption vulnerabilities like buffer overflows. Some buffer overflows occur because of the use of the C function ‘strcpy’, which copies data until it encounters a null-byte. We use the overflow to take control over the program flow, and if strcpy hits a null-byte it will stop copying our shellcode and our exploit will not work.
2. You also want to avoid library calls and absolute memory addresses.
Reason: To make our shellcode as universal as possible, we can’t rely on library calls that require specific dependencies and absolute memory addresses that depend on specific environments.
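To see why a single null-byte is fatal, here is a minimal Python sketch that mimics how strcpy treats its input. The helper names `strcpy_like` and `null_offsets` are made up for this illustration, and the payloads are placeholder bytes, not real shellcode.

```python
def strcpy_like(src: bytes) -> bytes:
    """Return the payload bytes a C-style strcpy would carry over.

    strcpy stops at the first 0x00 (it copies the terminator itself,
    but nothing that follows it), so everything after that byte is lost.
    """
    end = src.find(b"\x00")
    return src if end == -1 else src[:end]


def null_offsets(payload: bytes) -> list:
    """List the offsets of every null-byte in the payload."""
    return [i for i, value in enumerate(payload) if value == 0]


# Placeholder bytes for illustration only.
clean = bytes([0x41, 0x42, 0x43, 0x44, 0x45, 0x46])
dirty = bytes([0x41, 0x42, 0x00, 0x44, 0x45, 0x46])

if __name__ == "__main__":
    print(len(strcpy_like(clean)), null_offsets(clean))
    print(len(strcpy_like(dirty)), null_offsets(dirty))
```

The second payload is cut off after two bytes, which is exactly what would happen to shellcode containing a null-byte in the middle.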
The process of writing shellcode involves the following steps:
1. Knowing what system calls you want to use
2. Figuring out the syscall number and the parameters your chosen syscall function requires
3. De-Nullifying your shellcode
4. Converting your shellcode into a Hex string
Turns out, the system function execve() is being invoked. Looking at its prototype, we can see that execve() requires the following parameters:
Pointer to a string specifying the path to a binary
argv[] – array of command-line arguments
envp[] – array of environment variables
This basically translates to: execve(*filename, *argv[], *envp[]) –> execve(*filename, 0, 0). The system call number of this function can be looked up with the following command:
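If you prefer to do the lookup programmatically, the idea can be sketched in Python. This is only a sketch under assumptions: `SAMPLE_HEADER` is an illustrative excerpt whose layout is assumed (the real ARM unistd.h on your system may be formatted differently), and `syscall_number` is a hypothetical helper. The number 11 for execve is the one used later in this tutorial.

```python
import re

# Illustrative excerpt only. The exact layout of the real header is an
# assumption; adapt the pattern to what your system actually contains.
SAMPLE_HEADER = """\
#define __NR_exit (__NR_SYSCALL_BASE + 1)
#define __NR_execve (__NR_SYSCALL_BASE + 11)
"""


def syscall_number(header_text: str, name: str):
    """Return the number defined for __NR_<name>, or None if it is absent."""
    pattern = re.compile(
        r"#define\s+__NR_" + re.escape(name)
        + r"\s+\(?\s*(?:__NR_SYSCALL_BASE\s*\+\s*)?(\d+)"
    )
    match = pattern.search(header_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(syscall_number(SAMPLE_HEADER, "execve"))
```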
Invoking system calls on 32-bit x86 Linux works as follows: First, you place the parameters in registers (EBX, ECX, EDX, ...). Then, the syscall number gets moved into EAX (MOV EAX, syscall_number). And lastly, you invoke the system call with SYSENTER / INT 0x80.
On ARM, syscall invocation works a little bit differently:
1. Move parameters into registers – R0, R1, ...
2. Move the syscall number into register R7
mov r7, #<syscall_number>
3. Invoke the system call with
SVC #0 or
SVC #1
4. The return value ends up in R0
This is what it looks like in ARM assembly (code uploaded to the azeria-labs GitHub account):
As you can see in the picture above, we start by pointing R0 to our “/bin/sh” string using PC-relative addressing (if you can’t remember why the effective PC starts two instructions ahead of the current one, go to ‘Part 2: Data Types and Registers’ of the assembly basics tutorial and look at the part where the PC register is explained along with an example). Then we move 0’s into R1 and R2 and move the syscall number 11 into R7. Looks easy, right? Let’s look at the disassembly of our first attempt using objdump:
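One thing worth checking in such a disassembly is whether any instruction encodes to bytes containing 0x00. As a rough sketch (not the tutorial's own code), the ARM-mode encoding of `mov rd, #imm8` can be computed in Python. The helper names `arm_mov_imm` and `has_null` are made up for this illustration, and only the simple case of condition AL with an unrotated 8-bit immediate is covered.

```python
import struct


def arm_mov_imm(rd: int, imm8: int) -> bytes:
    """Encode ARM-mode `mov rd, #imm8` (condition AL, no rotation).

    Returns the 4 bytes in little-endian order, as they sit in memory.
    """
    if not 0 <= rd <= 15:
        raise ValueError("rd must be a register number from 0 to 15")
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit into 8 bits")
    word = 0xE3A00000 | (rd << 12) | imm8
    return struct.pack("<I", word)


def has_null(encoding: bytes) -> bool:
    """True if the encoded instruction contains a 0x00 byte."""
    return 0 in encoding


if __name__ == "__main__":
    for rd, imm in [(1, 0), (2, 0), (7, 11)]:
        raw = arm_mov_imm(rd, imm)
        print(f"mov r{rd}, #{imm}: {raw.hex()} null-byte: {has_null(raw)}")
```

Moving an immediate 0 into a register puts the zero straight into the instruction bytes, which is why `mov r1, #0` and `mov r2, #0` are a problem while `mov r7, #11` is not.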
One of the techniques we can use to make null-bytes less likely to appear in our shellcode is to use Thumb mode. Using Thumb mode decreases the chances of having null-bytes, because Thumb instructions are 2 bytes long instead of 4. If you went through the ARM Assembly Basics tutorials, you know how to switch from ARM to Thumb mode. If you haven’t, I encourage you to read the chapter about the branching instructions “B / BX / BLX” in part 6 of the tutorial, “Conditional Execution and Branching”.
In our second attempt we use Thumb mode and replace the operations containing #0’s with operations that result in 0’s by subtracting registers from each other or xor’ing them. For example, instead of using “mov r1, #0”, use either “sub r1, r1, r1” (r1 = r1 – r1) or “eor r1, r1, r1” (r1 = r1 xor r1). Keep in mind that since we are now using Thumb mode (2 byte instructions) and our code must be 4 byte aligned, we need to add a NOP at the end (e.g. mov r5, r5).
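The effect of these replacements can be checked with a small Python sketch that builds the 16-bit Thumb encodings by hand. The helper names are hypothetical, only low registers (R0 to R7) are handled, and these 16-bit forms are the flag-setting variants (movs, eors, subs) that a disassembler would typically show.

```python
import struct


def _low(reg: int) -> int:
    """Validate a low register number (R0 to R7)."""
    if not 0 <= reg <= 7:
        raise ValueError("only low registers R0 to R7 are supported")
    return reg


def thumb_movs_imm(rd: int, imm8: int) -> bytes:
    """16-bit Thumb `movs rd, #imm8`."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit into 8 bits")
    return struct.pack("<H", 0x2000 | (_low(rd) << 8) | imm8)


def thumb_eors(rdn: int, rm: int) -> bytes:
    """16-bit Thumb `eors rdn, rm` (rdn = rdn xor rm)."""
    return struct.pack("<H", 0x4040 | (_low(rm) << 3) | _low(rdn))


def thumb_subs(rd: int, rn: int, rm: int) -> bytes:
    """16-bit Thumb `subs rd, rn, rm` (rd = rn - rm)."""
    return struct.pack("<H", 0x1A00 | (_low(rm) << 6) | (_low(rn) << 3) | _low(rd))


if __name__ == "__main__":
    print("movs r1, #0    ", thumb_movs_imm(1, 0).hex())
    print("eors r1, r1    ", thumb_eors(1, 1).hex())
    print("subs r2, r2, r2", thumb_subs(2, 2, 2).hex())
    print("movs r7, #11   ", thumb_movs_imm(7, 11).hex())
```

Even in Thumb mode, moving an immediate 0 still yields a null-byte, while the xor and subtract forms produce the same zero in the register without one.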
The result is that we only have one single null-byte that we need to get rid of. The part of our code that’s causing the null-byte is the null-terminated string “/bin/sh\0”. We can solve this issue with the following technique:
Replace “/bin/sh\0” with “/bin/shX”
Use the instruction strb (store byte) in combination with an existing zero-filled register to replace X with a null-byte
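The idea behind this trick can be simulated in Python before writing it in assembly. This is only a model of what happens in memory: `patch_placeholder` is a made-up helper that plays the role of the single-byte store, overwriting the placeholder with 0x00 at run time.

```python
def patch_placeholder(image: bytes, placeholder: bytes = b"X") -> bytes:
    """Overwrite the last occurrence of the placeholder with a null-byte.

    This mimics a one-byte store (like strb with a zero-filled register)
    into the location right after "/bin/sh".
    """
    buf = bytearray(image)
    offset = buf.rindex(placeholder)
    buf[offset] = 0x00
    return bytes(buf)


# What is stored in the shellcode: no null-byte anywhere.
stored = b"/bin/shX"

# What the string looks like in memory once the store has executed.
patched = patch_placeholder(stored)

if __name__ == "__main__":
    print(stored, 0 in stored)
    print(patched, 0 in patched)
```

The bytes that travel through the vulnerable copy are null-free, and the terminator only comes into existence once the code is already running.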
The shellcode we created can now be transformed into its hexadecimal representation. Before doing that, it is a good idea to check whether the shellcode works as a standalone program. But there’s a problem: if we compile our assembly file like we would normally do, it won’t work. The reason for this is that we use the strb operation to modify our code section (.text). This requires the code section to be writable, which can be achieved by adding the -N flag during the linking process.
It works! Congratulations, you’ve written your first shellcode in ARM assembly.
To convert it into hex, use the following commands:
Instead of using the hexdump command above, you can also do the same with a simple Python script:
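The original script is not reproduced here, so the following is a minimal sketch of what such a script could look like, not the author's version. It assumes the raw shellcode bytes have already been extracted into a file whose name is passed as the first argument; `to_escaped_hex` is a made-up helper name.

```python
import sys


def to_escaped_hex(data: bytes) -> str:
    """Turn raw bytes into a \\x-escaped hex string."""
    return "".join(f"\\x{byte:02x}" for byte in data)


# Only runs when a file name is given, e.g.: python3 tohex.py <raw-binary-file>
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as handle:
        raw = handle.read()
    print(to_escaped_hex(raw))
    if 0 in raw:
        print("warning: the input contains at least one null-byte")
```

The extra check at the end is a cheap way to catch a null-byte that slipped through before you use the string anywhere.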
I hope you enjoyed this introduction to writing ARM shellcode. In the next part you will learn how to write shellcode in the form of a reverse shell, which is a little more complicated than the example above. After that we will dive into memory corruptions and learn how they occur and how to exploit them using our self-made shellcode.
The prerequisite for this part of the tutorial is a basic understanding of ARM assembly (covered in the first tutorial series “ARM Assembly Basics”). In this part, you will learn how to use your knowledge to create your first simple shellcode in ARM assembly. The examples used in this tutorial are compiled on an ARMv6 32-bit processor. If you don’t have access to an ARM device, you can create your own lab and emulate a Raspberry Pi distro in a VM by following this tutorial: Emulate Raspberry Pi with QEMU.
This tutorial is for people who think beyond running automated shellcode generators and want to learn how to write shellcode in ARM assembly themselves. After all, knowing how it works under the hood and having full control over the result is much more fun than simply running a tool, isn’t it? Writing your own shellcode in assembly is a skill that can turn out to be very useful in scenarios where you need to bypass shellcode-detection algorithms or other restrictions where automated tools could turn out to be insufficient. The good news is, it’s a skill that can be learned quite easily once you are familiar with the process.