[Translation] Binary Exploitation (Part 2): TCP Bind Shell in ARM 32-bit Assembly

Posted: 2019-7-31 22:34, 9430 views

r0Cat

TCP Bind Shell in Assembly (ARM 32-bit)

In this tutorial, you will learn how to write TCP bind shellcode that is free of null bytes and can be used as shellcode for exploitation. When I talk about exploitation, I’m strictly referring to approved and legal vulnerability research. For those of you relatively new to software exploitation, let me tell you that this knowledge can, in fact, be used for good. If I find a software vulnerability like a stack overflow and want to test its exploitability, I need working shellcode. Not only that, I need techniques to use that shellcode in such a way that it can be executed despite the security measures in place. Only then can I show the exploitability of this vulnerability and the techniques malicious attackers could be using to take advantage of security flaws.

After going through this tutorial, you will not only know how to write shellcode that binds a shell to a local port, but also how to write any shellcode for that matter. To go from bind shellcode to reverse shellcode is just about changing 1-2 functions, some parameters, but most of it is the same. Writing a bind or reverse shell is more difficult than creating a simple execve() shell. If you want to start small, you can learn how to write a simple execve() shell in assembly before diving into this slightly more extensive tutorial. If you need a refresher in Arm assembly, take a look at my ARM Assembly Basics tutorial series, or use this Cheat Sheet:

(Translator's note: I have translated the ARM Assembly Basics series and published it on the Kanxue forum.)

Before we start, I’d like to remind you that we’re creating ARM shellcode and therefore need to set up an ARM lab environment if you don’t already have one. You can set it up yourself (Emulate Raspberry Pi with QEMU) or save time and download the ready-made Lab VM I created (ARM Lab VM). Ready?

First of all, what is a bind shell and how does it really work? With a bind shell, you open up a communication port or a listener on the target machine. The listener then waits for an incoming connection, you connect to it, the listener accepts the connection and gives you shell access to the target system.

This is different from how Reverse Shells work. With a reverse shell, you make the target machine communicate back to your machine. In that case, your machine has a listener port on which it receives the connection back from the target system.

Both types of shell have their advantages and disadvantages depending on the target environment. It is, for example, more common that the firewall of the target network fails to block outgoing connections than incoming. This means that your bind shell would bind a port on the target system, but since incoming connections are blocked, you wouldn’t be able to connect to it. Therefore, in some scenarios, it is better to have a reverse shell that can take advantage of firewall misconfigurations that allow outgoing connections. If you know how to write a bind shell, you know how to write a reverse shell. There are only a couple of changes necessary to transform your assembly code into a reverse shell once you understand how it is done.

To translate the functionalities of a bind shell into assembly, we first need to get familiar with the process of a bind shell:

1. Create a new TCP socket

2. Bind socket to a local port

3. Listen for incoming connections

4. Accept incoming connection

5. Redirect STDIN, STDOUT and STDERR to a newly created socket from a client

6. Spawn the shell

This is the C code we will use for our translation.
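
The C listing did not survive in this copy of the post. As a rough stand-in, the sketch below walks through steps 1 to 4 of the flow in Python rather than C, so the order of the calls is easy to follow. The names `serve_once`, `handler`, and `demo` are hypothetical, the listener is bound to the loopback address only, and steps 5 and 6 are left as comments. It is not the author's code.

```python
import socket
import threading


def serve_once(handler, host="127.0.0.1", port=0, ready=None):
    """Run steps 1-4 once and pass the accepted client socket to handler."""
    # 1. Create a new TCP socket: socket(AF_INET, SOCK_STREAM, 0)
    host_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    try:
        # 2. Bind the socket to a local port (port 0 lets the OS pick a free one)
        host_sock.bind((host, port))
        # 3. Listen for incoming connections with a backlog of 2
        host_sock.listen(2)
        if ready is not None:
            ready(host_sock.getsockname()[1])
        # 4. Accept one incoming connection
        client_sock, _addr = host_sock.accept()
        with client_sock:
            # Steps 5 and 6 would go here in the real bind shell:
            # dup2() the client descriptor onto 0, 1 and 2, then execve() a shell.
            handler(client_sock)
    finally:
        host_sock.close()


def demo():
    """Connect to the listener once and return what the handler sent."""
    port_box = []
    started = threading.Event()

    def ready(port):
        port_box.append(port)
        started.set()

    server = threading.Thread(
        target=serve_once,
        args=(lambda client: client.sendall(b"hello\n"),),
        kwargs={"ready": ready},
    )
    server.start()
    started.wait(timeout=5)
    with socket.create_connection(("127.0.0.1", port_box[0]), timeout=5) as conn:
        data = conn.recv(16)
    server.join(timeout=5)
    return data


if __name__ == "__main__":
    print(demo())
```

Each numbered comment corresponds to one system call that the assembly version has to issue by hand.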

The first step is to identify the necessary system functions, their parameters, and their system call numbers. Looking at the C code above, we can see that we need the following functions: socket, bind, listen, accept, dup2, execve. You can figure out the system call numbers of these functions with the following command:

If you’re wondering about the value of __NR_SYSCALL_BASE, it’s 0:

These are all the syscall numbers we’ll need:
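
The header excerpt is missing from this copy. The sketch below shows one way to pull the numbers out of header text with a regular expression. `SAMPLE_HEADER` is an assumption: it holds the ARM EABI values for the six calls, written in the style of older `asm/unistd.h` headers. The exact file path and spacing differ between kernel versions, so check the header on your own system.

```python
import re

# Assumed sample in the style of the ARM <asm/unistd.h> header; spacing may differ.
SAMPLE_HEADER = """
#define __NR_SYSCALL_BASE 0
#define __NR_execve (__NR_SYSCALL_BASE+ 11)
#define __NR_dup2 (__NR_SYSCALL_BASE+ 63)
#define __NR_socket (__NR_SYSCALL_BASE+281)
#define __NR_bind (__NR_SYSCALL_BASE+282)
#define __NR_listen (__NR_SYSCALL_BASE+284)
#define __NR_accept (__NR_SYSCALL_BASE+285)
"""

# Matches "#define __NR_<name> (__NR_SYSCALL_BASE+ <offset>)" with flexible spacing.
PATTERN = re.compile(r"#define\s+__NR_(\w+)\s+\(__NR_SYSCALL_BASE\s*\+\s*(\d+)\)")


def syscall_numbers(header_text, base=0):
    """Map syscall names to numbers, adding the base (0 on this platform)."""
    return {name: base + int(offset) for name, offset in PATTERN.findall(header_text)}


if __name__ == "__main__":
    for name, number in sorted(syscall_numbers(SAMPLE_HEADER).items(), key=lambda kv: kv[1]):
        print(f"{name:8s} {number}")
```

Because the base is 0, the offset in each `#define` is already the value that goes into r7 before the `svc` instruction.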

The parameters each function expects can be looked up in the Linux man pages, or on w3challs.com.

The next step is to figure out the specific values of these parameters. One way of doing that is to look at a successful bind shell connection using strace. Strace is a tool you can use to trace system calls and monitor interactions between processes and the Linux Kernel. Let’s use strace to test the C version of our bind shell. To reduce the noise, we limit the output to the functions we’re interested in.

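(Translator's note: open a first terminal here and run the command above.)

(Translator's note: open a second terminal here and run the command above.)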

This is our strace output:

Now we can fill in the gaps and note down the values we’ll need to pass to the functions of our assembly bind shell.
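
The strace listing is not reproduced here, but the symbolic names it prints can be turned into numbers without it. The sketch below reads the constants from Python's `socket` module on Linux. The helper name `bind_shell_values` is hypothetical, and the 16-byte address length is the size of `struct sockaddr_in` on Linux, which is an addition to the text.

```python
import socket
import struct


def bind_shell_values():
    """Numeric values behind the symbolic arguments, as seen on Linux."""
    return {
        # socket(AF_INET, SOCK_STREAM, IPPROTO_IP) -> socket(2, 1, 0)
        "socket": (int(socket.AF_INET), int(socket.SOCK_STREAM), int(socket.IPPROTO_IP)),
        # listen(host_sockid, 2): the backlog used in the text
        "listen_backlog": 2,
        # sizeof(struct sockaddr_in): family (2) + port (2) + address (4) + padding (8)
        "sockaddr_in_size": struct.calcsize("=HH4s8x"),
    }


if __name__ == "__main__":
    for key, value in bind_shell_values().items():
        print(key, value)
```

These are the immediates that end up in r0, r1, and r2 before each system call.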

In the first stage, we answered the following questions to get everything we need for our assembly program:

1. Which functions do I need?

2. What are the system call numbers of these functions?

3. What are the parameters of these functions?

4. What are the values of these parameters?

This step is about applying this knowledge and translating it to assembly. Split each function into a separate chunk and repeat the following process:

1. Map out which register you want to use for which parameter

2. Figure out how to pass the required values to these registers

   1. How to pass an immediate value to a register

   2. How to nullify a register without directly moving a #0 into it (we need to avoid null-bytes in our code and must therefore find other ways to nullify a register or a value in memory)

   3. How to make a register point to a region in memory which stores constants and strings

3. Use the right system call number to invoke the function and keep track of register content changes

   1. Keep in mind that the result of a system call will land in r0, which means that in case you need to reuse the result of that function in another function, you need to save it into another register before invoking the next function.

   2. Example: host_sockid = socket(2, 1, 0). The result (host_sockid) of the socket call will land in r0. This result is reused in other functions like listen(host_sockid, 2), and should therefore be preserved in another register.

The first thing you should do to reduce the possibility of encountering null-bytes is to use Thumb mode. In ARM mode, the instructions are 32-bit; in Thumb mode they are 16-bit. This means that we can already reduce the chance of having null-bytes by simply reducing the size of our instructions. To recap how to switch to Thumb mode: ARM instructions must be 4-byte aligned. To change the mode from ARM to Thumb, set the LSB (Least Significant Bit) of the next instruction’s address (found in PC) to 1 by adding 1 to the PC register’s value and saving it to another register. Then use a BX (Branch and eXchange) instruction to branch to this other register containing the address of the next instruction with the LSB set to one, which makes the processor switch to Thumb mode. It all boils down to the following two instructions.
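
The two instructions are not shown in this copy. A common form of the pair is `add r3, pc, #1` followed by `bx r3`, where the choice of r3 is an assumption. The small model below, with the hypothetical name `thumb_switch_target`, traces the address arithmetic. It relies on the fact that in ARM state the PC reads as the current instruction's address plus 8.

```python
def thumb_switch_target(add_address):
    """Model `add rX, pc, #1` at add_address followed by `bx rX` in ARM state."""
    if add_address % 4:
        raise ValueError("ARM instructions are 4-byte aligned")
    pc_as_read = add_address + 8   # ARM state: PC reads two instructions ahead
    register = pc_as_read + 1      # aligned address + 1 has its LSB set
    thumb = bool(register & 1)     # BX enters Thumb state when bit 0 is 1
    target = register & ~1         # execution continues at the address without bit 0
    return target, thumb


if __name__ == "__main__":
    target, thumb = thumb_switch_target(0x10054)
    print(hex(target), thumb)
```

With the `add` at address A and the `bx` at A + 4, the branch target A + 8 is the first instruction after the pair, so execution continues in Thumb mode without skipping anything.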

From here you will be writing Thumb code and will therefore need to indicate this by using the .THUMB directive in your code.

These are the values we need for the socket call parameters:

After setting up the parameters, you invoke the socket system call with the svc instruction. The result of this invocation will be our host_sockid and will end up in r0. Since we need host_sockid later on, let’s save it to r4.

In ARM, you can’t simply move any immediate value into a register. If you’re interested in more details about this nuance, there is a section in the Memory Instructions chapter (at the very end).

To check if I can use a certain immediate value, I wrote a tiny script (ugly code, don’t look) called rotator.py.
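
This is not the author's rotator.py, which is not reproduced here. It is a minimal sketch of the same kind of check under the ARM-state rule that a data-processing immediate is an 8-bit value rotated right by an even amount. The function names `rol32`, `find_rotation`, and `arm_immediate_encodable` are hypothetical.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def find_rotation(value):
    """Return (imm8, rotation) with value == imm8 rotated right by rotation, or None."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    for rotation in range(0, 32, 2):   # only even rotations are encodable
        imm8 = rol32(value, rotation)  # undo the right rotation
        if imm8 <= 0xFF:
            return imm8, rotation
    return None


def arm_immediate_encodable(value):
    return find_rotation(value) is not None


if __name__ == "__main__":
    for candidate in (0xFF, 0x100, 0x102, 0x104, 0x1FF, 0xFF000000):
        print(hex(candidate), arm_immediate_encodable(candidate))
```

A value such as 0x102 fails because its set bits would need an odd rotation to fit into eight bits, while 0x104 passes as 0x41 rotated right by 30. Thumb 16-bit `mov` is stricter still and only takes values from 0 to 255.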

Final code snippet:

With the first instruction, we store a structure object containing the address family, host port and host address in the literal pool and reference this object with pc-relative addressing. The literal pool is a memory area in the same section (because the literal pool is part of the code) storing constants, strings, or offsets. Instead of calculating the pc-relative offset manually, you can use an ADR instruction with a label. ADR accepts a PC-relative expression, that is, a label with an optional offset where the address of the label is relative to the PC label. Like this:

(Translator's note: much of the key code in this post is in images.)

The next 5 instructions are STRB (store byte) instructions. A STRB instruction stores one byte from a register to a calculated memory region. The syntax [r1, #1] means that we take R1 as the base address and the immediate value (#1) as an offset.

In the first instruction we made R1 point to the memory region where we store the values of the address family AF_INET, the local port we want to use, and the IP address. We could either use a static IP address, or we could specify 0.0.0.0 to make our bind shell listen on all IPs which the target is configured with, making our shellcode more portable. Now, those are a lot of null-bytes.

Again, the reason we want to get rid of any null-bytes is to make our shellcode usable for exploits that take advantage of memory corruption vulnerabilities that might be sensitive to null-bytes. Some buffer overflows are caused by improper use of functions like ‘strcpy’. The job of strcpy is to copy data until it receives a null-byte. We use the overflow to take control over the program flow and if strcpy hits a null-byte it will stop copying our shellcode and our exploit will not work. With the strb instruction we take a null byte from a register and modify our own code during execution. This way, we don’t actually have a null byte in our shellcode, but dynamically place it there. This requires the code section to be writable and can be achieved by adding the -N flag during the linking process.
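
To make the strcpy problem concrete, here is a small sketch that lists null-byte offsets and emulates where a strcpy-style copy would stop. The helper names are hypothetical. The sample bytes are the little-endian ARM-state encodings of `add r3, pc, #1`, `bx r3`, and `mov r0, #0`, chosen for illustration rather than taken from this post's listing.

```python
def null_offsets(code):
    """Offsets of every null byte in the given machine code."""
    return [offset for offset, byte in enumerate(code) if byte == 0]


def strcpy_like(code):
    """Bytes a strcpy()-style copy would transfer: everything before the first null."""
    end = code.find(b"\x00")
    return code if end == -1 else code[:end]


ARM_SWITCH = bytes.fromhex("01308fe2" "13ff2fe1")  # add r3, pc, #1 ; bx r3
MOV_ZERO = bytes.fromhex("0000a0e3")               # mov r0, #0

if __name__ == "__main__":
    payload = ARM_SWITCH + MOV_ZERO + b"\x01\x02"
    print(null_offsets(payload))
    print(len(strcpy_like(payload)), "of", len(payload), "bytes copied")
```

The `mov r0, #0` encoding carries two null bytes, so the copy stops right after the mode switch. That is why registers are zeroed by other means, such as subtracting a register from itself.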

For this reason, we code without null-bytes and dynamically put a null-byte in places where it’s necessary. As you can see in the next picture, the IP address we specify is 1.1.1.1 which will be replaced by 0.0.0.0 during execution.
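
The picture is missing from this copy, so the sketch below emulates the idea on a byte buffer. It assumes port 4444 and a 0xff placeholder in the second byte of the address family, neither of which is stated in the surviving text. The 1.1.1.1 placeholder and the store at offset 1 come from the text. Offsets 4 to 7 are inferred from where the IP address sits in `struct sockaddr_in`. The function names are hypothetical.

```python
import socket
import struct


def placeholder_sockaddr(port):
    """First 8 bytes of sockaddr_in as stored in the code: no null bytes."""
    # AF_INET low byte, placeholder high byte, port in network order, 1.1.1.1
    return bytearray(b"\x02\xff" + struct.pack(">H", port) + bytes([1, 1, 1, 1]))


def patch_nulls(buffer, offsets=(1, 4, 5, 6, 7)):
    """Emulate five `strb` stores of a zeroed register's low byte."""
    for offset in offsets:
        buffer[offset] = 0
    return buffer


def real_sockaddr(port, address="0.0.0.0"):
    """What bind() should see: little-endian family, big-endian port, IPv4 address."""
    return (
        struct.pack("<H", socket.AF_INET)
        + struct.pack(">H", port)
        + socket.inet_aton(address)
    )


if __name__ == "__main__":
    stored = placeholder_sockaddr(4444)
    print("stored :", stored.hex())
    print("patched:", patch_nulls(stored).hex())
```

Note that the port itself must not contain a zero byte. 4444 is 0x115c and is safe, whereas a port such as 256 (0x0100) would put a null byte back into the stored structure.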
