[Translation] Fuzzer Development 3: Building Bochs, MMU, and File I/O
Posted: 2024-9-6 09:08
This is the next post in a series of blog posts about developing a snapshot fuzzer that uses Bochs as its target execution engine. You can find the fuzzer and its code in the Lucid repository.
Today we continue our fuzzer development journey. Last time we built the beginnings of a context-switching infrastructure so that we could sandbox Bochs (really, a test program standing in for it) and keep it from touching the OS kernel during syscalls.
In this post we'll cover some changes and improvements made to the fuzzer and document some progress related to Bochs itself.
After publishing the last post, I got great feedback and advice from the legendary WorksButNotTested on the Fuzzing Discord. He pointed out that we could drastically reduce complexity by dropping the full context-switch / C-ABI-to-syscall-ABI register-translation routine altogether and simply having Bochs call a Rust function from C to handle syscalls. In hindsight the approach is so intuitive and obvious that I'm a little embarrassed I never considered it.
Previously, in our custom Musl code, when a program needed to make a syscall with six arguments, it called a C function that looked like this:
In the last post, we changed that function into an if/else: if the program was running under Lucid, we rearranged the C ABI registers into the syscall-ABI registers and then called Lucid's context-switch function, like so:
So this was fairly complicated. I was very focused on the idea that "Lucid must be the kernel: when a userland program makes a syscall, its state is saved and execution starts in the kernel." That led me astray, because such a complex routine isn't needed for our purposes. We aren't actually a kernel; we just want to provide syscall sandboxing for one specific, fairly well-behaved program. WorksButNotTested suggested that we simply call a Rust function directly instead, like this:
Obviously this is a much simpler solution: we avoid the register shuffling, state saving, inline assembly, and so on. To set this function up, we just create a new global function-pointer variable in Musl's lucid.h and give it a definition in src/lucid.c, which you can see in the Musl patches in the repository. On the Rust side, g_lucid_syscall looks like this:
We get to take advantage of the C ABI and keep the semantics of how the program normally uses Musl. This was a very welcome suggestion, and I'm extremely pleased with the end result.
While overhauling the syscall path, I also simplified the calling convention for context switching. Instead of using four separate registers for the calling convention, I decided to just pass a pointer to the Lucid execution context and let the context_switch function decide how to behave based on the context's values. Essentially, we moved the complexity from the caller to the callee. That means the complexity doesn't recur throughout the codebase; it is encapsulated once in the context_switch logic. It does require some hacky/brittle code, for example we have to hardcode some structure offsets into the Lucid execution-context data structure, but I think that's a small price to pay for the large reduction in complexity. The context_switch code now looks like this:
You can see that once we enter context_switch, we save the CPU flags before doing anything that could affect them, then save a couple of registers we use as scratch. We are then free to check the value of context->mode to determine which execution mode we're in. Based on that value, we know which register bank to use to save our general-purpose registers. So yes, we do have to hardcode some offsets, but I believe this is, on the whole, a better API and system for the context-switching callee, and the data structure itself should be relatively stable at this point and not require much refactoring.
Since the last post, I've introduced the concept of a Fault, an error class reserved for when we hit some kind of error during context switching or syscall handling. It is distinct from our top-level error type, LucidErr. When these faults occur, they are ultimately passed back to Lucid so that Lucid can deal with them. Right now, Lucid treats any Fault as fatal.
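To give a rough idea of the shape of this type, here is a minimal sketch; only the Success and BadLucidExit variants actually appear in the snippets in this post, everything else is an assumption:

// Sketch only: the real Fault enum in the Lucid repo has more/other variants
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Fault {
    Success,        // No fault occurred
    BadLucidExit,   // Context-switched out of Lucid for an unexpected reason
    // ... additional variants for syscall/context-switch errors
}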
We're able to pass these faults back up to Lucid because, before starting Bochs execution, we now save Lucid's state and context-switch into starting Bochs:
We make a couple of changes to the execution context, namely marking the execution mode (Lucid mode) and setting the reason we're context switching (to start Bochs). Then, in the inline assembly, we call the function pointer located at offset 0 of the execution-context structure:
So our Lucid state is saved in the context_switch routine, and then we're handed off to this logic:
Finally, we call jump_to_bochs:
This full-blown context switch is what lets us encounter a Fault and pass that error back to Lucid for handling. In fault_handler, we set the Fault type in the execution context and then attempt to resume execution back in Lucid:
As you can see, restoring Lucid's state and resuming execution is fairly involved. One tricky problem we have to deal with is that, right now, when a Fault occurs we may be running in Bochs mode, which means our stack is Bochs' stack and not Lucid's. So even though this is technically just another context switch, we have to change the order of operations a bit and pop Lucid's saved state back into place before continuing execution. Now, when Lucid calls a function that performs a context switch, it can check the "return" value of those functions simply by checking whether a Fault was recorded in the execution context, like so:
I thought that was pretty cool!
When I started this project I knew literally nothing about thread-local storage (TLS), except that it was some magic per-thread memory region that did... stuff. That's still roughly the extent of my knowledge, except that now I've read some of the code that allocates and initializes that memory, which gives me a better sense of what's actually going on. After implementing the Fault system discussed above, I noticed Lucid was segfaulting on exit. After some debugging, I found it was calling a function pointer that pointed to an invalid address. How could that happen? Digging further, I noticed that right before the call, the address was loaded from memory at an offset from the fs register. Typically, fs is used to access TLS. At that point I strongly suspected that Bochs had somehow clobbered my fs register value. A quick grep through Musl for code that touches fs turned up the following:
So this function, __set_thread_area, uses an inline syscall instruction to call arch_prctl and directly manipulate the fs register. That made perfect sense: if a raw syscall instruction is executed, we don't intercept it with our syscall-sandboxing infrastructure because we never instrumented it; we only instrumented the guts of Musl's syscall() wrapper functions. So this escapes our sandbox and manipulates fs directly. Sure enough, I found that this function is called during TLS initialization in src/env/__init_tls.c:
In __init_tp we receive a pointer, call the TP_ADJ macro to do some arithmetic on it, and pass that value to __set_thread_area so that fs can be set. So how do we sandbox this? I wanted to avoid touching the inline assembly in __set_thread_area itself, so I just changed the source so that Musl uses the syscall() wrapper function, which under the hood goes through our instrumented syscall functions, like so:
Now we can intercept this syscall in Lucid and simply do nothing with it. As long as there are no other direct accesses to fs (though there may well be!), we should be fine. I also tweaked the Musl code so that, if we're running under Lucid, we can supply a TLS region via the execution context, simply by creating a mock of what Musl calls builtin_tls:
So now, when __init_tp is called, the pointer it receives points to our own block of TLS memory created in the execution context, and we can access things like errno from Lucid:
For example, if a NULL buffer is passed to us during a read syscall, we can now return an error code from the syscall handler in Lucid and set errno appropriately:
There may still be other accesses to fs and gs that we haven't sandboxed yet, but we simply haven't developed far enough to run into them.
I put off building and loading Bochs for a long time because I wanted the context-switching and syscall-sandboxing foundation in place first. I was also worried it would be difficult, since building vanilla Bochs as --static-pie had already been hard for me. Complicating the build further, we need to build Bochs against our custom Musl, which means we need a compiler that ignores the normal standard C library and uses our custom Musl libc instead. That proved quite tedious and difficult for me. And once I got it working, I realized it still wasn't enough: Bochs is a C++ codebase, so it also needs access to the standard C++ library. Doing what we did for the test program wouldn't work, because we don't have a C++ library built against our custom Musl lying around.
Luckily, there is a great project called musl-cross-make that aims to help people build their own Musl toolchains from scratch. This is perfect for us, because a complete toolchain is exactly what we need: we need C++ standard library support, and it has to be built against our custom Musl. For that we use the GNU C++ library, libstdc++, which is part of the gcc project.
musl-cross-make downloads all of the components that make up a toolchain and builds one from scratch, one that uses a Musl libc and a libstdc++ built against that Musl. All that's left for us to do is recompile the Musl libc with our custom Lucid patches and then use the toolchain to compile Bochs as --static-pie. In practice, that turned out to be straightforward.
This is the configuration script I used to build Bochs:
That was enough to get a Bochs binary I could start testing with. We'll probably need to change this configuration down the road, but for now it works. There should be more detailed build instructions in the repository, along with an already-built Bochs binary.
Now that we have Bochs loaded, executing, and sandboxed from syscalls, we need to implement a few new syscalls, such as brk, mmap, and munmap. Our test program was very simple, so we hadn't run into these yet.
All three of these syscalls manipulate memory in some way, so I decided we need some kind of memory-management unit (MMU). To keep things as simple as possible, I decided that, at least for now, we won't worry about freeing, reusing, or unmapping memory. We'll simply pre-allocate two pools of memory, one for brk calls and one for mmap calls. We can also hang the MMU structure off the execution context so that it's always reachable during syscalls and context switches.
So far, Bochs really only cares about mapping memory that is readable and writable, which works in our favor complexity-wise. To pre-allocate the pools, we make one fairly large mmap call of our own while setting up the MMU as part of the execution context's initialization routine:
Handling the memory-management syscalls turned out not to be too difficult, and although there were some gotchas early on, we got everything working fairly quickly.
brk is a syscall used to increase the size of a program's data segment. A typical pattern is for the program to call brk(0), which returns the current program-break address, and then, if it wants, say, two more pages of memory, to call brk(base + 0x2000). You can see this pattern in strace output from Bochs.
In our syscall handler, the logic for brk is really just a thin wrapper around the update_brk method we implemented for Mmu, so let's take a look at that:
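The authoritative version lives in the repository; what follows is a minimal sketch of update_brk based on the description below, assuming the Mmu fields shown in the listing at the end of this post (what to do when the sanity check fails is a guess):

// Sketch, not the repo's exact code: adjust the pre-allocated brk pool
impl Mmu {
    pub fn update_brk(&mut self, addr: usize) -> usize {
        // brk(0) (or any NULL argument): just report the current break
        if addr == 0 {
            return self.curr_brk;
        }

        // Sanity-check that the request stays inside the pre-allocated pool
        let limit = self.brk_base + self.brk_size;
        if addr < self.brk_base || addr > limit {
            return self.curr_brk; // Reject: leave the break unchanged
        }

        // Adjust the current program break and hand it back to the caller
        self.curr_brk = addr;
        self.curr_brk
    }
}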
If we get a NULL argument in a1, we don't have to do anything; the current MMU state needs no adjustment, and we just return the current program break. If we get a non-NULL argument, we sanity-check that our brk memory pool is large enough to satisfy the request, and if so, we adjust the current program break and return it to the caller. Keep in mind this is dead simple because we've already pre-allocated all of the memory, so there's really not much to do beyond adjusting the offset that marks which memory is valid.
mmap is slightly more involved, but still easy to follow. For mmap calls we need to track more state, because essentially an "allocation" takes place that we need to remember. Most mmap calls will pass a NULL address argument because they don't care where the mapping ends up in virtual memory, and in that case we default to the main method we implemented for Mmu, do_mmap:
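Again, the real method is in the repository; this is just a minimal sketch based on the description that follows, and the page rounding, flag checks, and failure handling are assumptions:

// Sketch of a do_mmap-style bump allocation out of the pre-mapped mmap pool
impl Mmu {
    pub fn do_mmap(&mut self, len: usize, prot: i32, flags: i32) -> Option<usize> {
        // Only anonymous, private, read/write mappings are expected for now
        let expected_prot  = libc::PROT_READ | libc::PROT_WRITE;
        let expected_flags = libc::MAP_ANONYMOUS | libc::MAP_PRIVATE;
        if prot != expected_prot || (flags & expected_flags) != expected_flags {
            return None;
        }

        // Round the requested length up to a page boundary
        let size = (len + 0xFFF) & !0xFFF;

        // Sanity-check that the mmap pool still has capacity for this request
        let end = self.mmap_base + self.mmap_size;
        if self.next_mmap + size > end {
            return None;
        }

        // "Allocate": remember the base we hand out and bump the next offset
        self.curr_mmap = self.next_mmap;
        self.next_mmap = self.curr_mmap + size;
        Some(self.curr_mmap)
    }
}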
Pretty simple: we do some sanity checks to make sure our mmap memory pool has enough capacity to satisfy the allocation, we check that the other arguments are what we expect, and then we simply update the current and next offsets. That way we know where to allocate from next time, and we can return the base of the current allocation to the caller.
There is also the case where mmap is called with a non-NULL address and the MAP_FIXED flag, meaning the address matters to the caller and the mapping should be placed at the supplied virtual address. Right now, this happens early in the Bochs process.
For this special case we don't actually have to do anything, because the address lies within the brk memory pool. We already know about that memory and have already created it, so that last mmap call is effectively a NOP for us; we just return the address to the caller.
For now, we don't support MAP_FIXED calls outside of the brk memory pool.
For munmap, we likewise treat the operation as a NOP and return success to the caller, since we don't yet care about freeing or reusing memory.
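As a small sketch of the NOP-style handling described in the last few paragraphs (the function names and signatures here are invented for illustration):

// Sketch: MAP_FIXED inside the brk pool is an echo, munmap is a success NOP
fn handle_fixed_mmap(mmu: &Mmu, addr: usize, len: usize) -> Option<usize> {
    // We already created the brk pool, so there is nothing to do but hand
    // the requested address straight back
    if addr >= mmu.brk_base && addr + len <= mmu.brk_base + mmu.brk_size {
        Some(addr)
    } else {
        None // MAP_FIXED outside the brk pool isn't supported yet
    }
}

fn handle_munmap(_addr: usize, _len: usize) -> u64 {
    0 // We never free or reuse memory right now, so just report success
}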
You can see that Bochs makes quite a few brk and mmap calls, and our fuzzer is now able to handle all of them through our MMU.
With the MMU sorted out, we needed a way to do file input and output, since Bochs was now trying to open its configuration file.
The approach I've taken for now is to read the contents of the files we need ahead of time and store them in memory when initializing Bochs' execution context. This has some advantages: I can imagine that in the future, when we're actually fuzzing something, Bochs may need to do file I/O against a disk-image file or similar, and it will be handy to have that file already read into memory and waiting to be used. Faking the file I/O syscalls becomes very simple, since all we really have to maintain is some metadata and the file contents themselves:
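As a rough sketch of those structures: FileTable, File, and the contents buffer are names used in this post, while the remaining fields and the lookup helper are assumptions:

// Sketch of the file-tracking structures; not the repo's exact layout
#[derive(Clone)]
pub struct File {
    pub fd: i32,            // File descriptor we handed out to Bochs
    pub path: String,       // Path Bochs asked to open
    pub contents: Vec<u8>,  // Entire file contents, read in during init
    pub cursor: usize,      // Current offset into contents
}

#[derive(Clone)]
pub struct FileTable {
    pub files: Vec<File>,
}

impl FileTable {
    // Look up an open file by the fd Bochs passed to read()
    pub fn get_file_mut(&mut self, fd: i32) -> Option<&mut File> {
        self.files.iter_mut().find(|f| f.fd == fd)
    }
}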
So when Bochs asks to read from a file and hands us a file descriptor (fd), we just check the FileTable for the matching file, read from its File::contents buffer, and update a cursor member to keep track of our current offset into the file.
The open call basically just acts as a sanity check, making sure we know what Bochs is trying to access:
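A minimal sketch of that check, reusing the hypothetical FileTable sketch above:

// Sketch: only hand back fds for files we pre-loaded during initialization
fn handle_open(table: &FileTable, path: &str) -> Result<i32, ()> {
    match table.files.iter().find(|f| f.path == path) {
        Some(file) => Ok(file.fd),  // We know this file, return its fd
        None => Err(()),            // Bochs asked for something unexpected
    }
}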
And that's about it for file I/O at the moment. Down the road, when we take and reset snapshots, we'll have to keep this in mind, because file state will need to be restored differentially, but that's a problem for later.
Fuzzer development continues, and I'm still really enjoying implementing it; special thanks to everyone mentioned in the repository for their help! Next we'll have to pick a fuzzing target and get it running in Bochs. We'll need to pare down the system Bochs emulates so that it runs only our target program, so we can snapshot and fuzz it appropriately. That should be a lot of fun, stay tuned!
This article was translated using ChatGPT-4o; corrections are welcome.
Original post: https://h0mbre.github.io/Loading_Bochs/

The code listings referenced throughout the post follow below, in order.
static __inline long __syscall6(long n, long a1, long a2, long a3, long a4,
                                long a5, long a6)
{
    unsigned long ret;
    register long r10 __asm__("r10") = a4;
    register long r8 __asm__("r8") = a5;
    register long r9 __asm__("r9") = a6;
    __asm__ __volatile__ ("syscall" : "=a"(ret) : "a"(n), "D"(a1), "S"(a2),
                          "d"(a3), "r"(r10), "r"(r8), "r"(r9)
                          : "rcx", "r11", "memory");
    return ret;
}
static __inline long __syscall6_original(long n, long a1, long a2, long a3,
                                         long a4, long a5, long a6)
{
    unsigned long ret;
    register long r10 __asm__("r10") = a4;
    register long r8 __asm__("r8") = a5;
    register long r9 __asm__("r9") = a6;
    __asm__ __volatile__ ("syscall" : "=a"(ret) : "a"(n), "D"(a1), "S"(a2),
                          "d"(a3), "r"(r10), "r"(r8), "r"(r9)
                          : "rcx", "r11", "memory");
    return ret;
}

static __inline long __syscall6(long n, long a1, long a2, long a3, long a4,
                                long a5, long a6)
{
    if (!g_lucid_ctx) { return __syscall6_original(n, a1, a2, a3, a4, a5, a6); }

    register long ret;
    register long r12 __asm__("r12") = (size_t)(g_lucid_ctx->exit_handler);
    register long r13 __asm__("r13") = (size_t)(&g_lucid_ctx->register_bank);
    register long r14 __asm__("r14") = SYSCALL;
    register long r15 __asm__("r15") = (size_t)(g_lucid_ctx);

    __asm__ __volatile__ (
        "mov %1, %%rax\n\t"
        "mov %2, %%rdi\n\t"
        "mov %3, %%rsi\n\t"
        "mov %4, %%rdx\n\t"
        "mov %5, %%r10\n\t"
        "mov %6, %%r8\n\t"
        "mov %7, %%r9\n\t"
        "call *%%r12\n\t"
        "mov %%rax, %0\n\t"
        : "=r" (ret)
        : "r" (n), "r" (a1), "r" (a2), "r" (a3), "r" (a4), "r" (a5), "r" (a6),
          "r" (r12), "r" (r13), "r" (r14), "r" (r15)
        : "rax", "rcx", "r11", "memory"
    );

    return ret;
}
static __inline long __syscall6(long n, long a1, long a2, long a3, long a4,
                                long a5, long a6)
{
    if (g_lucid_syscall)
        return g_lucid_syscall(g_lucid_ctx, n, a1, a2, a3, a4, a5, a6);

    unsigned long ret;
    register long r10 __asm__("r10") = a4;
    register long r8 __asm__("r8") = a5;
    register long r9 __asm__("r9") = a6;
    __asm__ __volatile__ ("syscall" : "=a"(ret) : "a"(n), "D"(a1), "S"(a2),
                          "d"(a3), "r"(r10), "r"(r8), "r"(r9)
                          : "rcx", "r11", "memory");
    return ret;
}
pub extern "C" fn lucid_syscall(contextp: *mut LucidContext, n: usize,
    a1: usize, a2: usize, a3: usize, a4: usize, a5: usize, a6: usize)
    -> u64
extern "C" { fn context_switch(); }
global_asm!(
    ".global context_switch",
    "context_switch:",
    // Save the CPU flags before we do any operations
    "pushfq",
    // Save registers we use for scratch
    "push r14",
    "push r13",
    // Determine what execution mode we're in
    "mov r14, r15",
    "add r14, 0x8",     // mode is at offset 0x8 from base
    "mov r14, [r14]",
    "cmp r14d, 0x0",
    "je save_bochs",
    // We're in Lucid mode so save Lucid GPRs
    "save_lucid: ",
    "mov r14, r15",
    "add r14, 0x10",    // lucid_regs is at offset 0x10 from base
    "jmp save_gprs",
    // We're in Bochs mode so save Bochs GPRs
    "save_bochs: ",
    "mov r14, r15",
    "add r14, 0x90",    // bochs_regs is at offset 0x90 from base
    "jmp save_gprs",
#[inline(never)]
pub fn start_bochs(context: &mut LucidContext) {
    // Set the execution mode and the reason why we're exiting the Lucid VM
    context.mode = ExecMode::Lucid;
    context.exit_reason = VmExit::StartBochs;

    // Set up the calling convention and then start Bochs by context switching
    unsafe {
        asm!(
            "push r15",             // Callee-saved register we have to preserve
            "mov r15, {0}",         // Move context into R15
            "call qword ptr [r15]", // Call context_switch
            "pop r15",              // Restore callee-saved register
            in(reg) context as *mut LucidContext,
        );
    }
}
// Execution context that is passed between Lucid and Bochs that tracks
// all of the mutable state information we need to do context-switching
#[repr(C)]
#[derive(Clone)]
pub struct LucidContext {
    pub context_switch: usize,  // Address of context_switch()
// Handle Lucid context switches here
if LucidContext::is_lucid_mode(context) {
    match exit_reason {
        // Dispatch to Bochs entry point
        VmExit::StartBochs => {
            jump_to_bochs(context);
        },
        _ => {
            fault!(context, Fault::BadLucidExit);
        }
    }
}
// Standalone function to literally jump to Bochs entry and provide the stack
// address to Bochs
fn jump_to_bochs(context: *mut LucidContext) {
    // RDX: we have to clear this register as the ABI specifies that exit
    // hooks are set when rdx is non-null at program start
    //
    // RAX: arbitrarily used as a jump target to the program entry
    //
    // RSP: Rust does not allow you to use 'rsp' explicitly with in(), so we
    // have to manually set it with a `mov`
    //
    // R15: holds a pointer to the execution context, if this value is non-
    // null, then Bochs learns at start time that it is running under Lucid
    //
    // We don't really care about execution order as long as we specify clobbers
    // with out/lateout, that way the compiler doesn't allocate a register we
    // then immediately clobber
    unsafe {
        asm!(
            "xor rdx, rdx",
            "mov rsp, {0}",
            "mov r15, {1}",
            "jmp rax",
            in(reg) (*context).bochs_rsp,
            in(reg) context,
            in("rax") (*context).bochs_entry,
            lateout("rax") _,   // Clobber (inout so no conflict with in)
            out("rdx") _,       // Clobber
            out("r15") _,       // Clobber
        );
    }
}
// Where we handle faults that may occur when context-switching from Bochs. We
// just want to make the fault visible to Lucid so we set it in the context,
// then we try to restore Lucid execution from its last-known good state
pub fn fault_handler(contextp: *mut LucidContext, fault: Fault) {
    let context = unsafe { &mut *contextp };
    match fault {
        Fault::Success => context.fault = Fault::Success,
        ...
    }

    // Attempt to restore Lucid execution
    restore_lucid_execution(contextp);
}
// We use this function to restore Lucid execution to its last known good state
// This is just really trying to plumb up a fault to a level that is capable of
// discerning what action to take. Right now, we probably just call it fatal.
// We don't really deal with double-faults, it doesn't make much sense at the
// moment when a single-fault will likely be fatal already. Maybe later?
fn restore_lucid_execution(contextp: *mut LucidContext) {
    let context = unsafe { &mut *contextp };

    // Fault should be set, but change the execution mode now since we're
    // jumping back to Lucid
    context.mode = ExecMode::Lucid;

    // Restore extended state
    let save_area = context.lucid_save_area;
    let save_inst = context.save_inst;
    match save_inst {
        SaveInst::XSave64 => {
            // Retrieve XCR0 value, this will serve as our save mask
            let xcr0 = unsafe { _xgetbv(0) };

            // Call xrstor to restore the extended state from Bochs save area
            unsafe { _xrstor64(save_area as *const u8, xcr0); }
        },
        SaveInst::FxSave64 => {
            // Call fxrstor to restore the extended state from Bochs save area
            unsafe { _fxrstor64(save_area as *const u8); }
        },
        _ => (), // NoSave
    }

    // Next, we need to restore our GPRs. This is kind of different order than
    // returning from a successful context switch since normally we'd still be
    // using our own stack; however right now, we still have Bochs' stack, so
    // we need to recover our own Lucid stack which is saved as RSP in our
    // register bank
    let lucid_regsp = &context.lucid_regs as *const _;

    // Move that pointer into R14 and restore our GPRs. After that we have the
    // RSP value that we saved when we called into context_switch, this RSP was
    // then subtracted from by 0x8 for the pushfq operation that comes right
    // after. So in order to recover our CPU flags, we need to manually sub
    // 0x8 from the stack pointer. Pop the CPU flags back into place, and then
    // return to the last known good Lucid state
    unsafe {
        asm!(
            "mov r14, {0}",
            "mov rax, [r14 + 0x0]",
            "mov rbx, [r14 + 0x8]",
            "mov rcx, [r14 + 0x10]",
            "mov rdx, [r14 + 0x18]",
            "mov rsi, [r14 + 0x20]",
            "mov rdi, [r14 + 0x28]",
            "mov rbp, [r14 + 0x30]",
            "mov rsp, [r14 + 0x38]",
            "mov r8, [r14 + 0x40]",
            "mov r9, [r14 + 0x48]",
            "mov r10, [r14 + 0x50]",
            "mov r11, [r14 + 0x58]",
            "mov r12, [r14 + 0x60]",
            "mov r13, [r14 + 0x68]",
            "mov r15, [r14 + 0x78]",
            "mov r14, [r14 + 0x70]",
            "sub rsp, 0x8",
            "popfq",
            "ret",
            in(reg) lucid_regsp,
        );
    }
}
// Start executing Bochs
prompt!("Starting Bochs...");
start_bochs(&mut lucid_context);

// Check to see if any faults occurred during Bochs execution
if !matches!(lucid_context.fault, Fault::Success) {
    fatal!(LucidErr::from_fault(lucid_context.fault));
}
/* Copyright 2011-2012 Nicholas J. Kain, licensed under standard MIT license */
.text
.global __set_thread_area
.hidden __set_thread_area
.type __set_thread_area,@function
__set_thread_area:
    mov %rdi,%rsi           /* shift for syscall */
    movl $0x1002,%edi       /* SET_FS register */
    movl $158,%eax          /* set fs segment to */
    syscall                 /* arch_prctl(SET_FS, arg)*/
    ret
int __init_tp(void *p)
{
    pthread_t td = p;
    td->self = td;
    int r = __set_thread_area(TP_ADJ(p));
    if (r < 0) return -1;
    if (!r) libc.can_do_threads = 1;
    td->detach_state = DT_JOINABLE;
    td->tid = __syscall(SYS_set_tid_address, &__thread_list_lock);
    td->locale = &libc.global_locale;
    td->robust_list.head = &td->robust_list.head;
    td->sysinfo = __sysinfo;
    td->next = td->prev = td;
    return 0;
}
#ifndef ARCH_SET_FS
#define ARCH_SET_FS 0x1002
#endif /* ARCH_SET_FS */

int __init_tp(void *p)
{
    pthread_t td = p;
    td->self = td;
    int r = syscall(SYS_arch_prctl, ARCH_SET_FS, TP_ADJ(p));
    //int r = __set_thread_area(TP_ADJ(p));
static struct builtin_tls {
    char c;
    struct pthread pt;
    void *space[16];
} builtin_tls[1];
    if (libc.tls_size > sizeof builtin_tls) {
#ifndef SYS_mmap2
#define SYS_mmap2 SYS_mmap
#endif
        __asm__ __volatile__ ("int3"); // Added by me just in case
        mem = (void *)__syscall(
            SYS_mmap2,
            0, libc.tls_size, PROT_READ|PROT_WRITE,
            MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
        /* -4095...-1 cast to void * will crash on dereference anyway,
         * so don't bloat the init code checking for error codes and
         * explicitly calling a_crash(). */
    } else {
        // Check to see if we're running under Lucid or not
        if (!g_lucid_ctx) { mem = builtin_tls; }
        else { mem = &g_lucid_ctx->tls; }
    }

    /* Failure to initialize thread pointer is always fatal. */
    if (__init_tp(__copy_tls(mem)) < 0)
        a_crash();
#[repr(C)]
#[derive(Clone)]
pub struct Tls {
    padding0: [u8; 8],      // char c
    padding1: [u8; 52],     // Padding to offset of errno which is 52-bytes
    pub errno: i32,
    padding2: [u8; 144],    // Additional padding to get to 200-bytes total
    padding3: [u8; 128],    // 16 void * values
}
// Now we need to make sure the buffer passed to read isn't NULL
let buf_p = a2 as *mut u8;
if buf_p.is_null() {
    context.tls.errno = libc::EINVAL;
    return -1_i64 as u64;
}
#!/bin/sh

CC="/home/h0mbre/musl_stuff/musl-cross-make/output/bin/x86_64-linux-musl-gcc"
CXX="/home/h0mbre/musl_stuff/musl-cross-make/output/bin/x86_64-linux-musl-g++"
CFLAGS="-Wall --static-pie -fPIE"
CXXFLAGS="$CFLAGS"

export CC
export CXX
export CFLAGS
export CXXFLAGS

./configure --enable-sb16 \
            --enable-all-optimizations \
            --enable-long-phy-address \
            --enable-a20-pin \
            --enable-cpu-level=6 \
            --enable-x86-64 \
            --enable-vmx=2 \
            --enable-pci \
            --enable-usb \
            --enable-usb-ohci \
            --enable-usb-ehci \
            --enable-usb-xhci \
            --enable-busmouse \
            --enable-e1000 \
            --enable-show-ips \
            --enable-avx \
            --with-nogui
// Structure to track memory usage in Bochs
#[derive(Clone)]
pub struct Mmu {
    pub brk_base: usize,        // Base address of brk region, never changes
    pub brk_size: usize,        // Size of the program break region
    pub curr_brk: usize,        // The current program break

    pub mmap_base: usize,       // Base address of the `mmap` pool
    pub mmap_size: usize,       // Size of the `mmap` pool
    pub curr_mmap: usize,       // The current `mmap` page base
    pub next_mmap: usize,       // The next allocation base address
}

impl Mmu {
    pub fn new() -> Result<Self, LucidErr> {
        // We don't care where it's mapped
        let addr = std::ptr::null_mut::<libc::c_void>();

        // Straight-forward
        let length = (DEFAULT_BRK_SIZE + DEFAULT_MMAP_SIZE) as libc::size_t;

        // This is normal
        let prot = libc::PROT_WRITE | libc::PROT_READ;

        // This might change at some point?
        let flags = libc::MAP_ANONYMOUS | libc::MAP_PRIVATE;

        // No file backing
        let fd = -1 as libc::c_int;

        // No offset
        let offset = 0 as libc::off_t;

        // Try to `mmap` this block
        let result = unsafe {
            libc::mmap(
                addr,
                length,
                prot,
                flags,
                fd,
                offset
            )
        };

        if result == libc::MAP_FAILED {
            return Err(LucidErr::from("Failed `mmap` memory for MMU"));
        }

        // Create MMU
        Ok(Mmu {
            brk_base: result as usize,
            brk_size: DEFAULT_BRK_SIZE,
            curr_brk: result as usize,
            mmap_base: result as usize + DEFAULT_BRK_SIZE,
            mmap_size: DEFAULT_MMAP_SIZE,
            curr_mmap: result as usize + DEFAULT_BRK_SIZE,
            next_mmap: result as usize + DEFAULT_BRK_SIZE,
        })
    }