A Few Things About AFL -- Source Code Analysis 1

Kanxue Security Community (kanxue.com), Binary Vulnerabilities section

Author: 有毒
Posted: 2021-9-27 16:48

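Preface

Analyzing the AFL source code in depth is a great help in understanding AFL's design philosophy and the techniques it uses, and it gives solid guidance for building a customized fuzzer later on. Reading the AFL source is therefore an essential step in learning AFL.

Given the size of the AFL code base, the source analysis is split across several installments.

When everyone else wants to go fast, you should slow down.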

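Overview

First, a high-level look at the structure of the AFL source tree:

[Figure: MetricsTreemap-AFL]

Most of the code lives in afl-fuzz.c, followed by several files that each implement a standalone feature. llvm_mode and qemu_mode contain roughly the same amount of code. The analysis should therefore focus on the core features implemented in AFL's root directory, and above all on afl-fuzz.c.

Main functions and roles of each module:

Instrumentation modules

Fuzzer module

afl-fuzz.c: the core implementation of the fuzzer and the main body of AFL.

Other auxiliary modules

Notes on selected header files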

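I. AFL's gcc: afl-gcc.c

1. Overview

afl-gcc is a wrapper around GCC or clang. The usual way to use it is to pass the path to afl-gcc or afl-clang through CC when calling ./configure. (For C++ code, use CXX and point it at afl-g++ / afl-clang++.) afl-clang, afl-clang++, and afl-g++ are all symbolic links to afl-gcc.

The main purpose of afl-gcc is to get key points of the code instrumented at the assembly level, so that information such as the execution path is recorded and the behavior of the running program can be fed back to the fuzzer.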

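2. Source code

1. Key variables

Before analyzing the functions, a few key variables need to be clear:

2. The main function

The complete logic of main is as follows:

It mainly consists of three function calls: find_as(), edit_params(), and execvp().

[Figure: 20210825115404]

Here some code was added to print the arguments that are passed in, arg[0] through arg[7]. Some of them are the arguments we specified; the rest are compiler options added automatically (briefly covered in the instrumentation part of the earlier article on AFL's principles).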

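3. The find_as function

Core purpose: locate afl-as.

/* Try to find our "fake" GNU assembler in AFL_PATH or at the location derived from argv[0]. If that fails, abort. */

The rough flow inside the function is as follows (the control-flow graph was generated automatically by a tool and contains some inaccuracies, but the key logic is correct):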
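The lookup order named in the comment above can be sketched in C as follows. This is a simplified illustration rather than AFL's own code: `find_as_dir` and its parameters are hypothetical names, `afl_path_env` stands in for the value of the AFL_PATH environment variable, and the probed file names ("as" under AFL_PATH, "afl-as" next to the wrapper) are assumptions about how the wrapper is installed. The real find_as() aborts when nothing is found.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes the directory that holds the assembler
   wrapper into out and returns 1, or returns 0 if nothing was found.
   afl_path_env stands in for getenv("AFL_PATH") and may be NULL. */
int find_as_dir(const char *afl_path_env, const char *argv0,
                char *out, size_t out_len) {
  char probe[4096];

  /* 1. Directory named by AFL_PATH, if it contains an executable "as". */
  if (afl_path_env) {
    snprintf(probe, sizeof(probe), "%s/as", afl_path_env);
    if (access(probe, X_OK) == 0) {
      snprintf(out, out_len, "%s", afl_path_env);
      return 1;
    }
  }

  /* 2. Directory part of argv[0], if it contains an executable "afl-as". */
  const char *slash = strrchr(argv0, '/');
  if (slash) {
    int dir_len = (int)(slash - argv0);
    snprintf(probe, sizeof(probe), "%.*s/afl-as", dir_len, argv0);
    if (access(probe, X_OK) == 0) {
      snprintf(out, out_len, "%.*s", dir_len, argv0);
      return 1;
    }
  }

  return 0; /* the real find_as() would abort at this point */
}
```

A caller would pass `getenv("AFL_PATH")` and `argv[0]` and treat a return value of 0 as a fatal error.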

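4. The edit_params function

Core purpose: copy argv into u8 **cc_params and make the necessary edits along the way.

/* Copy argv to cc_params, making the necessary edits. */

The rough flow inside the function is as follows:

- Call ck_alloc() to allocate (argc + 128) * sizeof(u8*) bytes for cc_params, that is, 8 bytes per entry on a 64-bit build.
- Check whether argv[0] contains a /. If it does not, name = argv[0]; if it does, find the last / and assign the string after it to name.
- Compare the first 9 characters of name with the fixed string afl-clang:
  - If they match, set clang_mode = 1 and set the environment variable CLANG_ENV_VAR to 1.
  - If they do not match, Apple platforms enter the #ifdef __APPLE__ block, where name is compared against the known wrapper names and cc_params[0] is assigned from an environment variable via getenv(). On other platforms, name is compared with the fixed string afl-g++ (the handling of the Java toolchain is ignored here):
    - If they match, read the environment variable AFL_CXX; if it is set, assign its value to cc_params[0], otherwise assign g++.
    - If they do not match, read the environment variable AFL_CC; if it is set, assign its value to cc_params[0], otherwise assign gcc.
- Enter a while loop that walks the arguments starting at argv[1]:
  - -B sets the compiler's search path and is skipped, because as_path has already been handled before this point.
  - -integrated-as is skipped.
  - -pipe is skipped.
  - -fsanitize=address or -fsanitize=memory tells the compiler to check for memory access errors such as out-of-bounds array accesses; set asan_set = 1.
  - FORTIFY_SOURCE sets fortify_set = 1. FORTIFY_SOURCE mainly checks for buffer overflows in common functions such as memcpy, mempcpy, memmove, memset, strcpy, stpcpy, strncpy, strcat, strncat, sprintf, vsprintf, snprintf, and gets.
  - Every argument that was not skipped is appended to cc_params: cc_params[cc_par_cnt++] = cur;
- After leaving the while loop, set the remaining parameters:
  - The sanitizer-related options are decided by several if checks:
    - If AFL_USE_ASAN is not set but AFL_USE_MSAN is, add -fsanitize=memory. (AFL_USE_ASAN and AFL_USE_MSAN cannot be specified together, nor can AFL_USE_MSAN and AFL_HARDEN, because the program would then run too slowly.)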
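The name dispatch and the argument filtering described in the steps above can be illustrated with a small C sketch. It is a simplified model, not the code from afl-gcc.c: `wrapper_choice`, `choose_compiler`, and `filter_arg` are hypothetical names, the environment overrides (AFL_CC, AFL_CXX) and the Apple and Java branches are left out, and the handling of a -B option whose path is given as a separate argument is not modeled.

```c
#include <string.h>

/* Result of inspecting the wrapper's own name (hypothetical struct). */
struct wrapper_choice {
  int clang_mode;        /* 1 when invoked as afl-clang or afl-clang++ */
  const char *compiler;  /* default real compiler to execute */
};

/* Strip any directory prefix from argv[0] and map the wrapper name to a
   default compiler. */
struct wrapper_choice choose_compiler(const char *argv0) {
  struct wrapper_choice c = {0, "gcc"};
  const char *slash = strrchr(argv0, '/');
  const char *name = slash ? slash + 1 : argv0;

  if (strncmp(name, "afl-clang", 9) == 0) {
    c.clang_mode = 1;
    c.compiler = strcmp(name, "afl-clang++") == 0 ? "clang++" : "clang";
  } else if (strcmp(name, "afl-g++") == 0) {
    c.compiler = "g++";
  }
  return c;
}

/* Returns 1 if the argument is dropped by the wrapper, 0 if it is copied
   through. Also records whether a sanitizer flag or FORTIFY_SOURCE was
   seen, mirroring asan_set and fortify_set. */
int filter_arg(const char *cur, int *asan_set, int *fortify_set) {
  if (strncmp(cur, "-B", 2) == 0) return 1;
  if (strcmp(cur, "-integrated-as") == 0) return 1;
  if (strcmp(cur, "-pipe") == 0) return 1;

  if (strcmp(cur, "-fsanitize=address") == 0 ||
      strcmp(cur, "-fsanitize=memory") == 0) *asan_set = 1;
  if (strstr(cur, "FORTIFY_SOURCE")) *fortify_set = 1;
  return 0;
}
```

Splitting the logic into two pure functions is only a choice made for this sketch; afl-gcc.c performs both steps inline inside edit_params().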

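II. AFL's source instrumentation: afl-as.c

1. Overview

afl-as is a wrapper around GNU as. Its sole purpose is to preprocess the assembly files generated by GCC / clang and inject the instrumentation code contained in afl-as.h. The toolchain invokes it automatically when a program is compiled with afl-gcc / afl-clang. Instrumenting hand-written assembly, whether in separate .s files or in __asm__ blocks, is explicitly not a goal of this wrapper.

experimental/clang_asm_normalize/ contains a workaround that may let clang users get hand-written assembly instrumented as well; there is no equivalent for GCC.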

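2. Source code

1. Key variables

Before analyzing the functions, a few key variables need to be clear:

Note: if neither --32 nor --64 appears in the arguments, the mode that afl-as itself was compiled in is used by default.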
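The note above can be made concrete with a short C sketch. It is an illustration under assumptions, not the code from afl-as.c: `detect_64bit` is a hypothetical helper, and the compile-time default is approximated here by the pointer size of the build.

```c
#include <string.h>

/* Hypothetical helper: returns 1 for 64-bit instrumentation, 0 for 32-bit.
   The default mirrors the mode this sketch itself was compiled in. The
   last argument is skipped because it names the input file. */
int detect_64bit(int argc, char **argv) {
  int use_64bit = (sizeof(void *) == 8);

  for (int i = 1; i < argc - 1; i++) {
    if (strcmp(argv[i], "--64") == 0) use_64bit = 1;
    else if (strcmp(argv[i], "--32") == 0) use_64bit = 0;
  }
  return use_64bit;
}
```

The returned flag corresponds to use_64bit, which later selects between trampoline_fmt_64 and trampoline_fmt_32.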

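2. The main function

The complete logic of main is as follows:

The arguments that are actually executed can be printed by adding the following code to main:

After instrumentation completes, a .s file is generated with the following content (the exact file location depends on the environment variables that are set):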

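3. The add_instrumentation function

add_instrumentation processes the input file, produces modified_file, and inserts instrumentation at all appropriate places. Its overall control flow is as follows:

The overall logic looks somewhat complicated, but the essential parts are few. main enters this function after calling edit_params() to finish preparing the as_params argument array.

- Check whether input_file is set. If it is, try to open the file and assign the result to inf, failing with a fatal error otherwise; if input_file is empty, inf is set to standard input.
- Open modified_file and assign the file descriptor to outfd, failing with a fatal error otherwise; then verify that it can be opened for writing as the stream outf, again failing fatally if not.
- A while loop reads the file behind inf line by line into the line array, at most MAX_LINE = 8192 bytes per line (including the trailing '\0'), writes the contents of line to the file behind outf, and then enters the actual instrumentation logic. Note that instrumentation is inserted only into the .text section:
  - First, labels, macros, and comments are skipped.
  - The explanation here refers to part of the key code. The variable instr_ok is essentially a flag that tells whether we are inside the .text section: 1 means inside .text, any other value means outside. When instr_ok is 1, the instrumentation logic runs at branches; otherwise nothing is instrumented.
  - First check whether the line starts with '\t' and whether line[1] is '.', which in effect matches the section directives declared in the .s file:
  - Next, several if checks set flags for off-flavor assembly, Intel/AT&T syntax blocks, ad-hoc __asm__ blocks, and so on.
  - The locations AFL cares most about are ^main, ^.L0, ^.LBB0_0, and ^\tjnz foo (the main function, the branch labels emitted by gcc and clang, and conditional jumps). These usually mark a change in control flow, so AFL concentrates its instrumentation there:
    - For instructions of the form \tj[^m]., that is, conditional jumps, when the random number produced by R(100) is smaller than the instrumentation ratio inst_ratio, fprintf writes trampoline_fmt_64 or trampoline_fmt_32 (the instrumentation stub) to the file behind outf. The ID embedded in the stub is a random number below MAP_SIZE, produced by R(MAP_SIZE). The counter ins_lines is then incremented and continue moves on to the next line.
    - Labels need their own evaluation, because some of them may be branch destinations:
      - First check whether the line contains ':', then whether it starts with '.'.
      - If it starts with '.', the line is a candidate branch label such as ^.L0: or ^.LBB0_0:, that is, a ".L0: or LBB0_0: style jump destination".
- After the steps above, control reaches the next iteration of the while loop. At the top of the loop, the locations marked for deferred-mode instrumentation are actually instrumented:
- The check requires instr_ok and instrument_next to be 1, meaning that we are inside .text and a deferred instrumentation is pending. If so, the instrumentation is performed by writing trampoline_fmt_64/32.

At this point the main logic of add_instrumentation has been covered.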
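The per-line decisions walked through above can be condensed into a few predicates. The sketch below is a reduced model for illustration, not the code of add_instrumentation: the helper names are hypothetical, only a subset of the section directives is covered, and only the non-Apple GCC label convention is modeled.

```c
#include <string.h>

/* Section tracking: returns the new value of instr_ok after seeing line.
   Only a subset of the directives checked by afl-as is covered here. */
int update_instr_ok(const char *line, int instr_ok) {
  if (line[0] != '\t' || line[1] != '.') return instr_ok;

  if (strncmp(line + 2, "text\n", 5) == 0 ||
      strncmp(line + 2, "section\t.text", 13) == 0) return 1;

  if (strncmp(line + 2, "section\t", 8) == 0 ||
      strncmp(line + 2, "bss\n", 4) == 0 ||
      strncmp(line + 2, "data\n", 5) == 0) return 0;

  return instr_ok;
}

/* A conditional jump is a tab, then 'j', then anything but 'm', so that
   the unconditional jmp is excluded. */
int is_conditional_jump(const char *line) {
  return line[0] == '\t' && line[1] == 'j' && line[2] != 'm';
}

/* GCC-style branch label such as ".L3:" (non-Apple, non-clang case). */
int is_gcc_branch_label(const char *line) {
  return strchr(line, ':') != NULL && line[0] == '.' &&
         line[2] >= '0' && line[2] <= '9';
}
```

In afl-as these checks are additionally gated by the skip flags and by the R(100) < inst_ratio draw, which this sketch leaves out so that the predicates stay deterministic.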

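4. The edit_params function

edit_params mainly sets up as_params, together with use_64bit and modified_file. Its overall control flow is as follows:

- Read the environment variables TMPDIR and AFL_AS.
- Under the __APPLE__ macro, if clang_mode is active and AFL_AS is not set, set use_clang_as = 1 and set afl_as to the first available of AFL_CC, AFL_CXX, or clang.
- Set tmp_dir: if TMPDIR is not set, try TEMP and then TMP; if all of them fail, fall back to /tmp.
- Call ck_alloc() to allocate (argc + 32) * sizeof(u8*) bytes for the as_params array, that is, 8 bytes per entry on a 64-bit build.
- Set the assembler path: as_params[0] = afl_as ? afl_as : (u8*)"as";
- Set as_params[argc] = 0; as_par_cnt starts at 1.
- Iterate over argv[1] up to, but not including, argv[argc-1], recording a --64 or --32 option in use_64bit and copying the arguments into as_params.
- Then set the remaining parameters:
  - Under __APPLE__, if use_clang_as is set, append -c -x assembler.
  - Set input_file = argv[argc - 1]; so that the last argument becomes input_file.
  - If the first character of input_file is -: the argument --version sets just_version = 1; a lone - clears input_file so that standard input is used; anything else is rejected as incorrect usage.
  - If the first character of input_file is not -, compare input_file with tmp_dir, /var/tmp/, and /tmp/ over the first strlen(tmp_dir), 9, and 5 bytes respectively; if none of them match, set pass_thru to 1.
  - Set modified_file: `modified_file = alloc_printf("%s/.afl-%u-%u.s", tmp_dir, getpid(), (u32)time(NULL));`, which yields a string of the form `tmp_dir/.afl-<pid>-<time>.s`.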
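The last two steps, the pass-through check and the naming of the temporary file, can be sketched as follows. This is an illustrative model with hypothetical helper names (`should_pass_thru`, `make_modified_name`); the process ID and timestamp are passed in as parameters, instead of being read through getpid() and time(), purely to keep the example deterministic.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 when input_file does not live under a temp directory, which
   is the case in which the file is passed through without instrumentation. */
int should_pass_thru(const char *input_file, const char *tmp_dir) {
  return strncmp(input_file, tmp_dir, strlen(tmp_dir)) != 0 &&
         strncmp(input_file, "/var/tmp/", 9) != 0 &&
         strncmp(input_file, "/tmp/", 5) != 0;
}

/* Builds "<tmp_dir>/.afl-<pid>-<time>.s" into out. */
void make_modified_name(char *out, size_t out_len, const char *tmp_dir,
                        unsigned pid, unsigned now) {
  snprintf(out, out_len, "%s/.afl-%u-%u.s", tmp_dir, pid, now);
}
```

The temp-directory test works because the compiler driver normally hands the assembler a temporary .s file, whereas a file outside those directories is likely to be hand-written assembly.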

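3. instrumentation trampoline and main_payload

1. trampoline_fmt_64/32

The term "trampoline" is kept in English here instead of being translated literally, because the English word conveys its real meaning and role better. It can simply be understood as the instrumentation stub.

As shown earlier, AFL inserts trampoline_fmt_64 into the file in a 64-bit environment and trampoline_fmt_32 in a 32-bit environment. Both are defined in the afl-as.h header:

The instrumentation code listed above is the same as what we see in the .s file and in the IDA disassembly:

Stub code in the .s file:

Stub code shown in IDA:

The code above mainly does the following: it saves the registers it is about to clobber (rdx, rcx, and rax in the 64-bit version), loads the random ID of the current stub into rcx (ecx on 32-bit), calls __afl_maybe_log, and restores the saved registers.

Of these, __afl_maybe_log is the core.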
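To show how a trampoline format string turns into concrete assembly text, here is a minimal C sketch. `demo_trampoline_fmt` is a shortened stand-in for trampoline_fmt_64 and `render_trampoline` is a hypothetical helper; afl-as writes the full format directly to its output file with fprintf.

```c
#include <stdio.h>
#include <string.h>

/* Shortened stand-in for trampoline_fmt_64: "%%" becomes a literal '%'
   and "%08x" receives the ID of the instrumentation site. */
static const char *demo_trampoline_fmt =
  "movq $0x%08x, %%rcx\n"
  "call __afl_maybe_log\n";

/* Renders the demo trampoline for one instrumentation site. Returns the
   number of characters snprintf would have written. */
int render_trampoline(char *out, size_t out_len, unsigned block_id) {
  return snprintf(out, out_len, demo_trampoline_fmt, block_id);
}
```

Each site receives its own ID at assembly time, so the generated .s file contains a different constant in every stub.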

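2. __afl_maybe_log

Starting from __afl_maybe_log, the subsequent processing flow is roughly as follows (figure courtesy of ScUpax0s):

First, a brief description of the .bss variables involved in the flow above (64-bit case, extracted from main_payload_64):

The instruction sequences described below all come from main_payload_64.

First, the lahf instruction (load status flags into AH) copies the low eight bits of the EFLAGS register into AH. The copied flags are the sign flag (SF), zero flag (ZF), auxiliary carry flag (AF), parity flag (PF), and carry flag (CF). The instruction makes it easy to keep a copy of the flags.

Then seto %al records the overflow flag (OF), which lahf does not cover: AL is set to 1 if OF is set and to 0 otherwise.

Next, check whether shared memory has been set up by testing whether __afl_area_ptr is NULL. If it is NULL, jump to __afl_setup to initialize it; otherwise continue with __afl_store.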

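3. __afl_setup

This part initializes __afl_area_ptr, and the initialization is performed only when the first stub is reached.

First, if __afl_setup_failure is not 0, jump straight to __afl_return and return.

Then check whether the global pointer __afl_global_area_ptr is NULL. If it is, continue with __afl_setup_first; otherwise copy its value into __afl_area_ptr and jump to __afl_store.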

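4. __afl_setup_first

First, save the registers that the upcoming library calls may clobber, including the xmm registers.

Then align rsp.

Then read the environment variable __AFL_SHM_ID, which holds the ID of the shared memory segment, and attach the segment with shmat.

Next, store the shared memory address returned by _shmat in both __afl_area_ptr and __afl_global_area_ptr.

After that, __afl_forkserver starts running.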
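In C, the shared memory setup described above amounts to roughly the following. This is a sketch of the equivalent logic, not a translation of the assembly: `attach_afl_shm` is a hypothetical function name, and the register saving and stack alignment steps have no counterpart at this level.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Attaches the shared memory segment whose ID is stored in the
   __AFL_SHM_ID environment variable. Returns NULL when the variable is
   missing or shmat() fails. */
unsigned char *attach_afl_shm(void) {
  const char *id_str = getenv("__AFL_SHM_ID");
  if (!id_str) return NULL;

  void *area = shmat(atoi(id_str), NULL, 0);
  if (area == (void *)-1) return NULL;

  return (unsigned char *)area;
}
```

The returned pointer plays the role of __afl_area_ptr; when the program is run outside of afl-fuzz the variable is normally absent and the function returns NULL.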

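5. __afl_forkserver

The main job of this section is to write the 4 bytes held in __afl_temp to file descriptor FORKSRV_FD + 1 (that is, 198 + 1 = 199, the status pipe), which tells the fuzzer that the fork server (explained in detail in a later article) has started successfully.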
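The handshake can be sketched in C as a single write of four bytes. `forkserver_hello` is a hypothetical name, and the descriptor is passed as a parameter so that the sketch can be exercised with an ordinary pipe; in an instrumented binary it would be FORKSRV_FD + 1.

```c
#define _XOPEN_SOURCE 700
#include <unistd.h>

#define FORKSRV_FD 198  /* control pipe; FORKSRV_FD + 1 is the status pipe */

/* Sends the 4-byte startup message on status_fd. Returns 0 on success
   and -1 if the write does not transfer all four bytes. */
int forkserver_hello(int status_fd) {
  unsigned int tmp = 0; /* stands in for __afl_temp */
  return write(status_fd, &tmp, 4) == 4 ? 0 : -1;
}
```

A real caller would use `forkserver_hello(FORKSRV_FD + 1)`; the content of the four bytes does not matter, only the fact that they arrive.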

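8. __afl_store

Let's look directly at the decompiled code:

The a4 in the first XOR is in fact the argument passed in when __afl_maybe_log is called:

Tracing further back to the instrumentation code:

As we can see, the value passed in rcx is the random ID that marks the current stub, while _afl_prev_loc is derived from the random ID of the previous stub.

The first XOR combines the current ID with _afl_prev_loc, which gives the index into the shared memory map; the second XOR leaves the current ID in _afl_prev_loc. _afl_prev_loc is then shifted right by one bit to become the new _afl_prev_loc, and finally the counter stored at that index in shared memory is incremented by one.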
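The update described above can be written out in C. This is a sketch of the equivalent logic rather than the decompiled code: `afl_store` and its parameters are hypothetical names, and MAP_SIZE is assumed to be AFL's default of 65536.

```c
#include <stdint.h>

#define MAP_SIZE 65536  /* assumed default; stub IDs stay below this bound */

/* C rendering of the update performed in __afl_store. */
void afl_store(uint8_t *map, uint32_t *prev_loc, uint32_t cur_loc) {
  uint32_t idx = cur_loc ^ *prev_loc;  /* first XOR: index of this edge   */
  *prev_loc ^= idx;                    /* second XOR: now equals cur_loc  */
  *prev_loc >>= 1;                     /* shift so that A->B and B->A differ */
  map[idx]++;                          /* bump the hit counter for the edge */
}
```

Because both values stay below MAP_SIZE, their XOR is always a valid index into the map. The right shift keeps the two directions of an edge apart and prevents a block that jumps to itself from always landing on index 0.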

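III. Summary

This article analyzed the source code of the gcc wrapper and the instrumentation parts of AFL. The ingenuity and skill of AFL's designers are genuinely impressive; AFL fully deserves its reputation as the highly influential tool that opened a new era of fuzzing.

Sincere thanks to everyone who shares their knowledge and lets me stand on the shoulders of giants.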

Contents

Preface
Overview
I. AFL's gcc: afl-gcc.c
  1. Overview
  2. Source code
    1. Key variables
    2. The main function
    3. The find_as function
    4. The edit_params function
II. AFL's source instrumentation: afl-as.c
  1. Overview
  2. Source code
    1. Key variables
    2. The main function
    3. The add_instrumentation function
    4. The edit_params function
  3. instrumentation trampoline and main_payload
    1. trampoline_fmt_64/32
    2. __afl_maybe_log
    3. __afl_setup
    4. __afl_setup_first
    5. __afl_forkserver
    6. __afl_fork_wait_loop
    7. __afl_fork_resume
    8. __afl_store
III. Summary
References

Key variables in afl-gcc.c:

static u8*  as_path;                /* Path to the AFL 'as' wrapper      */
static u8** cc_params;              /* Parameters passed to the real CC  */
static u32  cc_par_cnt = 1;         /* Param count, including argv0      */
static u8   be_quiet,               /* Quiet mode                        */
            clang_mode;             /* Invoked as afl-clang*?            */

/* Data type notes:
   typedef uint8_t  u8;
   typedef uint16_t u16;
   typedef uint32_t u32; */

Key variables in afl-as.c:

static u8** as_params;          /* Parameters passed to the real 'as'   */

static u8*  input_file;         /* Originally specified input file      */
static u8*  modified_file;      /* Instrumented file for the real 'as'  */

static u8   be_quiet,           /* Quiet mode (no stderr output)        */
            clang_mode,         /* Running in clang mode?               */
            pass_thru,          /* Just pass data through?              */
            just_version,       /* Just show version?                   */
            sanitizer;          /* Using ASAN / MSAN                    */

static u32  inst_ratio = 100,   /* Instrumentation probability (%)      */
            as_par_cnt = 1;     /* Number of params to 'as'             */

Code added to main in afl-as.c to print the arguments that are actually executed:

printf("\n");

/* as_params is NULL-terminated, so walk it until the terminator;
   sizeof(as_params) would only give the size of the pointer. */
for (int i = 0; as_params[i]; i++) {

  printf("as_params[%d]:%s\n", i, as_params[i]);

}

Section detection in add_instrumentation (maintains instr_ok):

    if (line[0] == '\t' && line[1] == '.') {

      /* OpenBSD puts jump tables directly inline with the code, which is
         a bit annoying. They use a specific format of p2align directives
         around them, so we use that as a signal. */

      if (!clang_mode && instr_ok && !strncmp(line + 2, "p2align ", 8) &&
          isdigit(line[10]) && line[11] == '\n') skip_next_label = 1;

      if (!strncmp(line + 2, "text\n", 5) ||
          !strncmp(line + 2, "section\t.text", 13) ||
          !strncmp(line + 2, "section\t__TEXT,__text", 21) ||
          !strncmp(line + 2, "section __TEXT,__text", 21)) {
        instr_ok = 1;
        continue;
      }

      if (!strncmp(line + 2, "section\t", 8) ||
          !strncmp(line + 2, "section ", 8) ||
          !strncmp(line + 2, "bss\n", 4) ||
          !strncmp(line + 2, "data\n", 5)) {
        instr_ok = 0;
        continue;
      }

    }

Flags for off-flavor assembly, Intel/AT&T syntax, and ad-hoc __asm__ blocks in add_instrumentation:

/* Detect off-flavor assembly (rare, happens in gdb). When this is
   encountered, we set skip_csect until the opposite directive is
   seen, and we do not instrument. */

if (strstr(line, ".code")) {

  if (strstr(line, ".code32")) skip_csect = use_64bit;
  if (strstr(line, ".code64")) skip_csect = !use_64bit;

}

/* Detect syntax changes, as could happen with hand-written assembly.
   Skip Intel blocks, resume instrumentation when back to AT&T. */

if (strstr(line, ".intel_syntax")) skip_intel = 1;
if (strstr(line, ".att_syntax")) skip_intel = 0;

/* Detect and skip ad-hoc __asm__ blocks, likewise skipping them. */

if (line[0] == '#' || line[1] == '#') {

  if (strstr(line, "#APP")) skip_app = 1;
  if (strstr(line, "#NO_APP")) skip_app = 0;

}

Instrumentation of conditional branches in add_instrumentation:

/* If we're in the right mood for instrumenting, check for function
   names or conditional labels. This is a bit messy, but in essence,
   we want to catch:

     ^main:      - function entry point (always instrumented)
     ^.L0:       - GCC branch label
     ^.LBB0_0:   - clang branch label (but only in clang mode)
     ^\tjnz foo  - conditional branches

   ...but not:

     ^# BB#0:    - clang comments
     ^ # BB#0:   - ditto
     ^.Ltmp0:    - clang non-branch labels
     ^.LC0       - GCC non-branch labels
     ^.LBB0_0:   - ditto (when in GCC mode)
     ^\tjmp foo  - non-conditional jumps

   Additionally, clang and GCC on MacOS X follow a different convention
   with no leading dots on labels, hence the weird maze of #ifdefs
   later on.

 */

if (skip_intel || skip_app || skip_csect || !instr_ok ||
    line[0] == '#' || line[0] == ' ') continue;

/* Conditional branch instruction (jnz, etc). We append the instrumentation
   right after the branch (to instrument the not-taken path) and at the
   branch destination label (handled later on). */

if (line[0] == '\t') {

  if (line[1] == 'j' && line[2] != 'm' && R(100) < inst_ratio) {

    fprintf(outf, use_64bit ? trampoline_fmt_64 : trampoline_fmt_32,
            R(MAP_SIZE));

    ins_lines++;

  }

  continue;

}

Label handling in add_instrumentation (marks deferred instrumentation through instrument_next):

    /* Label of some sort. This may be a branch destination, but we need to
       tread carefully and account for several different formatting
       conventions. */

#ifdef __APPLE__

    /* Apple: L<whatever><digit>: */

    if ((colon_pos = strstr(line, ":"))) {

      if (line[0] == 'L' && isdigit(*(colon_pos - 1))) {

#else

    /* Everybody else: .L<whatever>: */

    if (strstr(line, ":")) {

      if (line[0] == '.') {

#endif /* __APPLE__ */

        /* .L0: or LBB0_0: style jump destination */

#ifdef __APPLE__

        /* Apple: L<num> / LBB<num> */

        if ((isdigit(line[1]) || (clang_mode && !strncmp(line, "LBB", 3)))
            && R(100) < inst_ratio) {

#else

        /* Apple: .L<num> / .LBB<num> */

        if ((isdigit(line[2]) || (clang_mode && !strncmp(line + 1, "LBB", 3)))
            && R(100) < inst_ratio) {

#endif /* __APPLE__ */

          /* An optimization is possible here by adding the code only if the
             label is mentioned in the code in contexts other than call / jmp.
             That said, this complicates the code by requiring two-pass
             processing (messy with stdin), and results in a speed gain
             typically under 10%, because compilers are generally pretty good
             about not generating spurious intra-function jumps.

             We use deferred output chiefly to avoid disrupting
             .Lfunc_begin0-style exception handling calculations (a problem on
             MacOS X). */

          if (!skip_next_label) instrument_next = 1; else skip_next_label = 0;

        }

      } else {

        /* Function label (always instrumented, deferred mode). */

        instrument_next = 1;

      }

    }

  }

Deferred instrumentation at the top of the while loop in add_instrumentation:

if (!pass_thru && !skip_intel && !skip_app && !skip_csect && instr_ok &&
    instrument_next && line[0] == '\t' && isalpha(line[1])) {

  fprintf(outf, use_64bit ? trampoline_fmt_64 : trampoline_fmt_32,
          R(MAP_SIZE));

  instrument_next = 0;
  ins_lines++;

}


Definitions of trampoline_fmt_32 and trampoline_fmt_64 in afl-as.h:

static const u8* trampoline_fmt_32 =

  "\n"
  "/* --- AFL TRAMPOLINE (32-BIT) --- */\n"
  "\n"
  ".align 4\n"
  "\n"
  "leal -16(%%esp), %%esp\n"
  "movl %%edi,  0(%%esp)\n"
  "movl %%edx,  4(%%esp)\n"
  "movl %%ecx,  8(%%esp)\n"
  "movl %%eax, 12(%%esp)\n"
  "movl $0x%08x, %%ecx\n"    // store the random stub ID identifying this code block in ecx
  "call __afl_maybe_log\n"   // call __afl_maybe_log
  "movl 12(%%esp), %%eax\n"
  "movl  8(%%esp), %%ecx\n"
  "movl  4(%%esp), %%edx\n"
  "movl  0(%%esp), %%edi\n"
  "leal 16(%%esp), %%esp\n"
  "\n"
  "/* --- END --- */\n"
  "\n";

static const u8* trampoline_fmt_64 =

  "\n"
  "/* --- AFL TRAMPOLINE (64-BIT) --- */\n"
  "\n"
  ".align 4\n"
  "\n"
  "leaq -(128+24)(%%rsp), %%rsp\n"
  "movq %%rdx,  0(%%rsp)\n"
  "movq %%rcx,  8(%%rsp)\n"
  "movq %%rax, 16(%%rsp)\n"
  "movq $0x%08x, %%rcx\n"  // the 64-bit version uses rcx for the ID
  "call __afl_maybe_log\n" // call __afl_maybe_log
  "movq 16(%%rsp), %%rax\n"
  "movq  8(%%rsp), %%rcx\n"
  "movq  0(%%rsp), %%rdx\n"
  "leaq (128+24)(%%rsp), %%rsp\n"
  "\n"
  "/* --- END --- */\n"
  "\n";

.bss variables declared in main_payload_64:

.AFL_VARS:

  .comm   __afl_area_ptr, 8
  .comm   __afl_prev_loc, 8
  .comm   __afl_fork_pid, 4
  .comm   __afl_temp, 4
  .comm   __afl_setup_failure, 1
  .comm    __afl_global_area_ptr, 8, 8

Entry of __afl_maybe_log in main_payload_64:

__afl_maybe_log:   /* source with unrelated parts removed */

  lahf
  seto  %al

  /* Check if SHM region is already mapped. */

  movq  __afl_area_ptr(%rip), %rdx
  testq %rdx, %rdx
  je    __afl_setup


Last edited by 有毒 on 2021-9-27 17:00.

Tags: automated vulnerability discovery, Fuzz, Linux


Latest replies (3)

pureGavin (#2, 2021-9-27 22:33): When 老毒 makes a move, you can tell right away that he is the real deal.

kanxue (#3, 2021-9-27 22:40): Impressive! Thanks for sharing!

有毒 (#4, 2021-9-28 10:02), replying to pureGavin: That line of yours is just too slick.
