[原创]syzkaller源码分析（三） executeHintSeed() Mutate() Generate()-二进制漏洞-看雪-安全社区|安全招聘|kanxue.com

[原创]syzkaller源码分析（三） executeHintSeed() Mutate() Generate()
发表于: 2021-6-24 20:00 9095
[原创]syzkaller源码分析（三） executeHintSeed() Mutate() Generate()

wx_左撇子
2021-6-24 20:00
9095
fuzz大致的流程在上一篇中已经做了说明，这一篇主要来介绍三个函数：executeHintSeed()、Mutate()和Generate()。
水平有限，有错误的话欢迎指正。自己看的时候花了一些时间，希望对以后也要看syzkaller代码的人能有些许帮助，不过也不知道有没有人能看到这里哈哈哈 :)。
前两篇的链接如下：
syz-manager：https://bbs.pediy.com/thread-268152.htm
syz-fuzzer：https://bbs.pediy.com/thread-268195.htm
md文件会附在最后。
参考：https://xz.aliyun.com/t/5401
https://xz.aliyun.com/t/5223
但是看的时候费了些功夫，就写在这里了。
下面进入函数MutateWithHints()
　　这个函数涉及到的内容基本都在syzkaller/prog/hints.go中，所以有必要先介绍一下hint。
　　在hints.go文件的开头有一段介绍，现在贴在这里。
重点函数 shrinkExpand()
这个函数不太好介绍，就直接上案例，不写代码了。
Generate()用来生成一个新的程序
 
executeHintSeed()在fuzzer.go的loop()下的smashInput()被调用，属于一个小环节，
executeHintSeed()在fuzzer.go的loop()下的smashInput()被调用，属于一个小环节，

func (proc *Proc) executeHintSeed(p *prog.Prog, call int) {

    //原注释如下:

    //First execute the original program to dump comparisons from KCOV.

    //执行原始的program，把从KCOV获取到的覆盖情况转储到comparisons中。

    info := proc.execute(proc.execOptsComps, p, ProgNormal, StatSeed)
 
    /*

        原注释如下：

        // Then mutate the initial program for every match between

        // a syscall argument and a comparison operand.

        // Execute each of such mutants to check if it gives new coverage.

        然后，对初始program的每一个可以匹配成功的系统调用参数和比较操作数进行变异。

    执行每一次变异后的程序看是否出现新的覆盖情况。

        其中，参数info.Calls[call].Comps表示per-call comparison operands，即

    每一个call的比较操作数。它的数据类型如下：map[uint64]map[uint64]bool，下面是源

    代码给出的例子，把原本的comparisons进行整理，对于每一个match，以前面的值为键，后面

    的值+true为值。
 
    // Example: for comparisons {(op1, op2), (op1, op3), (op1, op4), (op2, op1)}

    // this map will store the following:

    // m = {

    //        op1: {map[op2]: true, map[op3]: true, map[op4]: true},

    //        op2: {map[op1]: true}

    // }.
 
        第三个参数传入的是一个func，用来执行prog，后面将使用到的几个函数也会把func作为参数。

    */

    p.MutateWithHints(call, info.Calls[call].Comps, func(p *prog.Prog) {

        proc.execute(proc.execOpts, p, ProgNormal, StatHint)

    })
 
}

func (proc *Proc) executeHintSeed(p *prog.Prog, call int) {

    //原注释如下:

    //First execute the original program to dump comparisons from KCOV.

    //执行原始的program，把从KCOV获取到的覆盖情况转储到comparisons中。

    info := proc.execute(proc.execOptsComps, p, ProgNormal, StatSeed)
 
    /*

        原注释如下：

        // Then mutate the initial program for every match between

        // a syscall argument and a comparison operand.

        // Execute each of such mutants to check if it gives new coverage.

        然后，对初始program的每一个可以匹配成功的系统调用参数和比较操作数进行变异。

    执行每一次变异后的程序看是否出现新的覆盖情况。

        其中，参数info.Calls[call].Comps表示per-call comparison operands，即

    每一个call的比较操作数。它的数据类型如下：map[uint64]map[uint64]bool，下面是源

    代码给出的例子，把原本的comparisons进行整理，对于每一个match，以前面的值为键，后面

    的值+true为值。
 
    // Example: for comparisons {(op1, op2), (op1, op3), (op1, op4), (op2, op1)}

    // this map will store the following:

    // m = {

    //        op1: {map[op2]: true, map[op3]: true, map[op4]: true},

    //        op2: {map[op1]: true}

    // }.
 
        第三个参数传入的是一个func，用来执行prog，后面将使用到的几个函数也会把func作为参数。

    */

    p.MutateWithHints(call, info.Calls[call].Comps, func(p *prog.Prog) {

        proc.execute(proc.execOpts, p, ProgNormal, StatHint)

    })
 
}
 
// A hint is basically a tuple consisting of a pointer to an argument

// in one of the syscalls of a program and a value, which should be

// assigned to that argument (we call it a replacer).
 
// A simplified version of hints workflow looks like this:

//        1. Fuzzer launches a program (we call it a hint seed) and collects all

// the comparisons' data for every syscall in the program.

//        2. Next it tries to match the obtained comparison operands' values

// vs. the input arguments' values.

//        3. For every such match the fuzzer mutates the program by

// replacing the pointed argument with the saved value.

//        4. If a valid program is obtained, then fuzzer launches it and

// checks if new coverage is obtained.

// For more insights on particular mutations please see prog/hints_test.go.
 
/*

    简单解释一下：

    一个hint是一个元组，它由一个指向syscall的一个参数的指针和一个value组成，这个值应当被
赋予到对应的参数上，在syzkaller中被称作一个replacer。

    一个简单的hints的工作流程如下：

    1、Fuzzer启动一个程序(这个程序被称为hint种子)并且收集这个程序中每一个syscall的比较数据。

    2、下一步Fuzzer尝试把获得的比较操作数与输入的参数值进行匹配。

    3、对于每一对匹配成功的值，fuzzer通过替换对应指针保存的值来对程序进行变异。

    4、如果能获得一个有效的程序，然后fuzzer启动程序，检查有没有新的覆盖情况生成。

*/

// A hint is basically a tuple consisting of a pointer to an argument

// in one of the syscalls of a program and a value, which should be

// assigned to that argument (we call it a replacer).
 
// A simplified version of hints workflow looks like this:

//        1. Fuzzer launches a program (we call it a hint seed) and collects all

// the comparisons' data for every syscall in the program.

//        2. Next it tries to match the obtained comparison operands' values

// vs. the input arguments' values.

//        3. For every such match the fuzzer mutates the program by

// replacing the pointed argument with the saved value.

//        4. If a valid program is obtained, then fuzzer launches it and

// checks if new coverage is obtained.

// For more insights on particular mutations please see prog/hints_test.go.
 
/*

    简单解释一下：

    一个hint是一个元组，它由一个指向syscall的一个参数的指针和一个value组成，这个值应当被
赋予到对应的参数上，在syzkaller中被称作一个replacer。

    一个简单的hints的工作流程如下：

    1、Fuzzer启动一个程序(这个程序被称为hint种子)并且收集这个程序中每一个syscall的比较数据。

    2、下一步Fuzzer尝试把获得的比较操作数与输入的参数值进行匹配。

    3、对于每一对匹配成功的值，fuzzer通过替换对应指针保存的值来对程序进行变异。

    4、如果能获得一个有效的程序，然后fuzzer启动程序，检查有没有新的覆盖情况生成。

*/

/* 原注释如下：

// Mutates the program using the comparison operands stored in compMaps.

// For each of the mutants executes the exec callback.

     变异程序时使用储存在compMaps里面的对比操作数。

 对于每一个变异回调参数exec。
 
     这个函数主要就调用了一个函数：ForeachArg()。其余出现的execValidate、generateHints()、

 sanitize()都以参数或其他的形式传入，都没有直接调用。

 */

func (p *Prog) MutateWithHints(callIndex int, comps CompMap, exec func(p *Prog)) {

    p = p.Clone()

    c := p.Calls[callIndex]
 
    //把有害的call转化为无害的call，如果失败了就直接返回。

    //然后进行参数、返回值等等的检查；最后执行程序。

    execValidate := func() {

        // Don't try to fix the candidate program.

        // Assuming the original call was sanitized, we've got a bad call

        // as the result of hint substitution, so just throw it away.

        if p.Target.sanitize(c, false) != nil {

            return

        }

        p.debugValidate()

        exec(p)

    }
 
    /*

        ForeachArg()又调用了两次foreachArgImpl()。foreachArgImpl()首先执行参数表

    中的函数func(arg Arg, _ *ArgCtx)，在下面的函数中就是执行generateHints。

        然后再根据arg类型的不同做一些不同的处理，再递归    调用自身foreachArgImpl()。

    比如，如果类型属于*GroupArg，即结构体或者    数组，就先遍历，再进行递归；如果arg是*PointerArg，

    即指针或在虚拟内存空间，就先判断是否为空，如果不为空，赋值再递归；如果是*UnionArg，即联合体，

    就直接进行递归。

    */

    ForeachArg(c, func(arg Arg, _ *ArgCtx) {

        /*

            generateHints()用来生成新的hint。主要来判断arg的类型，如果是通过判断的*ConstArg，

        就调用checkConstArg(a, compMap, exec)；如果是通过判断的*DataArg，就调用

        checkDataArg(a, compMap, exec)。而这两个函数主要都要调用shrinkExpand()来生成

        replacer，也就是hints。checkDataArg()比checkConstArg()多的步骤就是需要进行字节序

        的转换，比较简单。shrinkExpand()是生成hints的核心了，其余部分基本就是为它来判断谁能

        生成hints，把这个函数单独拿出来在下面讲。

        */

        generateHints(comps, arg, execValidate)

    })
}

/* 原注释如下：

// Mutates the program using the comparison operands stored in compMaps.

// For each of the mutants executes the exec callback.

     变异程序时使用储存在compMaps里面的对比操作数。

 对于每一个变异回调参数exec。
 
     这个函数主要就调用了一个函数：ForeachArg()。其余出现的execValidate、generateHints()、

 sanitize()都以参数或其他的形式传入，都没有直接调用。

 */

func (p *Prog) MutateWithHints(callIndex int, comps CompMap, exec func(p *Prog)) {

    p = p.Clone()

    c := p.Calls[callIndex]
 
    //把有害的call转化为无害的call，如果失败了就直接返回。

    //然后进行参数、返回值等等的检查；最后执行程序。

    execValidate := func() {

        // Don't try to fix the candidate program.

        // Assuming the original call was sanitized, we've got a bad call

        // as the result of hint substitution, so just throw it away.

        if p.Target.sanitize(c, false) != nil {

            return

        }

        p.debugValidate()

        exec(p)

    }
 
    /*

        ForeachArg()又调用了两次foreachArgImpl()。foreachArgImpl()首先执行参数表

    中的函数func(arg Arg, _ *ArgCtx)，在下面的函数中就是执行generateHints。

        然后再根据arg类型的不同做一些不同的处理，再递归    调用自身foreachArgImpl()。

    比如，如果类型属于*GroupArg，即结构体或者    数组，就先遍历，再进行递归；如果arg是*PointerArg，

    即指针或在虚拟内存空间，就先判断是否为空，如果不为空，赋值再递归；如果是*UnionArg，即联合体，

    就直接进行递归。

    */

    ForeachArg(c, func(arg Arg, _ *ArgCtx) {

        /*

            generateHints()用来生成新的hint。主要来判断arg的类型，如果是通过判断的*ConstArg，

        就调用checkConstArg(a, compMap, exec)；如果是通过判断的*DataArg，就调用

        checkDataArg(a, compMap, exec)。而这两个函数主要都要调用shrinkExpand()来生成

        replacer，也就是hints。checkDataArg()比checkConstArg()多的步骤就是需要进行字节序

        的转换，比较简单。shrinkExpand()是生成hints的核心了，其余部分基本就是为它来判断谁能

        生成hints，把这个函数单独拿出来在下面讲。

        */

        generateHints(comps, arg, execValidate)

    })
}
 
/*

    shrinkExpand()用来对integer类型的数据进行强制变窄或变宽。因为不同位数的值不能随意替换，

 所以需要对位数进行统一。例如，

// Motivation for shrink:

// void f(u16 x) {

//        u8 y = (u8)x;

//        if (y == 0xab) {...}

// }

    在上面的程序中，如果调用f(0x1234)，在比较时用的时0xab与0x34进行比较，但是进行替换的时候，却没有可以与0x1234

 进行匹配的值。但是如果只匹配缩减的值（0xab与0x34），就可以进行替换了，用0x12ab来替换0x1234.

 但是，如果把上面的if (y == 0xab) {...}替换为(y == 0xdeadbeef) {...}，我们就放弃比较，因为这时我们

 要拓宽数据，但是我们很难得到有效的值。
 
    再看下面的例子，如果给出下面的程序：

// Motivation for expand:

// void f(i8 x) {

//        i16 y = (i16)x;

//        if (y == -2) {...}

// }

 如果调用f(-1)，这时x=0xff，但是在比较时需要用0xffff与0xfffe进行比较。如果用原始的0xff就不能得到

 有效的匹配，因此我们需要进行拓宽并进行检查。
 
    注1：在进行变窄的操作时忽略变宽的情况。

    注2：变宽时直接变宽到int64. 
 
    在syzkaller/prog/hints_test.go里面提供了许多例子。
 
    例一：shrink-16-test

 // Models the following code:

 // void f(u16 w) {

 //        u8 b = (u8) w;

 //        if (b == 0xab) {...}

 //        if (w == 0xcdcd) {...}

 //  }; f(0x1234);
 
 相应的comps如下：

    comps: CompMap{

        0x34:   compSet(0xab),

        0x1234: compSet(0xcdcd),

    }
 
 得到的res:

    res: []uint64{0x12ab, 0xcdcd}
 
    从16位到8位时，高8位用原数的高8位补足。
 
 例二：shrink-32-test

 // Models the following code:

 // void f(u32 dw) {

 //        u8 b = (u8) dw

 //        i16 w = (i16) dw

 //        if (b == 0xab) {...}

 //        if (w == 0xcdcd) {...}

 //        if (dw == 0xefefefef) {...}

 //  }; f(0x12345678);
 
 相应的comps：

    comps: CompMap{

        0x78:       compSet(0xab),

        0x5678:     compSet(0xcdcd),

        0x12345678: compSet(0xefefefef),

     }

 得到的res：

    res: []uint64{0x123456ab, 0x1234cdcd, 0xefefefef}
 
 对应的位置替换，不足的用原数补足。
 
 例三：shrink-with-a-wider-replacer-test1

 // Models the following code:

 // void f(i16 w) {

 //        i8 b = (i8) w;

 //        i16 other = 0xabab;

 //        if (b == other) {...}

 //  }; f(0x1234);

 // In such code the comparison will never be true, so we don't

 // generate a hint for it.

 这个例子中if判断不可能为真，就不需要生成新的hint了，直接返回nil。
 
 相应的comps如下：

     comps: CompMap{0x34: compSet(0x1bab)}

 返回值res：

     res:   nil
 
 例四：extend-32-test
 
 // Models the following code:

 // void f(i32 dw) {

 //        i64 qw = (i32) dw;

 //        if (qw == -2) {...};

 // }; f(-1);
 
 再看一下扩展的情况，源代码中的扩展只给了输入为-1的情况，猜测可能是每一个拓展的位都

 有相同的值。其他情况就舍弃了。

 in:    0xffffffff,

 comps: CompMap{0xffffffffffffffff: compSet(0xfffffffffffffffe)},

 res:   []uint64{0xfffffffe},
 
 例五：extend-with-a-wider-replacer-test

 // Models the following code:

 // void f(i8 b) {

 //        i16 w = (i16) b;

 //        if (w == (i16) 0xfeff) {...};

 // }; f(-1);

 // There's no value for b that will make the comparison true,

 // so we don't generate hints.
 
 匹配不到，直接返回nil。
 
 in:    0xff,

 comps: CompMap{0xffffffffffffffff: compSet(0xfffffffffffffeff)},

 res:   nil,

 */

/*

    shrinkExpand()用来对integer类型的数据进行强制变窄或变宽。因为不同位数的值不能随意替换，

 所以需要对位数进行统一。例如，

// Motivation for shrink:

// void f(u16 x) {

//        u8 y = (u8)x;

//        if (y == 0xab) {...}

// }

    在上面的程序中，如果调用f(0x1234)，在比较时用的时0xab与0x34进行比较，但是进行替换的时候，却没有可以与0x1234

 进行匹配的值。但是如果只匹配缩减的值（0xab与0x34），就可以进行替换了，用0x12ab来替换0x1234.

 但是，如果把上面的if (y == 0xab) {...}替换为(y == 0xdeadbeef) {...}，我们就放弃比较，因为这时我们

 要拓宽数据，但是我们很难得到有效的值。
 
    再看下面的例子，如果给出下面的程序：

// Motivation for expand:

// void f(i8 x) {

//        i16 y = (i16)x;

//        if (y == -2) {...}

// }

 如果调用f(-1)，这时x=0xff，但是在比较时需要用0xffff与0xfffe进行比较。如果用原始的0xff就不能得到

 有效的匹配，因此我们需要进行拓宽并进行检查。
 
    注1：在进行变窄的操作时忽略变宽的情况。

    注2：变宽时直接变宽到int64. 
 
    在syzkaller/prog/hints_test.go里面提供了许多例子。
 
    例一：shrink-16-test

 // Models the following code:

 // void f(u16 w) {

 //        u8 b = (u8) w;

 //        if (b == 0xab) {...}

 //        if (w == 0xcdcd) {...}

 //  }; f(0x1234);
 
 相应的comps如下：

    comps: CompMap{

        0x34:   compSet(0xab),

        0x1234: compSet(0xcdcd),

    }
 
 得到的res:

    res: []uint64{0x12ab, 0xcdcd}
 
    从16位到8位时，高8位用原数的高8位补足。
 
 例二：shrink-32-test

 // Models the following code:

 // void f(u32 dw) {

 //        u8 b = (u8) dw

 //        i16 w = (i16) dw

				登录后可查看完整内容
			
[培训]内核驱动高级班，冲击BAT一流互联网大厂工作，每周日13:00-18:00直播授课

		最后于  2021-6-26 20:58		
				被wx_左撇子编辑
				
		，原因： 		
	
		#Fuzz
		#自动化挖掘
		#Linux
	
上传的附件：

			myFuzz2.md
		
		（15.22kb，12次下载）