[原创]go语言原生模糊测试:源码分析与实战-二进制漏洞-看雪-安全社区|安全招聘|kanxue.com

[原创]go语言原生模糊测试:源码分析与实战

发表于: 2022-4-22 19:57 15983

[原创]go语言原生模糊测试:源码分析与实战

ZxyNull

2022-4-22 19:57

15983

在2015年的GopherCon大会上 ,来自俄罗斯的google工程师Dmitry Vyukov 在其名为“[Go Dynamic Tools]”的presentation中就介绍了go-fuzz，上篇文章中，介绍了go-fuzz的安装和使用方法。go-fuzz在go的标准库中找到了200+的bug，在一些go项目的更是发现了上千的bug，可以说是go语言模糊测试的一个成功的第三方解决方案。但是，Dmitry Vyukov发现虽然通过第三方的fuzzing工具可以解决Go开发者关于Fuzzing的部分需求，但有很多功能特性是通过第三方工具无法实现的。 2016年，Dmitry Vyukov在Go官方issue列表中创建“cmd/compile: coverage instrumentation for fuzzing”的issue归纳说明了这些问题，也是从那时开始，Dmitry Vyukov极力推进Fuzzing进入Go原生工具链的。目前go-fuzz还存在以下几个问题：

3月16日， Go 团队终于发布 Go 1.18 ， Go 1.18 是一个包含大量新功能的版本，同时不仅改善了性能，也对语言本身做了有史以来最大的改变。 Go 1.18将fuzz testing纳入了go test工具链，与单元测试、性能基准测试等一起成为了Go原生测试工具链中的重要成员，Go也是第一个将模糊测试完全集成到其标准工具链中的主流语言。本文从源码和实践的角度对go原生的fuzzing做一个简单的介绍。

下面是官方给出使用go test -fuzz 进行的一个模糊测试的例子，突出了它的主要组成部分。

显示整体模糊测试的示例代码，其中包含一个模糊目标。在fuzz target之前是用f.Add进行语料加法，fuzz target的参数高亮显示为fuzzing参数。

以下是模糊测试必须遵循的规则。

gofuzz 是一个多进程的fuzzer，其组件可分为协调进程、工作进程和RPC。

Coordinator的职责是运行和唤醒工作进程、命令工作进行去fuzz下一个输入、如果发生crash则将interesting data 写入语料库等，该部分源码在go/src/internal/fuzz/fuzz.go 中可以找到。

结构体 CoordinateFuzzingOpts 定义了 CoordinateFuzzing 的一系类参数，包括语料库加载后的挂钟时间、生成和测试的随机值的数量、发现崩溃后的最小化时间、并行运行的worker进程数量、种子列表、语料库文件夹、构成语料库条目的类型列表等，其中部分字段被设置为0值表示没有限制，其结构体源码如下：

CoordinateFuzzing函数用来创建多个worker进程，并管理worker进程对可能触发崩溃的随机输入进行测试。如果发生崩溃，该函数将返回一个err，其中包含有关崩溃的信息。

该函数时定义了包括主时间循环在内的诸如管理worker进程的多个行为：如创建worker进程、开始worker进程、结束worker进程、确保发现的crash写入语料库、根据覆盖率信息协调工作进程等。

结构体 coordinator 定义了多个Coordinator与worker之间的channel，如coordinator传递fuzz数据到worker的channel inputC、传递最小化数据的channel minimizeC，worker传递fuzzing结果到coordinator的channel resultC等，此外还包括加载语料库后workers启动的时间 startTime 、发现的感兴趣的输入数量 interestingCount 等等，该结构体定义如下：

worker的功能主要包括种子变异、最小化、运行fuzz函数、收集覆盖率、返回Crash或新的边、等。

worker 管理运行测试二进制文件的工作进程，当且仅当进程被go -test -fuzz 唤醒时，worker对象才会存在与coordinator中。 coordinator从种子语料库和缓存语料库选择输入来进行模糊测试，使用 workerClient 向工作进程发送 RPC ， workerServer 来处理这些RPC ，下面是worker定义的结构体。

workerComm支持用于workerClint进程与workerServer进程之间通信的管道和共享内存，对共享内存的访问通过 RPC 协议隐式同步实现，workComm定义的结构体如下：

workerServer 是一个由worker进程运行的极简的 RPC 服务器，该系统允许coordinator并行运行多个worker进程，并在工作进程意外终止后从共享内存中收集导致崩溃的输入。其定义的结构体如下：

其中coverageMask定义了worker的本地覆盖数据，当新的路径被发现它会定期的更新以供coordinator参考。fuzzFn运行worker指定的fuzz目标，当发现一个crash便会返回一个error和其运行该输入花费的时间。

workerserver有以下几个方法：

server()在 fuzzIn 上读取序列化的 RPC 消息 , 当serve收到消息时 , 它调用相应的方法 , 然后将序列化的结果返回给fuzzout； fuzz() 在共享内存中根据随机输入在有限的持续时间或迭代次数内来运行测试函数，如果fuzz（）发现了crash则会提前返回； minimizeInput() 应用一系列最小化转换，确保每个最小化仍然会导致错误，或保持覆盖率；ping() 方法，coordinator调用这个方法来保证worker进程能够调用F.Fuzz并保持通信等。

workerClient 是一个极简的 RPC 客户端，coordinator进程使用其调用worker进程的方法。其结构体定义如下：

其中mu为保护workerCommd管道的互斥锁。在workerClint的方法中，与workerServer大都有一个同名的方法，用来告诉worker调用指定方法，如 workClient.fuzz() 、workClient.minimize()等

gofuzz采用覆盖率反馈的方式引导fuzzing， Dmitry在“Fuzzing support for Go”一文中曾经对coverage-guided fuzzing的引擎的工作逻辑做过如下描述：

Go 编译器已经对libFuzzer提供了检测支持，所以在gofuzz中重用了该部分。编译器为每个基本块添加一个 8 位计数器用来统计覆盖率。当coordinator接收到产生新覆盖范围的输入时，它会将该worker进程的覆盖范围与当前组合的覆盖范围数组进行比较：如果另一个worker进程已经发现了提供相同覆盖范围的输入，则把该输入丢弃。如果新的输入确实提供了新的覆盖，则coordinator将其发送回worker进程（可能是不同的worker）以进行最小化处理。输入越小，执行的速度往往就越快， coordinator会将最小化的输入添加到缓存语料库中，之后发送给worker以进行进一步的模糊测试。

当coordinator收到导致错误的输入时，它会再次将输入发送回worker进程以进行最小化。在这种情况下，工作人员尝试找到仍然会导致错误的较小输入，尽管不一定是相同的错误。输入最小化后，coordinator将其保存到testdata/corpus/$FuzzTarget，然后关闭工作进程，以非零状态退出。

gofuzz实现输入最小化主要通过四个循环：

最小化的相关代码在go/src/internal/fuzz/minimize.go 中.

go/src/internal/fuzz/mutator.go 实现了对初始文件的变异功能，其核心代码如下：

gofuzz 目前支持的类型有：string, []byte、int, int8, int16, int32/rune, int64、uint, uint8/byte, uint16, uint32, uint64、float32, float64、bool。以int型为例，可以看到go-native-fuzz对该型的变异方式还比较单一，加上或减去一个随机数，并判断其变异后的返回值不能超高int支持的最大范围。

而对于byte[]型的值变异会比较多样化，比如增、删、改、插、交换、亦或等等。

项目地址：https://github.com/go-yaml/yaml

该项目是作为 juju 项目的一部分在 Canonical 中开发的，基于著名的 libyaml C 库的纯 Go 端口，可以快速可靠地解析和生成 YAML 数据，使 Go 程序能够轻松地编码和解码 YAML 值。

yaml.Unmarshal()函数解码在字节切片中找到的第一个文档，并将解码后的值赋给输出值，十分适合作为我们测试的对象。

首先将项目下载的本地，然后git checkout 切换到分支 v3，之后编写fuzzing函数，创建文件fuzz_test.go。

go test -fuzz =Fuzz 开始fuzz

可选参数：

-parallel：一次运行的模糊测试进程的数量，默认值 $GOMAXPROCS。目前，在 fuzzing 期间设置 -cpu 无效。

第一行表示在开始模糊测试之前收集了“基线覆盖率”。

elapsed：自进程开始以来经过的时间量

出现panic之后立马返回，停止fuzz。

出现崩溃后，模糊引擎会将导致崩溃的输入写入该测试的种子语料库中，而且此输入会当作执行go test命令时的默认输入。根据奔溃提示，执行 go test run=FuzzUnmarshal/b27ab1d6a899fb4f0607968de5b80e930b49a0e279bd4341c515ebf9bd7e7c78 复现。

查看崩溃字符

其中第一行显示了模糊引擎文件的编码版本，下面的是构成语料库条目的值，即导致程序崩溃的输入。

再次运行go test单元测试时，会将该输入当作默认输入并引发崩溃：

使用": \xf0"验证该输入是否能正常触发panic。

运行，成功触发，证明确实存在问题。

查看原因，根据堆栈跟踪，在源码github.com/yaml/yaml.go:161和github.com/yaml/decode.go:163在加上两句print语句，查看导致panic的值。

运行结果如下，程序已经无法正常解析该输入，具体原因还未搞清，该问题在github上已经有人提出，不过目前仍然未被解决。

更新：目前CVE-2022-28948表示该漏洞

在 github上也可以在带有标签的issue上找到一些未解决和待改进的地方

https://github.com/golang/go/issues?q=is%3Aissue+is%3Aopen+label%3Afuzz

https://speakerdeck.com/david74chou/fuzzying-test-in-go?slide=20

https://go.dev/doc/fuzz/

https://pkg.go.dev/testing@master#F

https://github.com/golang/go/issues/14565

https://tonybai.com/2021/12/01/first-class-fuzzing-in-go-1-18/

https://jayconrod.com/posts/123/internals-of-go-s-new-fuzzing-system

可能会由于go语言内部的相互依赖的包的改变而导致崩溃

不在编译器的帮助下做覆盖率的插装，这会导致极端案例代码的破坏；表现不佳; 覆盖检测质量欠佳（缺失边缘） 

与go语言原生的单元测试比太过复杂

由于它使用源预处理，因此很难将其集成到其他构建系统和非标准上下文中

1. 可能会由于go语言内部的相互依赖的包的改变而导致崩溃

2.

不在编译器的帮助下做覆盖率的插装，这会导致极端案例代码的破坏；表现不佳; 覆盖检测质量欠佳（缺失边缘）

3. 与go语言原生的单元测试比太过复杂

4. 由于它使用源预处理，因此很难将其集成到其他构建系统和非标准上下文中

// go/src/internal/fuzz/fuzz.go
 
// CoordinateFuzzingOpts is a set of arguments for CoordinateFuzzing.

// The zero value is valid for each field unless specified otherwise.

type CoordinateFuzzingOpts struct {

    Log io.Writer
 
    Timeout time.Duration
 
    Limit int64
 
    MinimizeTimeout time.Duration
 
    MinimizeLimit int64
 
    Parallel int
 
    Seed []CorpusEntry
 
    Types []reflect.Type
 
    CorpusDir string
 
    CacheDir string
}

// go/src/internal/fuzz/fuzz.go

// CoordinateFuzzingOpts is a set of arguments for CoordinateFuzzing.

// The zero value is valid for each field unless specified otherwise.

type CoordinateFuzzingOpts struct {

Log io.Writer

Timeout time.Duration

Limit int64

MinimizeTimeout time.Duration

MinimizeLimit int64

Parallel int

Seed []CorpusEntry

Types []reflect.Type

CorpusDir string

CacheDir string

}

// coordinator holds channels that workers can use to communicate with

// the coordinator.

type coordinator struct {

    opts CoordinateFuzzingOpts
 
    // startTime is the time we started the workers after loading the corpus.

    // Used for logging.

    startTime time.Time
 
    // inputC is sent values to fuzz by the coordinator. Any worker may receive

    // values from this channel. Workers send results to resultC.

    inputC chan fuzzInput
 
    // minimizeC is sent values to minimize by the coordinator. Any worker may

    // receive values from this channel. Workers send results to resultC.

    minimizeC chan fuzzMinimizeInput
 
    // resultC is sent results of fuzzing by workers. The coordinator

    // receives these. Multiple types of messages are allowed.

    resultC chan fuzzResult
 
    // count is the number of values fuzzed so far.

    count int64
 
    // countLastLog is the number of values fuzzed when the output was last

    // logged.

    countLastLog int64
 
    // timeLastLog is the time at which the output was last logged.

    timeLastLog time.Time
 
    // interestingCount is the number of unique interesting values which have

    // been found this execution.

    interestingCount int
 
    // warmupInputCount is the count of all entries in the corpus which will

    // need to be received from workers to run once during warmup, but not fuzz.

    // This could be for coverage data, or only for the purposes of verifying

    // that the seed corpus doesn't have any crashers. See warmupRun.

    warmupInputCount int
 
    // warmupInputLeft is the number of entries in the corpus which still need

    // to be received from workers to run once during warmup, but not fuzz.

    // See warmupInputLeft.

    warmupInputLeft int
 
    // duration is the time spent fuzzing inside workers, not counting time

    // starting up or tearing down.

    duration time.Duration
 
    // countWaiting is the number of fuzzing executions the coordinator is

    // waiting on workers to complete.

    countWaiting int64
 
    // corpus is a set of interesting values, including the seed corpus and

    // generated values that workers reported as interesting.

    corpus corpus
 
    // minimizationAllowed is true if one or more of the types of fuzz

    // function's parameters can be minimized.

    minimizationAllowed bool
 
    // inputQueue is a queue of inputs that workers should try fuzzing. This is

    // initially populated from the seed corpus and cached inputs. More inputs

    // may be added as new coverage is discovered.

    inputQueue queue
 
    // minimizeQueue is a queue of inputs that caused errors or exposed new

    // coverage. Workers should attempt to find smaller inputs that do the

    // same thing.

    minimizeQueue queue
 
    // crashMinimizing is the crash that is currently being minimized.

    crashMinimizing *fuzzResult
 
    // coverageMask aggregates coverage that was found for all inputs in the

    // corpus. Each byte represents a single basic execution block. Each set bit

    // within the byte indicates that an input has triggered that block at least

    // 1 << n times, where n is the position of the bit in the byte. For example, a

    // value of 12 indicates that separate inputs have triggered this block

    // between 4-7 times and 8-15 times.

    coverageMask []byte
}

// coordinator holds channels that workers can use to communicate with

// the coordinator.

type coordinator struct {

opts CoordinateFuzzingOpts

// startTime is the time we started the workers after loading the corpus.

// Used for logging.

startTime time.Time

// inputC is sent values to fuzz by the coordinator. Any worker may receive

// values from this channel. Workers send results to resultC.

inputC chan fuzzInput

// minimizeC is sent values to minimize by the coordinator. Any worker may

// receive values from this channel. Workers send results to resultC.

minimizeC chan fuzzMinimizeInput

// resultC is sent results of fuzzing by workers. The coordinator

// receives these. Multiple types of messages are allowed.

resultC chan fuzzResult

// count is the number of values fuzzed so far.

count int64

// countLastLog is the number of values fuzzed when the output was last

// logged.

countLastLog int64

// timeLastLog is the time at which the output was last logged.

timeLastLog time.Time

// interestingCount is the number of unique interesting values which have

// been found this execution.

interestingCount int

// warmupInputCount is the count of all entries in the corpus which will

// need to be received from workers to run once during warmup, but not fuzz.

// This could be for coverage data, or only for the purposes of verifying

// that the seed corpus doesn't have any crashers. See warmupRun.

warmupInputCount int

// warmupInputLeft is the number of entries in the corpus which still need

// to be received from workers to run once during warmup, but not fuzz.

// See warmupInputLeft.

warmupInputLeft int

// duration is the time spent fuzzing inside workers, not counting time

// starting up or tearing down.

duration time.Duration

// countWaiting is the number of fuzzing executions the coordinator is

// waiting on workers to complete.

countWaiting int64

// corpus is a set of interesting values, including the seed corpus and

// generated values that workers reported as interesting.

corpus corpus

// minimizationAllowed is true if one or more of the types of fuzz

// function's parameters can be minimized.

minimizationAllowed bool

// inputQueue is a queue of inputs that workers should try fuzzing. This is

// initially populated from the seed corpus and cached inputs. More inputs

// may be added as new coverage is discovered.

inputQueue queue

// minimizeQueue is a queue of inputs that caused errors or exposed new

// coverage. Workers should attempt to find smaller inputs that do the

// same thing.

minimizeQueue queue

// crashMinimizing is the crash that is currently being minimized.

crashMinimizing *fuzzResult

// coverageMask aggregates coverage that was found for all inputs in the

// corpus. Each byte represents a single basic execution block. Each set bit

// within the byte indicates that an input has triggered that block at least

// 1 << n times, where n is the position of the bit in the byte. For example, a

// value of 12 indicates that separate inputs have triggered this block

// between 4-7 times and 8-15 times.

coverageMask []byte

}

type worker struct {

    dir     string   // working directory, same as package directory

    binPath string   // path to test executable

    args    []string // arguments for test executable

    env     []string // environment for test executable
 
    coordinator *coordinator
 
    memMu chan *sharedMem // mutex guarding shared memory with worker; persists across processes.
 
    cmd         *exec.Cmd     // current worker process

    client      *workerClient // used to communicate with worker process

    waitErr     error         // last error returned by wait, set before termC is closed.

    interrupted bool          // true after stop interrupts a running worker.

    termC       chan struct{} // closed by wait when worker process terminates
}

type worker struct {

dir string // working directory, same as package directory

binPath string // path to test executable

args []string // arguments for test executable

env []string // environment for test executable

coordinator *coordinator

memMu chan *sharedMem // mutex guarding shared memory with worker; persists across processes.

cmd *exec.Cmd // current worker process

client *workerClient // used to communicate with worker process

waitErr error // last error returned by wait, set before termC is closed.

interrupted bool // true after stop interrupts a running worker.

termC chan struct{} // closed by wait when worker process terminates

}

type workerComm struct {

    fuzzIn, fuzzOut *os.File

    memMu           chan *sharedMem // mutex guarding shared memory
}

type workerComm struct {

fuzzIn, fuzzOut *os.File

memMu chan *sharedMem // mutex guarding shared memory

}

type workerServer struct {

    workerComm

    m *mutator

    coverageMask []byte

    fuzzFn func(CorpusEntry) (time.Duration, error)
}

type workerServer struct {

workerComm

m *mutator

coverageMask []byte

fuzzFn func(CorpusEntry) (time.Duration, error)

}

type workerClient struct {

    workerComm

    m *mutator

    mu sync.Mutex
}

type workerClient struct {

workerComm

m *mutator

mu sync.Mutex

}

start with some (potentially empty) corpus of inputs

for {

        choose a random input from the corpus

        mutate the input

        execute the mutated input and collect code coverage

        if the input gives new coverage, add it to the corpus
}

start with some (potentially empty) corpus of inputs

for {

choose a random input from the corpus

mutate the input

execute the mutated input and collect code coverage

if the input gives new coverage, add it to the corpus

}

func (m *mutator) mutate(vals []any, maxBytes int) {

    maxPerVal := maxBytes/len(vals) - 100

    i := m.rand(len(vals))

    switch v := vals[i].(type) {

    case int:

        vals[i] = int(m.mutateInt(int64(v), maxInt))

    case int8:

        vals[i] = int8(m.mutateInt(int64(v), math.MaxInt8))

    case int16:

        vals[i] = int16(m.mutateInt(int64(v), math.MaxInt16))

    case int64:

        vals[i] = m.mutateInt(v, maxInt)

    case uint:

        vals[i] = uint(m.mutateUInt(uint64(v), maxUint))

    case uint16:

        vals[i] = uint16(m.mutateUInt(uint64(v), math.MaxUint16))

    case uint32:

        vals[i] = uint32(m.mutateUInt(uint64(v), math.MaxUint32))

    case uint64:

        vals[i] = m.mutateUInt(uint64(v), maxUint)

    case float32:

        vals[i] = float32(m.mutateFloat(float64(v), math.MaxFloat32))

    case float64:

        vals[i] = m.mutateFloat(v, math.MaxFloat64)

    case bool:

        if m.rand(2) == 1 {

            vals[i] = !v // 50% chance of flipping the bool

        }

    case rune: // int32

        vals[i] = rune(m.mutateInt(int64(v), math.MaxInt32))

    case byte: // uint8

        vals[i] = byte(m.mutateUInt(uint64(v), math.MaxUint8))

    case string:

        ...

    case []byte:

        ...

    default:

        panic(fmt.Sprintf("type not supported for mutating: %T", vals[i]))

    }
}

func (m *mutator) mutate(vals []any, maxBytes int) {

maxPerVal := maxBytes/len(vals) - 100

i := m.rand(len(vals))

switch v := vals[i].(type) {

case int:

vals[i] = int(m.mutateInt(int64(v), maxInt))

case int8:

vals[i] = int8(m.mutateInt(int64(v), math.MaxInt8))

case int16:

vals[i] = int16(m.mutateInt(int64(v), math.MaxInt16))

case int64:

vals[i] = m.mutateInt(v, maxInt)

case uint:

vals[i] = uint(m.mutateUInt(uint64(v), maxUint))

case uint16:

vals[i] = uint16(m.mutateUInt(uint64(v), math.MaxUint16))

case uint32:

vals[i] = uint32(m.mutateUInt(uint64(v), math.MaxUint32))

case uint64:

vals[i] = m.mutateUInt(uint64(v), maxUint)

case float32:

vals[i] = float32(m.mutateFloat(float64(v), math.MaxFloat32))

case float64:

vals[i] = m.mutateFloat(v, math.MaxFloat64)

case bool:

if m.rand(2) == 1 {

vals[i] = !v // 50% chance of flipping the bool

}

case rune: // int32

vals[i] = rune(m.mutateInt(int64(v), math.MaxInt32))

case byte: // uint8

vals[i] = byte(m.mutateUInt(uint64(v), math.MaxUint8))

case string:

...

case []byte:

...

default:

panic(fmt.Sprintf("type not supported for mutating: %T", vals[i]))

}

func (m *mutator) mutateInt(v, maxValue int64) int64 {

    var max int64

    for {

        max = 100

        switch m.rand(2) {

        case 0:

            // Add a random number

            if v >= maxValue {

                continue

            }

            if v > 0 && maxValue-v < max {

                // Don't let v exceed maxValue

                max = maxValue - v

            }

            v += int64(1 + m.rand(int(max)))

            return v

        case 1:

            // Subtract a random number

            if v <= -maxValue {

                continue

            }

            if v < 0 && maxValue+v < max {

                // Don't let v drop below -maxValue

                max = maxValue + v

            }

            v -= int64(1 + m.rand(int(max)))

            return v

        }

    }
}