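This post uses GitHub Security Lab CTF 2: U-Boot Challenge to learn CodeQL.
The official GitHub bot for the challenge seems to be gone? Hmmm... so this is a record of getting started.
Skipped :)
OK, for installation and configuration see: https://blog.csdn.net/weixin_43847838/article/details/130657057
If you hit an SSL timeout error, add GitHub's IP to /etc/hosts, for example: 140.82.113.3 github.com
To look up GitHub's IP: https://www.ipaddress.com/site/github.com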
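Download: U-Boot CodeQL database
Then unzip it.
VSCode: File -> Open workspace from file... -> select vscode-codeql-starter.code-workspace
VSCode CodeQL extension: From a folder (because I already unzipped the archive)
Reference: choosing-a-database
The U-Boot version this database corresponds to is d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5.
To create the database yourself, clone this version of the code and build the database with codeql-cli.
It is 2023 and the official repository is gone? Then find one on GitHub.
File -> Add Folder to Workspace...
After adding it, the workspace looks like this.
At this point the environment is set up. Next, the challenge description: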
The goal of this challenge is to find the 13 remote-code-execution vulnerabilities that our security researchers found in the U-Boot loader. The vulnerabilities can be triggered when U-Boot is configured to use the network for fetching the next stage boot resources. MITRE has issued the following CVEs for the 13 vulnerabilities: CVE-2019-14192, CVE-2019-14193, CVE-2019-14194, CVE-2019-14195, CVE-2019-14196, CVE-2019-14197, CVE-2019-14198, CVE-2019-14199, CVE-2019-14200, CVE-2019-14201, CVE-2019-14202, CVE-2019-14203, and CVE-2019-14204.
Through these vulnerabilities an attacker in the same network (or controlling a malicious NFS server) could gain code execution at the U-Boot powered device. The first two occurrences of the vulnerability were plain memcpy overflows with an attacker-controlled size coming from the network packet without any validation. The memcpy function copies n bytes from memory area src to memory area dest. This can be unsafe when the size being parsed is not appropriately validated, allowing an attacker to fully control the data and length being passed through.
U-Boot contains hundreds of calls to memcpy and libc functions that read from the network such as ntohl and ntohs. In this challenge, you will use CodeQL to find those calls. Of course many of those calls are safe, so throughout this challenge you will refine your query to reduce the number of false positives.
Upon completion of the challenge, you will have a query that is able to find many of the vulnerabilities that allow for remote execution of arbitrary code on U-Boot powered devices.
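The unsafe memcpy pattern described in the challenge text can be sketched in C. This is a minimal illustration and not U-Boot code: `copy_payload` and all of its parameters are hypothetical names. It assumes a length that was parsed from a packet, and it shows the validation the vulnerable call sites lack, namely comparing the length against both the bytes actually available and the destination capacity before `memcpy` runs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from U-Boot: copy claimed_len bytes of
 * payload into dst only if the length is consistent with both buffers.
 * Returns 0 on success, -1 if the attacker-supplied length is rejected.
 */
int copy_payload(uint8_t *dst, size_t dst_cap,
                 const uint8_t *payload, size_t payload_avail,
                 size_t claimed_len)
{
    /* Reject lengths larger than what the packet actually contains. */
    if (claimed_len > payload_avail)
        return -1;
    /* Reject lengths larger than the destination buffer. */
    if (claimed_len > dst_cap)
        return -1;
    memcpy(dst, payload, claimed_len);
    return 0;
}
```

Calling `memcpy(dst, payload, claimed_len)` directly, without the two checks, is the shape of bug the challenge query is meant to surface.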
Copy the code below into 3_function_definitions.ql.
Select 3_function_definitions.ql, right-click -> CodeQL: Run Queries in Selected Files
Question 0.0: Can you work out what the above query is doing?
Question 0.1: Modify the query to find the definition of memcpy.
Question 0.2: ntohl, ntohll, and ntohs can either be functions or macros (depending on the platform where the code is compiled).
As these snapshots for U-Boot were built on Linux, we know they are going to be macros. Write a query to find the definition of these macros.
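Alternatively, a set expression can be used for the query:
The ntoh family of functions is typically used to convert from network byte order to host byte order.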
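As a rough illustration of what the ntoh conversions do, the C sketch below decodes big-endian (network order) fields from raw bytes without relying on the platform macros. `read_be16` and `read_be32` are hypothetical helper names and are not part of U-Boot or libc.

```c
#include <stdint.h>

/* Hypothetical helpers: decode network-order (big-endian) integers
 * from raw packet bytes, independent of the host's endianness. */
uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

The value that comes out is whatever the sender put in the packet, which is why the challenge treats these conversions as sources of untrusted data.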
Reference:
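Question 1.0: Find all the calls to memcpy.
Question 1.1: Find all the calls to ntohl, ntohll, and ntohs.
As shown below, MacroInvocation.getMacro() returns the macro that this invocation refers to.
The answer is below. Judging from the results, the project seems to contain no invocations of the ntohll macro?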
Question 1.2: Find the expressions that resulted in these macro invocations.
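Find the expressions that contain the Q1.1 macro invocations? At first I did not quite understand this question: didn't the previous query already find where the macros are invoked? Thinking it over, Q1.1 asks for all the calls, whereas this one asks for the expressions that result from those calls. So the where clause should not need to change, and MacroInvocation should offer a way to get an Expr, so I checked code completion again:
getExpr() gets the top-level expression associated with this macro invocation, and it has no result if the top-level expanded element is not an expression. So the results of mi.getExpr() should correspond to a subset of mi. Sure enough, after running it and skimming the results, the count and contents match Q1.1.
Also, what is a top-level expression?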
What is a top-level expression?
The results are as follows:
For this step, we want to detect cases where some data read from the network will end up being used by a call to memcpy. To do this, we'll use the CodeQL taint tracking library, and its predicate hasFlowPath that will tell us when some data coming from a source flows to a sink. Use the boiler plate provided below to complete your taint tracking query.
In other words, the goal is to catch data read from the network that ends up in a call to memcpy, using the hasFlowPath predicate of the CodeQL taint tracking library to tell when data from a source reaches a sink. The boilerplate to complete appears under Q2.1 below (I point this out because I added some material under Q2.0, so it may not be immediately visible).
The text above contains a link; following it leads to a brief introduction to the TaintTracking module:
Question 2.0: Write a QL class that finds all the top-level expressions associated with the macro invocations to the calls to ntohl, ntohll, and ntohs.
Of course, defining a class that extends Expr works in a similar way:
Later I read this article: https://milkii0.github.io/2022/06/10/CodeQLU-BootChallenge%20(CC++), and found there is also the exists keyword, which gives another solution:
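Question 2.1: Create the configuration class, by defining the source and sink. The source should be calls to ntohl, ntohll, or ntohs. The sink should be the size argument of an unsafe call to memcpy.
That is, create a configuration class that defines the source and the sink: the source should be a call to one of the ntoh* macros, and the sink should be the size argument of an unsafe memcpy call.
The ntoh family of functions is typically used to convert from network byte order to host byte order. So the idea here is that an ntoh call converts some data from an externally supplied packet into a number, and that number may eventually reach memcpy as its size argument. The copy length is then attacker-controlled, which can create a security risk.
Let's study the template, starting with the "main function" part:
Look at the description of cfg.hasFlowPath by jumping to its definition:
Next, the subclass of the Configuration class:
Distilling how the official documentation (configuration) describes this class:
From ordinary programming experience, the two is-prefixed predicates (functions) isSource and isSink ought to test whether a source/sink satisfies some condition and then return true/false. But a glance at the official documentation on predicates, and the examples there, shows that you only need to write the conditions that must hold:
Predicates are used to describe the logical relations that make up a QL program.
Strictly speaking, a predicate evaluates to a set of tuples.
With that, I roughly know how to write these two predicates: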
Question 3.0: There are 13 known vulnerabilities in U-Boot.
The query you completed above probably found 9 of them. See if you can refine your query to find 1 or more additional vulnerabilities.
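The query completed above may have found 9 of the 13 vulnerabilities. Here we are asked to refine the query to find more.
Before refining it, let's check against the earlier results whether 9 vulnerabilities have really been found.
For results [1] and [2], the source is at the end of the function net_process_received_packet, where an integer underflow clearly occurs: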
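A minimal C sketch of this kind of underflow follows. It is not the U-Boot source: it assumes the payload length is computed as a 16-bit length field minus the 8-byte UDP header size, and `payload_len_unchecked` and `payload_len_checked` are hypothetical names.

```c
#include <stdint.h>

#define UDP_HEADER_LEN 8u /* a UDP header is 8 bytes long */

/* Hypothetical, unchecked: "length field minus header size" with no
 * lower-bound check. A length field below 8 wraps around to a huge value. */
unsigned int payload_len_unchecked(uint16_t udp_len)
{
    return udp_len - UDP_HEADER_LEN;
}

/* Hypothetical, checked: returns 0 and stores the payload length on
 * success, or -1 when the length field is smaller than the header. */
int payload_len_checked(uint16_t udp_len, unsigned int *out)
{
    if (udp_len < UDP_HEADER_LEN)
        return -1;
    *out = udp_len - UDP_HEADER_LEN;
    return 0;
}
```

With a length field of 4, the unchecked version yields the same bit pattern as `(unsigned int)-4`, which is the value that then travels on to the callee.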
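The underflowed integer is then passed into the function nc_input_packet. However, line 149 checks the size of len, so I would think that even a very large len cannot make the memcpy here write out of bounds... Yet when I looked it up, U-Boot NFS RCE Vulnerabilities (CVE-2019-14192) says an out-of-bounds write does occur here... but it is only one sentence and I did not follow it...
I then looked up the patch; the corresponding IDs are CVE-2019-14192, CVE-2019-14193 and CVE-2019-14199. Three of them?
Right next to the source of results [1] and [2] there is clearly another integer wraparound: the underflowed integer is passed to the function pointer udp_packet_handler, and this site does not appear in the QL query results.
First, audit whether a vulnerability exists. Searching for udp_packet_handler shows only two places where it is assigned:
Searching further for the function net_set_udp_handler shows that handlers are set for nc, dhcp, bootp and nfs respectively.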
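The registration-and-dispatch pattern described above can be sketched in C. This is a simplified illustration and not the U-Boot implementation: `packet_handler_fn`, `set_packet_handler`, `dispatch_packet` and `recording_handler` are hypothetical names. It shows why each registered handler has to be audited on its own: the wrapped-around length is handed to whichever handler happens to be installed.

```c
#include <stddef.h>

/* Hypothetical handler type: receives the packet payload and its length. */
typedef void (*packet_handler_fn)(const unsigned char *payload, unsigned int len);

/* Hypothetical stand-ins for the installed handler and an observation slot. */
packet_handler_fn current_handler = NULL;
unsigned int last_len_seen = 0;

/* Registration: each protocol would install its own handler this way. */
void set_packet_handler(packet_handler_fn fn)
{
    current_handler = fn;
}

/* Dispatch: the length reaches whichever handler is currently installed,
 * so every registered handler has to validate it itself. */
void dispatch_packet(const unsigned char *payload, unsigned int len)
{
    if (current_handler != NULL)
        current_handler(payload, len);
}

/* Example handler that only records the length it was given. */
void recording_handler(const unsigned char *payload, unsigned int len)
{
    (void)payload;
    last_len_seen = len;
}
```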
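I did not want to audit them one by one. The article U-Boot NFS RCE Vulnerabilities (CVE-2019-14192) says the vulnerabilities all live in nfs_handler, so I audited only that. As can be seen, this function still does not validate len and passes it directly, in different branches, to the following functions:
So there are 6 vulnerabilities here. According to the patch, the corresponding CVE IDs are: CVE-2019-14197, CVE-2019-14200, CVE-2019-14201, CVE-2019-14202, CVE-2019-14203 and CVE-2019-14204.
Results [4] and [5] correspond to one vulnerability: after rlen is read from the packet, it is passed to memcpy without validation.
On lookup, the CVE ID is CVE-2019-14195; the patch is as follows:
[6] nfs_lookup_reply(): the destination buffer filefh is 64 bytes long, so a negative length passed in causes an out-of-bounds write. One more vulnerability.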
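Why a negative length is dangerous here can be shown with a small C sketch. It is not the U-Boot code or its patch: `store_file_handle` and `FH_CAPACITY` are hypothetical names, and the only detail taken from the text is the 64-byte destination. A negative int converted to size_t becomes an enormous value, so both bounds have to be checked before the copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FH_CAPACITY 64 /* destination size stated in the text above */

/* Hypothetical sketch: store a file handle whose length arrived from the
 * network as a signed int. Negative and oversized lengths are rejected
 * before memcpy sees them. Returns 0 on success, -1 on rejection. */
int store_file_handle(uint8_t dst[FH_CAPACITY], const uint8_t *src, int len)
{
    if (len < 0 || len > FH_CAPACITY)
        return -1;
    memcpy(dst, src, (size_t)len);
    return 0;
}
```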
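On lookup, this vulnerability is CVE-2019-14196; the patch is as follows:
Continuing, results [7] and [8] correspond to two more vulnerabilities. Their sources are in the if and else branches of the function nfs_read_reply, where the attacker-controlled rlen is passed into the function store_block and used in a memcpy, allowing an out-of-bounds write of arbitrary length. The function flash_write on line 96 has the same problem.
The corresponding CVE IDs are CVE-2019-14194 and CVE-2019-14198. The patch is as follows: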
Question 3.1: Generalize your query to find other untrusted inputs (not only networking) such as ext4 fs.
As a beginner I don't get it: what feature of ext4 fs input resembles ntoh*?
Well, that is a basic introduction at least...
git clone --recursive https://github.com/github/vscode-codeql-starter
❯ unzip u-boot_u-boot_cpp-srcVersion_d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5-dist_odasa-2019-07-25-linux64.zip
❯ ll
total 1011776
drwxr-xr-x   5 lzx staff 160B 6  6 23:33 attach
drwxr-xr-x  38 lzx staff 1.2K 6  5 23:02 codeql
drwxr-xr-x@ 22 lzx staff 704B 5 24 18:28 codeql-cli
-rw-r--r--@  1 lzx staff 420M 6  5 23:01 codeql-cli.zip
-rw-r--r--@  1 lzx staff 2.1K 6  7 00:00 codeql.md
-rw-r--r--@  1 lzx staff  74M 6  5 23:21 u-boot_u-boot_cpp-srcVersion_d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5-dist_odasa-2019-07-25-linux64.zip
drwxr-xr-x@  9 lzx staff 288B 6  7 00:01 u-boot_u-boot_d0d07ba   <-- extraction result
drwxr-xr-x  21 lzx staff 672B 6  6 22:51 vscode-codeql-starter
proxychains4 git clone https://github.com/u-boot/u-boot.git
cd u-boot
git reset --hard d0d07ba86afc8074d79e436b1ba4478fa0f0c1b
git clone https://github.com/hluwa/codeql-uboot.git
import cpp

from Function f
where f.getName() = "strlen"
select f, "a function named strlen"
import cpp

from Function f
where f.getName() = "memcpy"
select f, "a function named memcpy"
import cpp

from Macro m
where m.getName().regexpMatch("ntoh(l|ll|s)")
select m, "ntohl, ntohll, and ntohs"
import cpp

from Macro m
// where m.getName().regexpMatch("ntoh(l|ll|s)")
// select m, "ntohl, ntohll, and ntohs"
// where <your_variable_name> in ["bar", "baz", "quux"]
where m.getName() in ["ntohs", "ntohl", "ntohll"]
select m, "ntohl, ntohll, and ntohs 22222"
import cpp

from FunctionCall fc
// FunctionCall.getTarget(): returns a Function, namely the function invoked by this call fc
where fc.getTarget().getName() = "memcpy" // the function called by fc is named memcpy
select fc
import cpp

from MacroInvocation mi
where mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)")
select mi
import cpp

from MacroInvocation mi
where mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)")
select mi.getExpr()
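import cpp

// Defining a class:
// 1. It needs the class keyword
// 2. The class name must start with an uppercase letter
// 3. The supertypes are declared with the extends or instanceof keyword
// 4. The class body must be closed
class MyMacroInvocation extends MacroInvocation { // this class extends MacroInvocation
  MacroInvocation mi; // declare a macro-invocation variable

  MyMacroInvocation() { // characteristic predicate, similar to a constructor
    // mi satisfies the condition below, and this equals mi
    mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)") and
    this = mi
  }
}

from MyMacroInvocation mmi
select mmi.getExpr() // get the expression of each macro invocation that satisfies the condition above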
// Solution 3:
import cpp

class MyExpr extends Expr {
  MacroInvocation mi;

  MyExpr() {
    mi.getMacro().getName().regexpMatch("ntoh(l|ll|s)") and
    this = mi.getExpr()
  }
}

from MyExpr me
select me, "33333"
import cpp

class NetworkByteSwap extends Expr {
  NetworkByteSwap() {
    exists(MacroInvocation mi | mi.getMacro().getName().regexpMatch("ntoh.*") | mi.getExpr() = this)
  }
}

from NetworkByteSwap n
select n, "Network byte swap"
/**
 * @kind path-problem
 */

import cpp
import semmle.code.cpp.dataflow.TaintTracking
import DataFlow::PathGraph

class YOUR_CLASS_HERE extends Expr {
  // 2.0 Todo
}

class Config extends TaintTracking::Configuration {
  Config() { this = "NetworkToMemFuncLength" }

  override predicate isSource(DataFlow::Node source) {
    // 2.1 Todo
  }

  override predicate isSink(DataFlow::Node sink) {
    // 2.1 Todo
  }
}

from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "ntoh flows to memcpy"
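// 1. Declare three variables: cfg of type Config, and source and sink of type PathNode
from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink
// 2. Pass source and sink to hasFlowPath. As the name suggests, it checks whether there is a data-flow path from source to sink
where cfg.hasFlowPath(source, sink)
// 3. Output sink and source. Why is sink listed twice? In a path-problem query the first column is the
//    element the alert is reported on, and the next two columns are the start and end of the path.
select sink, source, sink, "ntoh flows to memcpy"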