[Original] Linux kernel [CVE-2017-5123] (waitid): analysis of the root cause
Posted: 2021-1-13 13:18
CVE-2017-5123 is a vulnerability in the Linux kernel related to the waitid system call.
waitid is similar to wait and waitpid.
https://elixir.bootlin.com/linux/v4.13/source/kernel/exit.c
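Here waitid first calls user_access_begin(), which is really just stac(); its counterpart user_access_end() is clac().
A brief note on CLAC and STAC: because of SMAP, the kernel cannot freely read or write user-space data (which also makes kernel access to user space less efficient). But when the kernel has a legitimate need to read or write user space, SMAP must be disabled temporarily.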
However, the use of SMAP in an operating system may lead to a larger kernel size and slower user-space memory accesses from supervisor code, because SMAP must be temporarily disabled any time supervisor code intends to access user-space memory.
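So what disables SMAP? SMAP support is reported in the Extended Features CPUID leaf, and the temporary override is the AC flag in EFLAGS.
The sequence is:
1. Once stac sets EFLAGS.AC, SMAP is temporarily disabled and the kernel can read and write user data.
2. When the kernel has finished with the user data, SMAP must be restored, so it calls clac to clear EFLAGS.AC. SMAP is back in force and the sequence is complete.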
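The two steps above can be modelled in a few lines of user-space C. This is only a sketch of the flag logic, not kernel code: the function names model_stac, model_clac and smap_blocks_user_access are made up for illustration, and the only hardware fact assumed is that AC is bit 18 of EFLAGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 18 of EFLAGS is the AC flag. */
#define X86_EFLAGS_AC (UINT64_C(1) << 18)

/* Model of stac: set AC, which suspends SMAP enforcement. */
static inline uint64_t model_stac(uint64_t eflags)
{
    return eflags | X86_EFLAGS_AC;
}

/* Model of clac: clear AC, which puts SMAP enforcement back. */
static inline uint64_t model_clac(uint64_t eflags)
{
    return eflags & ~X86_EFLAGS_AC;
}

/* Would SMAP stop a kernel-mode access to a user page?
 * Only when SMAP is enabled and AC is clear. */
static inline bool smap_blocks_user_access(bool smap_enabled, uint64_t eflags)
{
    return smap_enabled && !(eflags & X86_EFLAGS_AC);
}

/* Usage: one stac/clac window around a user access. */
static inline void smap_model_demo(void)
{
    uint64_t flags = 0x2;                          /* hypothetical starting value */
    assert(smap_blocks_user_access(true, flags));  /* before stac: blocked */
    flags = model_stac(flags);
    assert(!smap_blocks_user_access(true, flags)); /* inside the window: allowed */
    flags = model_clac(flags);
    assert(smap_blocks_user_access(true, flags));  /* after clac: blocked again */
}
```

The point of the model is that stac/clac only toggle a permission window; they say nothing about whether the pointer being accessed is a user pointer at all.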
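Next it calls unsafe_put_user.
The comment on that macro reminds us that unsafe_put_user must be used together with user_access_begin/end(), and that an access_ok() check must be done before it!!! But here there is no access_ok().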
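access_ok does the following:
1. user_addr_max() returns current->thread.addr_limit.seg, the upper bound of user-space addresses.
2. __chk_user_ptr checks that our addr argument is a user-space pointer (this is a static check for sparse, not a runtime test).
3. __range_not_ok compares addr + size against limit, i.e. checks that addr + size also still points into user space.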
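The range check in step 3 can be sketched in user-space C as follows. This is a simplified model, not the kernel's code: model_range_not_ok and model_access_ok are illustrative names, and MODEL_USER_ADDR_MAX is an assumed limit (the usual top of the 47-bit x86-64 user range) standing in for user_addr_max().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: top of the x86-64 user address range. */
#define MODEL_USER_ADDR_MAX UINT64_C(0x00007ffffffff000)

/* Returns true when [addr, addr + size) is NOT entirely below limit. */
static inline bool model_range_not_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
    uint64_t end = addr + size;   /* may wrap around */
    if (end < size)               /* wrapped: definitely invalid */
        return true;
    return end > limit;
}

/* Model of access_ok(): the block must lie inside the user range. */
static inline bool model_access_ok(uint64_t addr, uint64_t size)
{
    return !model_range_not_ok(addr, size, MODEL_USER_ADDR_MAX);
}

/* Usage: a user pointer passes, a kernel pointer does not. */
static inline void access_ok_model_demo(void)
{
    assert(model_access_ok(UINT64_C(0x400000), 128));
    assert(!model_access_ok(UINT64_C(0xffff880000000000), 128));
}
```

A kernel address such as 0xffff880000000000 fails this check, which is exactly the test that is missing from waitid.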
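Let's look at one x86 caller of access_ok:
Here it checks that the address of frame is a user-space address. setup_frame is part of signal delivery: the kernel must first return to user mode to run the signal handler, and once the handler has finished, return to kernel mode to complete the cleanup. When returning from kernel mode to user mode, the CPU takes the user-mode return address (the address of the instruction after the system call) from the kernel stack. To make the signal handler run first, Linux changes this return address to the handler's entry point, so that the return from the system call lands in the signal handler.
When the user-mode signal handler finishes, control switches back to kernel mode through the sigreturn() system call, which restores the original contents of the kernel stack. (SROP attacks exist precisely because sigreturn loads this data into registers without validating it.)
The kernel then finishes its cleanup.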
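Now let's walk through the source again:
1. Because access_ok is not called before unsafe_put_user, unsafe_put_user(x, ptr, err_label) writes x to an arbitrary ptr.
2. More specifically, with unsafe_put_user(0, &infop->si_errno, Efault) we can write zero (si_errno is an int, so four zero bytes) to an arbitrary address that we control!
3. If we write that zero over cred.uid, we get privilege escalation.
4. However, there is no particularly general way to locate a process's cred structure from within the process.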
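To see where exactly the zero from step 2 lands relative to infop, the layout of siginfo_t can be queried from user space. This is a small sketch with made-up helper names (si_errno_offset, si_errno_width); it assumes a libc whose siginfo_t exposes si_errno as a plain member, as glibc does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Byte offset of si_errno inside the siginfo_t that infop points to. */
static inline size_t si_errno_offset(void)
{
    return offsetof(siginfo_t, si_errno);
}

/* Number of bytes overwritten when 0 is stored into si_errno. */
static inline size_t si_errno_width(void)
{
    return sizeof(((siginfo_t *)0)->si_errno);
}

/* Usage: the zero write covers the int that follows si_signo. */
static inline void siginfo_layout_demo(void)
{
    assert(si_errno_offset() == sizeof(int));
    assert(si_errno_width() == sizeof(int));
}
```

On common Linux targets both values are 4, so the write of 0 covers the four bytes at infop + 4; the neighbouring fields (si_signo, si_code, si_pid, si_uid, si_status) are written as well, with values the caller controls far less.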
Basic idea: this vulnerability lets an unprivileged user pass a kernel address as infop when calling waitid(), and the kernel writes to it without any check.
infop is a struct siginfo __user *.
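Here are several usable attack approaches:
1. Heap spray: call fork() many times to create processes, each with its own cred structure, then use the arbitrary write on the uid of some process's cred and call getuid() to check whether any process's uid has been zeroed (privilege escalation).
2. ret2dir: first find the physmap address that aliases the user-space region, write the payload so that it appears in physmap, then find the kernel virtual address of that physmap alias, and finally redirect kernel-mode control flow to it.
3. Brute-force the address of struct file, find the pointer in the file structure to the current cred structure, and then use the arbitrary write directly on the current cred structure. (When I did the kernel pwn challenge in the Xiangyun Cup (祥云杯), brute-forcing the file structure with KASLR off seemed faster than brute-forcing the cred structure directly?)
4. In "exploit_null_ptr_deref analysis notes", De4dCr0w mentions that overwriting have_canfork_callback can trigger a null pointer dereference in fork() and thereby escalate privileges.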
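Consider the following call flow:
In cgroup_can_fork:
it calls do_each_subsys_mask.
Here ss_mask is have_canfork_callback, ss is an uninitialized struct cgroup_subsys pointer, CGROUP_SUBSYS_COUNT is 0, and ssid is i.
The address of our have_canfork_callback is passed as addr (pointing to the bitmap), bit is ssid (that is, i), and size is CGROUP_SUBSYS_COUNT (the search range).
Linux Kernel Inside explains these macros as follows: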
All of these macros provide iterator over certain set of bits in a bit array. The first macro iterates over bits which are set, the second does the same, but starts from a certain bits. The last two macros do the same, but iterates over clear bits.
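In other words, for_each_set_bit iterates over the bits that are set.
Specifically, find_first_bit finds the first bit set to 1 in the bitmap.
find_next_bit continues within the search range, starting at bit+1, and finds the next bit set to 1.
So here for_each_set_bit walks every set bit within the range. Each iteration yields the position of a set bit in the bitmap have_canfork_callback that is below CGROUP_SUBSYS_COUNT, the last iteration yielding the highest such bit (this is ssid in the calling function).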
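The iteration described above can be sketched with a one-word bitmap. This is a simplified user-space model, not the kernel implementation: the kernel functions take a pointer to an array of unsigned long, whereas the model_* helpers here (names invented for illustration) take a single word and assume size is at most the number of bits in unsigned long.

```c
#include <assert.h>
#include <stddef.h>

/* Model of find_next_bit: index of the first set bit at or after start,
 * or size when there is none. */
static inline unsigned model_find_next_bit(unsigned long map, unsigned size,
                                           unsigned start)
{
    for (unsigned i = start; i < size; i++)
        if (map & (1UL << i))
            return i;
    return size;
}

/* Model of find_first_bit: same search, starting at bit 0. */
static inline unsigned model_find_first_bit(unsigned long map, unsigned size)
{
    return model_find_next_bit(map, size, 0);
}

/* Model of for_each_set_bit: store every set-bit index below size in out[]
 * (at most cap entries) and return how many were stored. */
static inline size_t model_collect_set_bits(unsigned long map, unsigned size,
                                            unsigned *out, size_t cap)
{
    size_t n = 0;
    for (unsigned bit = model_find_first_bit(map, size);
         bit < size && n < cap;
         bit = model_find_next_bit(map, size, bit + 1))
        out[n++] = bit;
    return n;
}

/* Usage: bits 0, 2 and 12 are set, but only 0 and 2 are below size 12. */
static inline void bitmap_model_demo(void)
{
    unsigned found[8];
    size_t n = model_collect_set_bits(0x1005UL, 12, found, 8);
    assert(n == 2 && found[0] == 0 && found[1] == 2);
}
```

Bits at or above size are never visited, which is why the value of CGROUP_SUBSYS_COUNT matters for which bits of have_canfork_callback can select a subsystem.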
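That position is then used as an array index: the value at cgroup_subsys[ssid] is assigned to ss.
Finally, cgroup_can_fork makes the call ret = ss->can_fork(child). cgroup_subsys is a table of function pointers (a vtable), as follows:
This table is empty when it is first created, which means ss->can_fork is NULL.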
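This rdi is have_canfork_callback.
The return value of this find_first_bit call is as follows:
Next comes a cmp: if the return value is less than or equal to 0xc, execution jumps and calls can_fork at offset 0x50.
If we make can_fork 0 and then get the kernel to call it, control flow is transferred to address 0. If we have placed shellcode at address 0 beforehand, that shellcode gets executed.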
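This is where I ran into a problem:
At mmap time:
the low address cannot be mapped, and the exploit falls apart.
I tried the following method, without success: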
https://blog.csdn.net/cosmoslhf/article/details/39101999
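However, I found that if I boot the kernel as root, mapping address 0 succeeds. cap_capable is what checks the process's privileges here.
This article also mentions that mapping address 0 has to be done as root: https://www.cnblogs.com/redstar9451/p/6645579.html
It is also discussed here: https://blog.csdn.net/airuozhaoyang/article/details/99300030
Because kernel space and user space share the virtual address space, user-space mmap must be prevented from mapping memory starting at 0, which mitigates NULL dereference attacks. Windows has prohibited allocating the zero page since Windows 8. Since Linux kernel 2.6.22 this protection can be configured by setting mmap_min_addr through sysctl. Since Ubuntu 9.04 the mmap_min_addr setting has been built into the kernel (64k on x86, 32k on ARM).
This looks like a protection mechanism; see mmap_min_addr for more detail.
I tried the approach above to turn it off, but it did not work.
The write-ups I found online that reproduce this CVE on 4.13.0 do not mention this problem, so to keep studying the exploitation process we switch to booting as root and carry on. (If anyone knows how to solve this, advice is welcome!)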
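The following sketch shows one way to inspect the setting discussed above. It reads the procfs file behind the vm.mmap_min_addr sysctl and models the rule as I understand it: a mapping below mmap_min_addr is refused unless the process holds CAP_SYS_RAWIO, which would explain why root succeeds. The helper names read_mmap_min_addr and model_may_map_at are hypothetical, and the model deliberately ignores LSM-specific limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Read vm.mmap_min_addr from procfs; returns -1 if it cannot be read. */
static inline long read_mmap_min_addr(void)
{
    long value = -1;
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Model of the rule: mapping below min_addr needs CAP_SYS_RAWIO. */
static inline bool model_may_map_at(unsigned long addr, unsigned long min_addr,
                                    bool has_cap_sys_rawio)
{
    return addr >= min_addr || has_cap_sys_rawio;
}

/* Usage: with a 64k minimum, page 0 is only mappable with the capability. */
static inline void mmap_min_addr_demo(void)
{
    assert(!model_may_map_at(0, 65536, false));
    assert(model_may_map_at(0, 65536, true));
}
```

If read_mmap_min_addr() reports a non-zero value inside the test VM, an unprivileged mmap at address 0 is expected to fail regardless of the exploit logic.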
First we set a breakpoint here:
pwndbg> b cgroup_can_fork
Breakpoint 1 at 0xffffffff811264a0: file kernel/cgroup/cgroup.c, line 4812.
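The problem is that in my 4.13 build the default value of CGROUP_SUBSYS_COUNT is not 4 but 0xc, and the can_fork function in a new cgroup_subsys is not 0 either: it is already defined as pids_can_fork.
This initialization happens at: https://elixir.bootlin.com/linux/v4.13/source/kernel/cgroup/pids.c#L343
This version of the exp failed for me, for these reasons:
1. A non-root user cannot map address 0.
2. cgroup_subsys is initialized directly, so can_fork is not 0.
Still, here are blog posts from people whose address-0 exp succeeded: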
https://bbs.pediy.com/thread-247014.htm
https://freewechat.com/a/MjM5NTc2MDYxMw==/2458292173/1
https://x3h1n.github.io/2019/12/30/CVE-2017-5123%E5%A4%8D%E7%8E%B0/
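The corresponding part is at:
Conditions:
1. We know that unsafe_put_user lets us write 0 to any location in the kernel.
2. If we know where some cred structure is, we can write its uid and euid directly and escalate privileges.
3. waitid does not crash on an invalid memory access; it returns an error code (-EFAULT). This makes it possible to brute-force or probe memory.
Method:
1. Use clone to create many lightweight processes, so that the kernel holds many cred structures.
2. Observe where the euid of each cred structure lands. This can be done with the following driver: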
https://reverse.put.as/2017/11/07/exploiting-cve-2017-5123/
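The offsets do fall within a certain range.
Result:
Here cve-2017-5123 is a file with mode 400 whose content is the string success.
We can see that euid has now been hijacked.
euid is now 0, which is equivalent to root, and the root-only file has been read.
PS: for why setuid is needed after hijacking euid, see: https://www.douban.com/note/310087353/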
patch:waitid(): switch copyout of siginfo to unsafe_put_user()
Supervisor Mode Access Prevention
Kernel exploitation - CVE-2017-5123 PoC e Writeup
Exploiting CVE-2017-5123 with full protections. SMEP, SMAP, and the Chrome Sandbox!
Bit arrays and bit operations in the Linux kernel
Linux kernel tracing: a very detailed explanation of the find_next_bit function
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/wait.h>
#include <string.h>

int main() {
    siginfo_t info;
    id_t id = 0;    // ignored when idtype is P_ALL
    int ret_val;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork failed");
        exit(-1);
    } else if (pid == 0) {
        // in child process
        printf("This is child process!\n");
        exit(8);
    } else {
        // in parent process
        memset(&info, 0, sizeof(siginfo_t));
        // P_ALL means wait for any child process
        // the exit information is returned in the info structure
        // Wait for children that have terminated.
        ret_val = waitid(P_ALL, id, &info, WEXITED);
        if (ret_val < 0) {
            perror("waitid failed");
            exit(-2);
        }
        if (info.si_code == CLD_EXITED) {
            printf("si_code: _exit\n");
        }
        printf("si_status = %d\n", info.si_status);
    }
    return 0;
}
typedef struct siginfo {
    int si_signo;
    int si_errno;
    int si_code;

    union {
        int _pad[SI_PAD_SIZE];

        /* kill() */
        struct {
            __kernel_pid_t _pid;    /* sender's pid */
            __ARCH_SI_UID_T _uid;   /* sender's uid */
        } _kill;

        /* POSIX.1b timers */
        struct {
            __kernel_timer_t _tid;  /* timer id */
            int _overrun;           /* overrun count */
            char _pad[sizeof( __ARCH_SI_UID_T) - sizeof(int)];
            sigval_t _sigval;       /* same as below */
            int _sys_private;       /* not to be passed to user */
        } _timer;

        /* POSIX.1b signals */
        struct {
            __kernel_pid_t _pid;    /* sender's pid */
            __ARCH_SI_UID_T _uid;   /* sender's uid */
            sigval_t _sigval;
        } _rt;

        /* SIGCHLD */
        struct {
            __kernel_pid_t _pid;    /* which child */
            __ARCH_SI_UID_T _uid;   /* sender's uid */
            int _status;            /* exit code */
            __ARCH_SI_CLOCK_T _utime;
            __ARCH_SI_CLOCK_T _stime;
        } _sigchld;

        /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
        struct {
            void __user *_addr;     /* faulting insn/memory ref. */
#ifdef __ARCH_SI_TRAPNO
            int _trapno;            /* TRAP # which caused the signal */
#endif
            short _addr_lsb;        /* LSB of the reported address */
            union {
                /* used when si_code=SEGV_BNDERR */
                struct {
                    void __user *_lower;
                    void __user *_upper;
                } _addr_bnd;
                /* used when si_code=SEGV_PKUERR */
                __u32 _pkey;
            };
        } _sigfault;

        /* SIGPOLL */
        struct {
            __ARCH_SI_BAND_T _band; /* POLL_IN, POLL_OUT, POLL_MSG */
            int _fd;
        } _sigpoll;

        /* SIGSYS */
        struct {
            void __user *_call_addr; /* calling user insn */
            int _syscall;            /* triggering system call number */
            unsigned int _arch;      /* AUDIT_ARCH_* of syscall */
        } _sigsys;
    } _sifields;
} __ARCH_SI_ATTRIBUTES siginfo_t;
SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
                infop, int, options, struct rusage __user *, ru)
{
    struct rusage r;
    struct waitid_info info = {.status = 0};
    long err = kernel_waitid(which, upid, &info, options, ru ? &r : NULL);
    int signo = 0;

    if (err > 0) {
        signo = SIGCHLD;
        err = 0;
    }

    if (!err) {
        if (ru && copy_to_user(ru, &r, sizeof(struct rusage)))
            return -EFAULT;
    }
    if (!infop)
        return err;

    user_access_begin();    // temporarily disable SMAP
    unsafe_put_user(signo, &infop->si_signo, Efault);
    unsafe_put_user(0, &infop->si_errno, Efault);
    unsafe_put_user((short)info.cause, &infop->si_code, Efault);
    unsafe_put_user(info.pid, &infop->si_pid, Efault);
    unsafe_put_user(info.uid, &infop->si_uid, Efault);
    unsafe_put_user(info.status, &infop->si_status, Efault);
    user_access_end();      // re-enable SMAP
    return err;
Efault:
    user_access_end();
    return -EFAULT;
}
#define user_access_begin() __uaccess_begin()
#define user_access_end() __uaccess_end()
#define __uaccess_begin() stac()
#define __uaccess_end() clac()
#ifdef CONFIG_X86_SMAP    // is SMAP enabled?

static __always_inline void clac(void)
{
    /* Note: a barrier is implicit in alternative() */
    alternative("", __stringify(__ASM_CLAC), X86_FEATURE_SMAP);
}

static __always_inline void stac(void)
{
    /* Note: a barrier is implicit in alternative() */
    alternative("", __stringify(__ASM_STAC), X86_FEATURE_SMAP);
}

// where __ASM_CLAC and __ASM_STAC are:
/* "Raw" instruction opcodes */
#define __ASM_CLAC .byte 0x0f,0x01,0xca
#define __ASM_STAC .byte 0x0f,0x01,0xcb
/*
 * The "unsafe" user accesses aren't really "unsafe", but the naming
 * is a big fat warning: you have to not only do the access_ok()
 * checking before using them, but you have to surround them with the
 * user_access_begin/end() pair.
 */
#define user_access_begin() __uaccess_begin()
#define user_access_end() __uaccess_end()

#define unsafe_put_user(x, ptr, err_label)                                  \
do {                                                                        \
    int __pu_err;                                                           \
    __typeof__(*(ptr)) __pu_val = (x);                                      \
    __put_user_size(__pu_val, (ptr), sizeof(*(ptr)), __pu_err, -EFAULT);    \
    if (unlikely(__pu_err)) goto err_label;                                 \
} while (0)
/**
 * access_ok: - Checks if a user space pointer is valid
 * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE.  Note that
 *        %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
 *        to write to a block, it is always safe to read from it.
 * @addr: User space pointer to start of block to check
 * @size: Size of block to check
 *
 * Context: User context only. This function may sleep if pagefaults are
 *          enabled.
 *
 * Checks if a pointer to a block of memory in user space is valid.
 *
 * Returns true (nonzero) if the memory block may be valid, false (zero)
 * if it is definitely invalid.
 *
 * Note that, depending on architecture, this function probably just
 * checks that the pointer is in the user space range - after calling
 * this function, memory access functions may still return -EFAULT.
 */
#define access_ok(type, addr, size)                             \
({                                                              \
    WARN_ON_IN_IRQ();                                           \
    likely(!__range_not_ok(addr, size, user_addr_max()));       \
})
static int __setup_frame(int sig, struct ksignal *ksig,
                         sigset_t *set, struct pt_regs *regs)
{
    struct sigframe __user *frame;
    void __user *restorer;
    int err = 0;
    void __user *fpstate = NULL;

    frame = get_sigframe(&ksig->ka, regs, sizeof(*frame), &fpstate);

    if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
        return -EFAULT;
    ......
}
struct cred {
    ......
    kuid_t uid;    /* real UID of the task */
    kgid_t gid;    /* real GID of the task */
    ......
};
fork()
_do_fork()
copy_process()
    /*
     * Ensure that the cgroup subsystem policies allow the new process to be
     * forked. It should be noted the the new process's css_set can be changed
     * between here and cgroup_post_fork() if an organisation operation is in
     * progress.
     */
    retval = cgroup_can_fork(p);    // does cgroup allow the new process to be forked?