[Original] Linux kernel CVE-2017-5123 (waitid): root-cause analysis
Posted: 2021-1-13 13:18


CVE-2017-5123 is a vulnerability in the Linux kernel's waitid system call.

waitid is similar to wait and waitpid.

https://elixir.bootlin.com/linux/v4.13/source/kernel/exit.c

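Looking at the source, the first thing the copy-out path does is call user_access_begin().

This is really a call to stac(), with clac() as its counterpart in user_access_end().

A quick note on CLAC and STAC. With SMAP enabled, the kernel cannot freely read or write user-space data (which also makes kernel accesses to user space less efficient). When the kernel does have a legitimate need to read or write user space, SMAP has to be suspended temporarily.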

However, the use of SMAP in an operating system may lead to a larger kernel size and slower user-space memory accesses from supervisor code, because SMAP must be temporarily disabled any time supervisor code intends to access user-space memory.

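So what suspends SMAP? The AC (Alignment Check) flag in EFLAGS. SMAP support itself is reported in the Extended Features CPUID leaf and enabled through CR4.SMAP.

The sequence is:

1. stac sets EFLAGS.AC, which temporarily disables SMAP so that the kernel can read and write user data.

2. Once the kernel has finished accessing user data, SMAP must be restored, so clac clears EFLAGS.AC. That completes the sequence.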
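The stac/clac sequence can be modelled in user space as plain bit manipulation. The following is a minimal sketch, not kernel code: the function names are made up for illustration, and the only hardware facts it relies on are that AC is bit 18 of EFLAGS and SMAP is bit 21 of CR4.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_SMAP  (1ULL << 21)  /* CR4.SMAP: SMAP enabled by the OS */
#define EFLAGS_AC (1ULL << 18)  /* EFLAGS.AC: set by stac, cleared by clac */

/* Model only: would a supervisor-mode data access to this page fault? */
bool smap_blocks_access(uint64_t cr4, uint64_t eflags, bool page_is_user)
{
    if (!page_is_user)
        return false;              /* SMAP only guards user pages */
    if (!(cr4 & CR4_SMAP))
        return false;              /* feature not enabled */
    return !(eflags & EFLAGS_AC);  /* AC=1 suspends the protection */
}

/* stac/clac modelled as pure functions on an EFLAGS value */
uint64_t model_stac(uint64_t eflags) { return eflags | EFLAGS_AC; }
uint64_t model_clac(uint64_t eflags) { return eflags & ~EFLAGS_AC; }
```

Note the first branch: SMAP never stops the kernel from writing to a kernel address, so it offers no protection when infop itself points into kernel memory.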

Next comes unsafe_put_user.

Its comment first reminds us that unsafe_put_user has to be wrapped in user_access_begin/end().

It also says access_ok() must be called beforehand. But waitid never calls access_ok()!

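access_ok does the following:

1. user_addr_max() returns current->thread.addr_limit.seg, the upper bound of user-space addresses.

2. __chk_user_ptr checks that the addr argument is a user-space pointer (this is a static check for the sparse tool, not a runtime test).

3. __range_not_ok compares addr + size against limit, i.e. it checks that addr + size still points into user space.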
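The range check in step 3 can be sketched in user space as follows. This is a model under stated assumptions, not the kernel's implementation: the function names are illustrative, and the limit is hard-coded to the x86-64 value with 4-level paging, (1UL << 47) - PAGE_SIZE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: x86-64 with 4-level paging, where the user limit is
 * TASK_SIZE_MAX = (1UL << 47) - PAGE_SIZE. */
#define MODEL_USER_ADDR_MAX 0x00007ffffffff000ULL

/* true if [addr, addr + size) is NOT entirely below limit */
bool model_range_not_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
    uint64_t end = addr + size;
    if (end < size)          /* addr + size wrapped around */
        return true;
    return end > limit;
}

bool model_access_ok(uint64_t addr, uint64_t size)
{
    return !model_range_not_ok(addr, size, MODEL_USER_ADDR_MAX);
}
```

With this model a typical stack address such as 0x00007ffd12340000 passes, while a kernel text address such as 0xffffffff81000000 is rejected. That rejection is exactly what the vulnerable waitid path skips.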

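Here is another use of access_ok in the x86 code:

It checks that the address of frame lies in user space. setup_frame is part of signal delivery: the kernel first has to return to user mode to run the signal handler, and once the handler finishes, go back to kernel mode to finish up. When returning from kernel mode to user mode, the CPU takes the user-mode return address (the instruction after the system call) from the kernel stack. To get the handler to run first, Linux changes this return address to the handler's entry point, so the return from the system call lands in the signal handler.

When the user-mode signal handler finishes, control switches back to kernel mode through the sigreturn() system call, which restores the original contents of the kernel stack. Because sigreturn loads that data into registers without validating it, this is also where the SROP attack comes from.

Finally the kernel finishes its cleanup.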

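Let's go through the source again:

1. Because access_ok is not called before unsafe_put_user, unsafe_put_user(x, ptr, err_label) writes x to an arbitrary ptr.

2. More specifically, unsafe_put_user(0, &infop->si_errno, Efault) lets us write zero to an arbitrary address that we control! (si_errno is an int, so this is four zero bytes rather than a single null byte.)

3. If we write that zero over cred.uid, we get privilege escalation.

4. However, there is no particularly general way for a process to find the address of its own cred structure.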

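The basic idea: this vulnerability lets an unprivileged user pass a kernel address as infop when calling waitid(), and the kernel writes to it without any check.

infop is a struct siginfo __user *.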

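Several attack approaches are possible:

1. Heap spray: call fork() many times. Each process gets its own cred structure. Use the arbitrary write against the uid of one of those creds, then call getuid() in each process to see whether any uid has been zeroed (privilege escalation).

2. ret2dir: first find the physmap address shared by the user region and the kernel region, write the payload into physmap, then find the kernel virtual address of that physmap page, and finally redirect kernel-mode execution there.

3. Brute-force the address of a struct file, follow its pointer to the current cred structure, and then write to that cred directly. (When I did the kernel pwn challenge at the Xiangyun Cup (祥云杯), brute-forcing the file structure seemed faster than brute-forcing cred directly with KASLR off?)

4. In the exploit_null_ptr_deref analysis notes, De4dCr0w mentions overwriting have_canfork_callback so that fork() triggers a NULL pointer dereference, which can be used for privilege escalation.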

Consider the following call flow:

In cgroup_can_fork:

do_each_subsys_mask is called.

Here ss_mask is have_canfork_callback, and ss is an uninitialized struct cgroup_subsys pointer. CGROUP_SUBSYS_COUNT is 0.

ssid is i.

The address of have_canfork_callback is passed as addr (it points to the bitmap), bit is ssid (that is, i), and size is CGROUP_SUBSYS_COUNT (the search range).

Linux Kernel Inside explains these macros as follows:

All of these macros provide iterator over certain set of bits in a bit array. The first macro iterates over bits which are set, the second does the same, but starts from a certain bits. The last two macros do the same, but iterates over clear bits.

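In other words, for_each_set_bit iterates over the bits that are set.

Specifically, find_first_bit finds the first bit in the bitmap that is 1.

find_next_bit continues from bit+1 and finds the next bit that is 1 within the search range.

So for_each_set_bit visits every set bit within the range. Each value it yields is the position of a set bit in the bitmap have_canfork_callback below CGROUP_SUBSYS_COUNT (in the calling function this is ssid).

That position is then used as an array index: the value at cgroup_subsys[ssid] is assigned to ss.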
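The iteration described above can be sketched with a single-word bitmap. This is a simplified model, not the kernel's implementation: the model_ names are hypothetical, and it assumes size does not exceed the number of bits in an unsigned long.

```c
/* Index of the first set bit at position >= offset, or `size` if none. */
unsigned model_find_next_bit(unsigned long word, unsigned size, unsigned offset)
{
    for (unsigned bit = offset; bit < size; bit++)
        if (word & (1UL << bit))
            return bit;
    return size;
}

unsigned model_find_first_bit(unsigned long word, unsigned size)
{
    return model_find_next_bit(word, size, 0);
}

/* Same shape as the kernel macro: start at the first set bit, then keep
 * asking for the next set bit after the current one. */
#define model_for_each_set_bit(bit, word, size)                 \
    for ((bit) = model_find_first_bit((word), (size));          \
         (bit) < (size);                                        \
         (bit) = model_find_next_bit((word), (size), (bit) + 1))

/* Collects the visited bit indices into out[]; returns how many. */
unsigned collect_set_bits(unsigned long word, unsigned size, unsigned *out)
{
    unsigned bit, n = 0;

    model_for_each_set_bit(bit, word, size)
        out[n++] = bit;
    return n;
}
```

For the mask 0x100a with size 12, the loop visits bits 1 and 3 and ignores bit 12, because it lies outside the search range.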

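Finally, cgroup_can_fork makes the call ret = ss->can_fork(child). cgroup_subsys is a table of function pointers, shown below:

This table starts out empty, which means ss->can_fork is NULL.

Here rdi is have_canfork_callback.

The return value of find_first_bit at this point is:

Next comes a cmp: if the return value is less than or equal to 0xc, the branch is taken and can_fork at offset 0x50 is called.

If can_fork is 0 and it gets called, control flow transfers to address 0. If we place shellcode at address 0 beforehand, that shellcode gets executed.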
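The dispatch hazard can be modelled without touching the kernel. In the sketch below, struct model_subsys and both functions are hypothetical stand-ins: the table plays the role of cgroup_subsys[], and the mask plays the role of have_canfork_callback. Instead of calling through the pointer, the model reports the first entry where the kernel would end up calling address 0.

```c
#include <stddef.h>

struct model_subsys {
    const char *name;
    int (*can_fork)(void);   /* NULL when the subsystem has no such hook */
};

/* Returns the index of the first subsystem whose bit is set in `mask`
 * but whose can_fork pointer is NULL, or -1 if every selected entry
 * has a callback. Assumes count is below the bit width of unsigned long. */
int first_null_callback(unsigned long mask,
                        const struct model_subsys *table, unsigned count)
{
    for (unsigned ssid = 0; ssid < count; ssid++) {
        if (!(mask & (1UL << ssid)))
            continue;                 /* bit clear: subsystem skipped */
        if (table[ssid].can_fork == NULL)
            return (int)ssid;         /* an unchecked call would jump to 0 */
    }
    return -1;
}

/* Stand-in for a subsystem that does define the hook. */
int model_pids_can_fork(void) { return 0; }
```

As long as the mask only has bits for entries that define the hook, the result is -1. A corrupted mask that selects an entry without the hook is what turns the call into a NULL function pointer dereference.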

This is where I ran into a problem:

At mmap time:

The low address cannot be mapped... and everything blows up.

I tried the following, without success:

https://blog.csdn.net/cosmoslhf/article/details/39101999

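However, I found that if I boot the kernel as root, mapping address 0 succeeds. The cap_capable seen here is checking the process's capabilities.

This article also mentions that mapping address 0 has to be done as root: https://www.cnblogs.com/redstar9451/p/6645579.html

This one says (https://blog.csdn.net/airuozhaoyang/article/details/99300030):

Because kernel space and user space share one virtual address space, user-space mmap must be prevented from mapping memory starting at 0, which mitigates NULL dereference attacks. Windows has disallowed allocating the zero page since Windows 8. Since Linux 2.6.22 this protection can be configured by setting mmap_min_addr via sysctl. Since Ubuntu 9.04 the mmap_min_addr setting is built into the kernel (64k on x86, 32k on ARM).

This appears to be a protection mechanism; see mmap_min_addr for more detail.

I tried the approach above to turn it off, but that failed.

The write-ups I found online that reproduce this CVE on 4.13.0 do not mention this problem, so in order to study the exploitation process I switched to booting as root and carried on. (If anyone knows how to solve this, advice is welcome!)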

First, set a breakpoint here:

pwndbg> b cgroup_can_fork
Breakpoint 1 at 0xffffffff811264a0: file kernel/cgroup/cgroup.c, line 4812.

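The problem is that in my 4.13 build the default value of CGROUP_SUBSYS_COUNT is not 4 but 0xc, and can_fork in a fresh cgroup_subsys is not 0 either: it is already set to pids_can_fork.

That initialization is here: https://elixir.bootlin.com/linux/v4.13/source/kernel/cgroup/pids.c#L343

This version of the exp failed for me, for two reasons:

1. A non-root user cannot map address 0.

2. cgroup_subsys is initialized statically, so can_fork is not 0.

Still, here are blog posts from people who got the address-0 exp to work: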

https://bbs.pediy.com/thread-247014.htm

https://freewechat.com/a/MjM5NTc2MDYxMw==/2458292173/1

https://x3h1n.github.io/2019/12/30/CVE-2017-5123%E5%A4%8D%E7%8E%B0/

[image]
The corresponding location is:
[image]

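Preconditions:

1. We can write 0 to any kernel address through unsafe_put_user.

2. If we know the address of some cred structure, writing to its uid and euid directly gives privilege escalation.

3. On an invalid memory access waitid does not crash; it returns an error code (-EFAULT). This makes it possible to brute-force or probe memory.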
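For precondition 2, it helps to know where uid and euid sit inside a cred. The mock below mirrors only the leading fields of struct cred as declared in v4.13, under the assumption that CONFIG_DEBUG_CREDENTIALS is off (with it on, extra fields precede uid and every offset shifts). It is a user-space stand-in, not the kernel definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock of the leading fields of struct cred in v4.13, assuming
 * CONFIG_DEBUG_CREDENTIALS is off. kuid_t/kgid_t wrap a 32-bit value. */
struct mock_cred {
    int32_t  usage;   /* atomic_t in the kernel */
    uint32_t uid;     /* real UID */
    uint32_t gid;
    uint32_t suid;
    uint32_t sgid;
    uint32_t euid;    /* effective UID */
    uint32_t egid;
    uint32_t fsuid;
    uint32_t fsgid;
};

size_t mock_uid_offset(void)  { return offsetof(struct mock_cred, uid); }
size_t mock_euid_offset(void) { return offsetof(struct mock_cred, euid); }
```

Under these assumptions uid is at offset 4 and euid at offset 20 from the start of the structure.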

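Method:

1. Use clone to create many lightweight processes, so that the kernel holds many cred structures.

2. Observe where the euid of each cred structure lives. This can be done with the following driver: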

https://reverse.put.as/2017/11/07/exploiting-cve-2017-5123/

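The offsets stay within a certain range.

Result:

Here cve-2017-5123 is a file with mode 400 whose content is the string success.

euid has now been hijacked.

With euid 0 the process is equivalent to root, and it can read the root-only file.

PS: for why setuid is needed after hijacking euid, see https://www.douban.com/note/310087353/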
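The state right after the hijack is "effective UID 0, real UID unchanged". A minimal sketch for inspecting that state from inside a process, using only getuid() and geteuid(); the helper names are illustrative:

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* true when the effective UID differs from the real UID, which is the
 * state a process is in when only euid has been changed. */
bool ids_differ(void)
{
    return getuid() != geteuid();
}

/* true when the effective UID is root's. */
bool euid_is_root(void)
{
    return geteuid() == 0;
}
```

This mismatch is probably why the extra setuid call is wanted: some programs, such as bash started without -p, drop privileges when the two IDs differ, whereas setuid(0) called with an effective UID of 0 sets the real, effective and saved UIDs to 0 together.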

References:

waitpid(2) - Linux man page

patch: waitid(): switch copyout of siginfo to unsafe_put_user()

Supervisor Mode Access Prevention

The Linux signal mechanism

Kernel exploitation - CVE-2017-5123 PoC e Writeup

Exploiting CVE-2017-5123 with full protections. SMEP, SMAP, and the Chrome Sandbox!

Bit arrays and bit operations in the Linux kernel

Linux kernel tracing: the find_next_bit function explained in detail

Reproducing CVE-2017-5123

Exploiting CVE-2017-5123

 
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/wait.h>
#include <string.h>
 
int main(){
 
    siginfo_t info;
    id_t id = 0;    /* ignored when idtype is P_ALL */
    int ret_val;
 
    pid_t pid = fork();
    if(pid < 0){
        perror("fork failed");
        exit(-1);
    }
 
    else if(pid == 0){
        // in child process
        printf("This is child process!\n");
        exit(8);
    }
    else{
        // in parent process
        memset(&info, 0 , sizeof(siginfo_t));
 
 
        // P_ALL: wait for any child process
        // the exit information is returned in the info structure
        // WEXITED: wait for children that have terminated.
        ret_val = waitid(P_ALL, id , &info , WEXITED);
        if(ret_val < 0){
            perror("waitid failed");
            exit(-2);
        }
        if(info.si_code == CLD_EXITED)
        {
            printf("si_code: _exit\n");
        }      
        printf("si_status = %d\n" , info.si_status);
    }
    return 0;
}
typedef struct siginfo {
    int si_signo;
    int si_errno;
    int si_code;
 
    union {
        int _pad[SI_PAD_SIZE];
 
        /* kill() */
        struct {
            __kernel_pid_t _pid;    /* sender's pid */
            __ARCH_SI_UID_T _uid;    /* sender's uid */
        } _kill;
 
        /* POSIX.1b timers */
        struct {
            __kernel_timer_t _tid;    /* timer id */
            int _overrun;        /* overrun count */
            char _pad[sizeof( __ARCH_SI_UID_T) - sizeof(int)];
            sigval_t _sigval;    /* same as below */
            int _sys_private;       /* not to be passed to user */
        } _timer;
 
        /* POSIX.1b signals */
        struct {
            __kernel_pid_t _pid;    /* sender's pid */
            __ARCH_SI_UID_T _uid;    /* sender's uid */
            sigval_t _sigval;
        } _rt;
 
        /* SIGCHLD */
        struct {
            __kernel_pid_t _pid;    /* which child */
            __ARCH_SI_UID_T _uid;    /* sender's uid */
            int _status;        /* exit code */
            __ARCH_SI_CLOCK_T _utime;
            __ARCH_SI_CLOCK_T _stime;
        } _sigchld;
 
        /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
        struct {
            void __user *_addr; /* faulting insn/memory ref. */
#ifdef __ARCH_SI_TRAPNO
            int _trapno;    /* TRAP # which caused the signal */
#endif
            short _addr_lsb; /* LSB of the reported address */
            union {
                /* used when si_code=SEGV_BNDERR */
                struct {
                    void __user *_lower;
                    void __user *_upper;
                } _addr_bnd;
                /* used when si_code=SEGV_PKUERR */
                __u32 _pkey;
            };
        } _sigfault;
 
        /* SIGPOLL */
        struct {
            __ARCH_SI_BAND_T _band;    /* POLL_IN, POLL_OUT, POLL_MSG */
            int _fd;
        } _sigpoll;
 
        /* SIGSYS */
        struct {
            void __user *_call_addr; /* calling user insn */
            int _syscall;    /* triggering system call number */
            unsigned int _arch;    /* AUDIT_ARCH_* of syscall */
        } _sigsys;
    } _sifields;
} __ARCH_SI_ATTRIBUTES siginfo_t;
SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
        infop, int, options, struct rusage __user *, ru)
{
    struct rusage r;
    struct waitid_info info = {.status = 0};
    long err = kernel_waitid(which, upid, &info, options, ru ? &r : NULL);
    int signo = 0;
    if (err > 0) {
        signo = SIGCHLD;
        err = 0;
    }
 
    if (!err) {
        if (ru && copy_to_user(ru, &r, sizeof(struct rusage)))
            return -EFAULT;
    }
    if (!infop)
        return err;
 
    user_access_begin();        // temporarily disable SMAP
    unsafe_put_user(signo, &infop->si_signo, Efault);
    unsafe_put_user(0, &infop->si_errno, Efault);
    unsafe_put_user((short)info.cause, &infop->si_code, Efault);
    unsafe_put_user(info.pid, &infop->si_pid, Efault);
    unsafe_put_user(info.uid, &infop->si_uid, Efault);
    unsafe_put_user(info.status, &infop->si_status, Efault);
    user_access_end();          // re-enable SMAP
    return err;
Efault:
    user_access_end();
    return -EFAULT;
}
#define user_access_begin()    __uaccess_begin()
#define user_access_end()    __uaccess_end()
 
#define __uaccess_begin() stac()
#define __uaccess_end()   clac()
#ifdef CONFIG_X86_SMAP    // is SMAP support configured?
 
static __always_inline void clac(void)
{
    /* Note: a barrier is implicit in alternative() */
    alternative("", __stringify(__ASM_CLAC), X86_FEATURE_SMAP);
}
 
static __always_inline void stac(void)
{
    /* Note: a barrier is implicit in alternative() */
    alternative("", __stringify(__ASM_STAC), X86_FEATURE_SMAP);
}
 
// where __ASM_CLAC and __ASM_STAC are defined as:
 
/* "Raw" instruction opcodes */
#define __ASM_CLAC    .byte 0x0f,0x01,0xca
#define __ASM_STAC    .byte 0x0f,0x01,0xcb
 
 
 
 
 
 
/*
 * The "unsafe" user accesses aren't really "unsafe", but the naming
 * is a big fat warning: you have to not only do the access_ok()
 * checking before using them, but you have to surround them with the
 * user_access_begin/end() pair.
 */
#define user_access_begin()    __uaccess_begin()
#define user_access_end()    __uaccess_end()
 
#define unsafe_put_user(x, ptr, err_label)                    \
do {                                        \
    int __pu_err;                                \
    __typeof__(*(ptr)) __pu_val = (x);                    \
    __put_user_size(__pu_val, (ptr), sizeof(*(ptr)), __pu_err, -EFAULT);    \
    if (unlikely(__pu_err)) goto err_label;                    \
} while (0)
 
/**
 * access_ok: - Checks if a user space pointer is valid
 * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE.  Note that
 *        %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
 *        to write to a block, it is always safe to read from it.
 * @addr: User space pointer to start of block to check
 * @size: Size of block to check
 *
 * Context: User context only. This function may sleep if pagefaults are
 *          enabled.
 *
 * Checks if a pointer to a block of memory in user space is valid.
 *
 * Returns true (nonzero) if the memory block may be valid, false (zero)
 * if it is definitely invalid.
 *
 * Note that, depending on architecture, this function probably just
 * checks that the pointer is in the user space range - after calling
 * this function, memory access functions may still return -EFAULT.
 */
#define access_ok(type, addr, size)                    \
({                                    \
    WARN_ON_IN_IRQ();                        \
    likely(!__range_not_ok(addr, size, user_addr_max()));        \
})
 
 
 
 
static int
__setup_frame(int sig, struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
{
    struct sigframe __user *frame;
    void __user *restorer;
    int err = 0;
    void __user *fpstate = NULL;
 
    frame = get_sigframe(&ksig->ka, regs, sizeof(*frame), &fpstate);
 
    if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
        return -EFAULT;
    ......
}
 
 
 
SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
        infop, int, options, struct rusage __user *, ru)
{
    struct rusage r;
    struct waitid_info info = {.status = 0};
    long err = kernel_waitid(which, upid, &info, options, ru ? &r : NULL);
    int signo = 0;
    if (err > 0) {
        signo = SIGCHLD;
        err = 0;
    }
 
    if (!err) {
        if (ru && copy_to_user(ru, &r, sizeof(struct rusage)))
            return -EFAULT;
    }
    if (!infop)
        return err;
 
    user_access_begin();        // temporarily disable SMAP
    unsafe_put_user(signo, &infop->si_signo, Efault);
    unsafe_put_user(0, &infop->si_errno, Efault);
    unsafe_put_user((short)info.cause, &infop->si_code, Efault);
    unsafe_put_user(info.pid, &infop->si_pid, Efault);
    unsafe_put_user(info.uid, &infop->si_uid, Efault);
    unsafe_put_user(info.status, &infop->si_status, Efault);
    user_access_end();          // re-enable SMAP
    return err;
Efault:
    user_access_end();
    return -EFAULT;
}
 
 
 
struct cred {
    ......
    kuid_t        uid;        /* real UID of the task */
    kgid_t        gid;        /* real GID of the task */
    ......
}
 
 
 
 
 
 
fork()
    _do_fork()
        copy_process()
                /*
                 * Ensure that the cgroup subsystem policies allow the new process to be
                 * forked. It should be noted the the new process's css_set can be changed
                 * between here and cgroup_post_fork() if an organisation operation is in
                 * progress.
                 */
                retval = cgroup_can_fork(p);        // does cgroup allow the new process to be forked?

Last edited by Roland_ on 2021-1-13 13:55
Latest replies (1)
OP, how did you set up your Linux kernel debugging environment? The VMware dual-machine approach described online is a bit laggy.
2022-3-27 14:02