[Original] A First Look at the Linux Boot Process (v6.12.32)
Posted: 2025-11-6 05:18


Version: Linux 6.12.32

Architecture: x86_64

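I wrote this post after reading an article by arttnba3 that mentions a small address-leak trick: physmem_base + 0x9d000 holds a pointer to secondary_startup_64, which leaks the kernel base. Many articles mention the trick in passing, but none of the ones I found explain why it works. After asking Tplus, I learned that this region belongs to the real-mode code, so I analyzed the boot process from the beginning. The explanation at the end is my own inference; given the limits of my knowledge it may be wrong, and corrections are welcome.

If you only want the answer, skip to the summary at the end. Much of this post follows how memory changes while Linux boots, so it is somewhat long-winded and occasionally drifts off topic.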


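Real Address Mode (real mode): in this mode an address refers directly to a location in physical memory.

In real mode, data is addressed through ds:x and es:x. ES is an "extra" data segment register: when two different data segments must be accessed at the same time, DS is used for one and ES for the other.

For example, if ds is 0x7c0 and si is 0, then ds:si refers to physical address (0x7c0 << 4) + 0 = 0x7c00.

Instructions are fetched the same way, through cs:x.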

Protected Address Mode (protected mode): memory is protected by mechanisms such as virtual memory and paging.

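At power-on, the CPU registers hold the following values:

CS: selector 0xf000 with a hidden base of 0xffff0000, and IP = 0xfff0, so the first instruction is fetched from 0xfffffff0 (inside the ROM region that holds the BIOS).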

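The BIOS firmware copies the 512-byte boot sector of the disk, unchanged, to 0x7c00 and then jumps to 0x7c00.

It also checks the last two bytes of the first sector, the boot signature (magic number), which must be 0x55 and 0xAA. (The alternating bit patterns 01010101 10101010 make transfer errors easy to detect.)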

If you wonder why the address is 0x7c00, see this article: f8dK9s2c8@1M7s2y4Q4x3@1q4Q4x3V1k6Q4x3V1k6T1L8r3!0Y4i4K6u0W2j5%4y4V1L8W2)9J5k6h3&6W2N6q4)9J5c8X3S2T1N6i4S2A6j5h3!0X3k6h3W2Q4x3V1k6S2M7Y4c8A6j5$3I4W2i4K6u0r3k6r3g2@1j5h3W2D9M7#2)9J5c8U0p5K6y4o6f1^5y4K6V1H3z5b7`.`.

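When the BIOS has finished, the bootloader has been copied to 0x7c00,

and the BIOS hands control to the bootloader.

In early versions, the bootloader was usually the one defined in Linux's own boot/bootsect.s.

Later, the bootloader became a separate, standalone program.

For the x86/x86-64 architecture:

For the ARM architecture:

GRUB is the most common bootloader, and since bootloaders all do essentially the same job, we mainly analyze GRUB.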

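The following is mostly quoted from this article; see the original if you are interested: 591K9s2c8@1M7s2y4Q4x3@1q4Q4x3V1k6Q4x3V1k6^5K9h3&6I4K9i4g2Q4x3X3g2Y4K9i4c8T1L8$3!0C8M7#2)9J5k6h3W2G2i4K6u0r3L8r3W2F1N6i4S2Q4x3X3c8A6L8Y4y4A6k6r3g2K6i4K6u0V1j5$3&6Q4x3V1k6U0L8$3&6@1k6h3&6@1i4K6u0r3b7X3!0G2N6r3W2F1k6#2)9J5c8X3I4A6L8Y4g2^5i4K6u0V1j5X3!0G2N6s2y4@1M7X3q4H3i4K6u0V1x3g2)9J5k6h3S2@1L8h3H3`.

grub_main initializes the console, computes the module base address, sets the root device, reads the GRUB configuration file, and loads modules. Finally it puts GRUB into normal mode, in which grub_normal_execute (from grub-core/normal/main.c) is called to finish the last preparations and then display a menu listing all available operating systems. (This screen should look familiar; you would see it, for example, on the machines in an internet café.)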

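According to the definitions in arch/x86/boot/header.S, setup is usually at 0x90000 and code32_start is usually at 0x100000.

Summary

When the bootloader has done its work, the kernel setup code (real mode) has been loaded into memory.

The bootloader then transfers control to the kernel, which starts executing at _start in arch/x86/boot/header.S.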

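vmlinuz is the compressed kernel, and it can be divided into two parts:

The entry point of setup.bin is _start.

Execution begins with the _start code in arch/x86/boot/header.S.

A good deal of code precedes _start. It is the kernel's own legacy bootloader, which is no longer used and now only prints an error message.

_start begins with a jmp instruction (opcode 0xeb), which in effect jumps to start_of_setup.

This can be regarded as the first code that Linux really executes.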

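Its main tasks are:

initializing the segment registers, setting up the stack, normalizing the code segment, verifying the setup signature, and clearing the BSS section.

It then jumps to the main function (arch/x86/boot/main.c).

After initialization, the segment registers should hold:

main() performs some further initialization and then enters protected mode.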

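The memory preparation before the actual switch to protected mode consists mainly of copying the kernel's struct setup_header hdr into the hdr field of boot_params.

boot_params is the key data structure through which the bootloader (such as GRUB) and the Linux kernel pass boot parameters.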

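The three hand-offs of boot_params

Read this together with the call protected_mode_jump(boot_params.hdr.code32_start, (u32)&boot_params + (ds() << 4)); the second argument turns the real-mode ds:offset address of boot_params into a linear address.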

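Argument-passing convention (register-based):

The real-mode setup code is built with -mregparm=3, so the first three integer arguments are passed in registers and any further arguments go on the stack:

argument 1 → %eax

argument 2 → %edx

argument 3 → %ecx

protected_mode_jump therefore receives code32_start in %eax and the address of boot_params in %edx.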

Page Map Level 4

Page Directory Pointer Table

Its core job is to decompress the kernel and then jump into the decompressed kernel.

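The decompressed kernel is entered at startup_64 in arch/x86/kernel/head_64.S, the file that also defines secondary_startup_64.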

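Back to the original question: to answer it we need to look at how the real-mode trampoline is loaded into memory.

During setup_arch(command_line), the kernel reserves memory for the mode-switching code that later stages of boot will run.

It reserves a region in low memory (below 1 MB) for the real-mode trampoline code, so that the APs have a usable real-mode execution environment when the other processors are brought up.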

This article also analyzes, from a historical angle, where the real-mode code is loaded in memory: d26K9s2c8@1M7s2y4Q4x3@1q4Q4x3V1k6Q4x3V1k6D9K9h3&6#2P5r3E0W2M7X3&6W2L8q4)9J5k6h3!0J5k6#2)9J5k6h3y4F1i4K6u0r3k6r3!0U0i4K6u0r3K9s2c8E0L8q4)9J5c8X3I4S2N6r3g2K6N6q4)9J5c8X3q4J5j5$3S2Q4x3V1k6^5z5o6k6Q4x3V1k6T1L8$3!0@1i4K6u0W2K9s2c8E0L8l9`.`.

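Behavior of the memblock allocator:

So by going back over the memory that is already in use at this point, we can quickly estimate the address where the trampoline will most likely end up.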

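For details on the initcall mechanism, see the initcall section of the knowledge base. What matters here is that do_init_real_mode is called automatically from do_pre_smp_initcalls during kernel_init_freeable. (The kernel forks a process, PID 1, that runs kernel_init, which in turn calls kernel_init_freeable and then run_init_process.)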

The assignments in x86_platform then lead to the corresponding function:

realmode_init

Flow chart

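To summarize: after the kernel has booted, the real-mode trampoline sits at a fixed physical location, 0x90000, and trampoline_header->start sits at a fixed offset from the start of the trampoline. Because setup_real_mode stores the address of secondary_startup_64 in trampoline_header->start, in the common case physical address 0x9d000 always holds a pointer to secondary_startup_64.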

void main(void)
{
        // Initialize the default I/O operations
        init_default_io_ops();
        // Copy the boot protocol header into the hdr field of boot_params
        /* First, copy the boot header into the "zeropage" */
        copy_boot_params();
 
        /* Initialize the early-boot console */
        console_init();
        if (cmdline_find_option_bool("debug"))
                puts("early console in setup code\n");
 
        /* End of heap check */
        // Initialize the global heap
        init_heap();
        // Check that the CPU supports the features this kernel requires
        /* Make sure we have all the proper CPU support */
        if (validate_cpu()) {
                puts("Unable to boot - please use a kernel appropriate for your CPU.\n");
                die();
        }
        /* Tell the BIOS what CPU mode we intend to run in */
        set_bios_mode();
 
        // Get the memory layout from the BIOS
        /* Detect memory layout */
        detect_memory();
 
        // Initialize the keyboard and set the key repeat rate
        /* Set keyboard repeat rate (why?) and query the lock flags */
        keyboard_init();
 
        /* Query Intel SpeedStep (IST) information */
        query_ist();
 
        /* Query APM information */
#if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
        query_apm_bios();
#endif
 
        /* Query EDD information */
#if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
        query_edd();
#endif
 
        /* Set the video mode */
        set_video();
        // Switch to protected mode
        /* Do the last things and invoke protected mode */
        go_to_protected_mode();
}
/*
 * Copy the header into the boot parameter block.  Since this
 * screws up the old-style command line protocol, adjust by
 * filling in the new-style command line pointer instead.
 */
static void copy_boot_params(void)
{
        struct old_cmdline {
                u16 cl_magic;
                u16 cl_offset;
        };
        const struct old_cmdline * const oldcmd = absolute_pointer(OLD_CL_ADDRESS);
 
        BUILD_BUG_ON(sizeof(boot_params) != 4096);
        memcpy(&boot_params.hdr, &hdr, sizeof(hdr));
 
        if (!boot_params.hdr.cmd_line_ptr && oldcmd->cl_magic == OLD_CL_MAGIC) {
                /* Old-style command line protocol */
                u16 cmdline_seg;
 
                /*
                 * Figure out if the command line falls in the region
                 * of memory that an old kernel would have copied up
                 * to 0x90000...
                 */
                if (oldcmd->cl_offset < boot_params.hdr.setup_move_size)
                        cmdline_seg = ds();
                else
                        cmdline_seg = 0x9000;
 
                boot_params.hdr.cmd_line_ptr = (cmdline_seg << 4) + oldcmd->cl_offset;
        }
}
/* The so-called "zeropage" */
struct boot_params {
        struct screen_info screen_info;                 /* 0x000 */
        struct apm_bios_info apm_bios_info;             /* 0x040 */
        __u8  _pad2[4];                                 /* 0x054 */
        __u64  tboot_addr;                              /* 0x058 */
        struct ist_info ist_info;                       /* 0x060 */
        __u64 acpi_rsdp_addr;                           /* 0x070 */
        __u8  _pad3[8];                                 /* 0x078 */
        __u8  hd0_info[16];     /* obsolete! */         /* 0x080 */
        __u8  hd1_info[16];     /* obsolete! */         /* 0x090 */
        struct sys_desc_table sys_desc_table; /* obsolete! */   /* 0x0a0 */
        struct olpc_ofw_header olpc_ofw_header;         /* 0x0b0 */
        __u32 ext_ramdisk_image;                        /* 0x0c0 */
        __u32 ext_ramdisk_size;                         /* 0x0c4 */
        __u32 ext_cmd_line_ptr;                         /* 0x0c8 */
        __u8  _pad4[112];                               /* 0x0cc */
        __u32 cc_blob_address;                          /* 0x13c */
        struct edid_info edid_info;                     /* 0x140 */
        struct efi_info efi_info;                       /* 0x1c0 */
        __u32 alt_mem_k;                                /* 0x1e0 */
        __u32 scratch;          /* Scratch field! */    /* 0x1e4 */
        __u8  e820_entries;                             /* 0x1e8 */
        __u8  eddbuf_entries;                           /* 0x1e9 */
        __u8  edd_mbr_sig_buf_entries;                  /* 0x1ea */
        __u8  kbd_status;                               /* 0x1eb */
        __u8  secure_boot;                              /* 0x1ec */
        __u8  _pad5[2];                                 /* 0x1ed */
        /*
         * The sentinel is set to a nonzero value (0xff) in header.S.
         *
         * A bootloader is supposed to only take setup_header and put
         * it into a clean boot_params buffer. If it turns out that
         * it is clumsy or too generous with the buffer, it most
         * probably will pick up the sentinel variable too. The fact
         * that this variable then is still 0xff will let kernel
         * know that some variables in boot_params are invalid and
         * kernel should zero out certain portions of boot_params.
         */
        __u8  sentinel;                                 /* 0x1ef */
        __u8  _pad6[1];                                 /* 0x1f0 */
        struct setup_header hdr;    /* setup header */  /* 0x1f1 */
        __u8  _pad7[0x290-0x1f1-sizeof(struct setup_header)];
        __u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX];      /* 0x290 */
        struct boot_e820_entry e820_table[E820_MAX_ENTRIES_ZEROPAGE]; /* 0x2d0 */
        __u8  _pad8[48];                                /* 0xcd0 */
        struct edd_info eddbuf[EDDMAXNR];               /* 0xd00 */
        __u8  _pad9[276];                               /* 0xeec */
} __attribute__((packed));
/*
 * Actual invocation sequence
 */
void go_to_protected_mode(void)
{
        /* Hook before leaving real mode, also disables interrupts */
        realmode_switch_hook();
 
        // Enable the A20 address line so that memory above 1 MB becomes addressable
        /* Enable the A20 gate */
        if (enable_a20()) {
                puts("A20 gate not responding, unable to boot...\n");
                die();
        }
 
        // Reset the math coprocessor
        /* Reset coprocessor (IGNNE#) */
        reset_coprocessor();
         
        /* Mask all interrupts in the PIC */
        mask_all_interrupts();
 
        // Set up the IDT and GDT
        /* Actual transition to protected mode... */
        setup_idt();
        setup_gdt();
        /* Jump to code32_start, whose address is stored in the boot protocol header */
        protected_mode_jump(boot_params.hdr.code32_start,
                            (u32)&boot_params + (ds() << 4));
}
// Call order during boot:
start_kernel()
    → setup_arch()
        → x86_platform.realmode_reserve()  // reserve_real_mode(): reserve memory for the real-mode code
    → ... other initialization ...
    → rest_init()
        → kernel_init()
            → kernel_init_freeable()
                → do_pre_smp_initcalls()
                    → do_init_real_mode()  // registered with early_initcall
                        → init_real_mode()
void __init setup_arch(char **cmdline_p)
{
    ...
    /*
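     * Find free memory for the real mode trampoline and place it there.
     * If there is not enough free memory under 1MB, EFI-enabled systems
     * make another attempt to reclaim memory for the real mode trampoline
     * in efi_free_boot_services().
     *
     * Unconditionally reserve the first 1MB of physical memory: BIOSes are
     * known to corrupt low memory, and a few hundred KB are not worth the
     * complex detection needed to find out which memory gets clobbered.
     * Windows does the same thing for similar reasons.
     *
     * Moreover, on machines with SandyBridge graphics or in setups that use
     * crashkernel, the first 1MB is reserved anyway.
     *
     * Note: a TDX (Trust Domain Extensions) host kernel also requires the first 1MB to be reserved.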
     */
    x86_platform.realmode_reserve();
    ...
}
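/*
        reserve_real_mode(): reserve the region in low memory (<1MB) that the real-mode code needs,
        providing an execution environment for the startup code of the APs (Application Processors, i.e. every processor other than the BSP)
*/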
void __init reserve_real_mode(void)
{
        phys_addr_t mem;
        size_t size = real_mode_size_needed();
 
        if (!size)
                return;
 
        WARN_ON(slab_is_available());
 
        /* Has to be under 1M so we can execute real-mode AP code. */
        mem = memblock_phys_alloc_range(size, PAGE_SIZE, 0, 1<<20);
        if (!mem)
                pr_info("No sub-1M memory is available for the trampoline\n");
        else
                set_real_mode_mem(mem);
 
        /*
         * Unconditionally reserve the entire first 1M, see comment in
         * setup_arch().
         */
        memblock_reserve(0, SZ_1M);
}
// arch/x86/realmode/init.c:49
 
void __init reserve_real_mode(void)
{
    phys_addr_t mem;
    size_t size = real_mode_size_needed();  // size of the real-mode blob, rounded up to whole pages
     
    // Key point: allocate within the range 0 to 1MB
    mem = memblock_phys_alloc_range(size, PAGE_SIZE, 0, 1<<20);
    //                                              ↑   ↑
    //                                          start  end (1MB)
}
static inline void set_real_mode_mem(phys_addr_t mem)
{
        real_mode_header = (struct real_mode_header *) __va(mem);
}
static int __init do_init_real_mode(void)
{
        x86_platform.realmode_init();
        return 0;
}
early_initcall(do_init_real_mode);
struct x86_platform_ops x86_platform __ro_after_init = {
        .calibrate_cpu                  = native_calibrate_cpu_early,
        .calibrate_tsc                  = native_calibrate_tsc,
        .get_wallclock                  = mach_get_cmos_time,
        .set_wallclock                  = mach_set_cmos_time,
        .iommu_shutdown                 = iommu_shutdown_noop,
        .is_untracked_pat_range         = is_ISA_range,
        .nmi_init                       = default_nmi_init,
        .get_nmi_reason                 = default_get_nmi_reason,
        .save_sched_clock_state         = tsc_save_sched_clock_state,
        .restore_sched_clock_state      = tsc_restore_sched_clock_state,
        .realmode_reserve               = reserve_real_mode,
        .realmode_init                  = init_real_mode,
        .hyper.pin_vcpu                 = x86_op_int_noop,
        .hyper.is_private_mmio          = is_private_mmio_noop,
 
        .guest = {
                .enc_status_change_prepare = enc_status_change_prepare_noop,
                .enc_status_change_finish  = enc_status_change_finish_noop,
                .enc_tlb_flush_required    = enc_tlb_flush_required_noop,
                .enc_cache_flush_required  = enc_cache_flush_required_noop,
                .enc_kexec_begin           = enc_kexec_begin_noop,
                .enc_kexec_finish          = enc_kexec_finish_noop,
        },
};
void __init init_real_mode(void)
{
        if (!real_mode_header)
                panic("Real mode trampoline was not allocated");
 
        setup_real_mode();
        set_real_mode_permissions();
}
do_init_real_mode()
    init_real_mode()
        setup_real_mode() [initialize real mode]
            
static void __init setup_real_mode(void)
{
        ...
        size_t size = PAGE_ALIGN(real_mode_blob_end - real_mode_blob);
#ifdef CONFIG_X86_64
        u64 *trampoline_pgd;
        u64 efer;
        int i;
#endif
 
        base = (unsigned char *)real_mode_header;
 
        /*
         * If SME is active, the trampoline area will need to be in
         * decrypted memory in order to bring up other processors
         * successfully. This is not needed for SEV.
         */
        if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
                set_memory_decrypted((unsigned long)base, size >> PAGE_SHIFT);
        // Copy the real-mode binary blob into the reserved memory region
        memcpy(base, real_mode_blob, size);
        ...
}