Windows Internals, 6th Edition, Part 2, Chapter 10: Memory Management (Part 4) - Translations - Kanxue (kanxue.com)

shayi (reply #2)

The sizes of these bit fields are dictated by the structures they reference. For example, the byte offset is 12 bits because it denotes a byte within a page, and pages are 4,096 bytes (2 ^ 12 = 4,096). The other indexes are 10 bits because the structures they index have 1,024 entries (2 ^ 10 = 1,024).

(Translator's note: 12 bits are exactly what it takes to cover every byte in a page. As for the indexes: a page directory (PD) holds 1,024 page directory entries (PDEs), each pointing to one page table (PT); a page table holds 1,024 page table entries (PTEs), each pointing to one 4,096-byte page.)

The job of virtual address translation is to convert these virtual addresses into physical addresses, that is, addresses of locations in RAM. The format of a physical address on an x86 non-PAE system is shown in Figure 10-17.

As you can see, the format is very similar to that of a virtual address. Furthermore, the byte offset value from a virtual address will be the same in the resulting physical address. We can say, then, that address translation involves converting virtual page numbers to physical page numbers (also referred to as page frame numbers, or PFNs). The byte offset does not participate in, and does not change as a result of, address translation. It is simply copied from the virtual address to the physical address.

Figure 10-18 shows the relationship of these three values and how they are used to perform address translation.

The following basic steps are involved in translating a virtual address:

1. The memory management unit (MMU) uses a privileged CPU register, CR3, to obtain the physical address of the page directory.
2. The page directory index portion of the virtual address is used as an index into the page directory. This locates the page directory entry (PDE) that contains the location of the page table needed to map the virtual address. The PDE in turn contains the physical page number, also called the page frame number, or PFN, of the desired page table, provided the page table is resident. Page tables can be paged out or not yet created, and in those cases the page table is first made resident before proceeding. If a flag in the PDE indicates that it describes a large page, then it simply contains the PFN of the target large page, and the rest of the virtual address is treated as the byte offset within the large page.
3. The page table index is used as an index into the page table to locate the PTE that describes the virtual page in question. (Translator's note: as the figure shows, the PTE contains the PFN of the desired page.)
4. If the PTE’s valid bit is clear, this triggers a page fault (memory management fault). The operating system’s memory management fault handler (pager) locates the page and tries to make it valid; after doing so, this sequence continues at step 5. (See the section “Page Fault Handling.”) If the page cannot or should not be made valid (for example, because of a protection fault), the fault handler generates an access violation or a bug check. (Translator's note: a bug check means a blue screen, that is, a crash in kernel mode.)
5. When the PTE describes a valid page (whether immediately or after page fault resolution), the desired physical address is constructed from the PFN field of the PTE, followed by the byte offset field from the original virtual address.

Now that you have the overall picture, let’s look at the detailed structure of page directories, page tables, and PTEs.

2017-6-3 17:35
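The five translation steps in reply #2 can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `phys_pages` is a hypothetical dictionary standing in for RAM, large pages and fault resolution are left out, and the PDE and PTE contents are borrowed from the `!pte 10004` output quoted in reply #3. The CR3 page frame number 0x47C9B is taken from the `DirBase` line of the same reply, assuming both outputs belong to the same process.

```python
VALID = 0x1  # bit 0 of a PDE/PTE: the valid bit


def split_va(va):
    """Split a 32-bit non-PAE virtual address into (PD index, PT index, byte offset)."""
    return (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF


def translate(va, cr3_pfn, phys_pages):
    """Walk a toy two-level page table; phys_pages maps PFN -> list of 1,024 entries."""
    pd_index, pt_index, offset = split_va(va)
    pde = phys_pages[cr3_pfn][pd_index]            # steps 1 and 2
    if not pde & VALID:
        raise LookupError("page table not resident: page fault")
    pte = phys_pages[pde >> 12][pt_index]          # step 3
    if not pte & VALID:
        raise LookupError("page not valid: page fault")  # step 4, not resolved here
    return ((pte >> 12) << 12) | offset            # step 5: PFN followed by byte offset


# Toy "RAM": one page directory and one page table (values from the !pte output).
page_directory = [0] * 1024
page_table = [0] * 1024
page_directory[0] = 0x6F06B867
page_table[0x10] = 0x3EF8C847
toy_ram = {0x47C9B: page_directory, 0x6F06B: page_table}

physical = translate(0x00010004, 0x47C9B, toy_ram)
print(hex(physical))  # 0x3ef8c004
```

Note that the byte offset (4) passes through unchanged, which is exactly the "simply copied" behavior the text describes.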
shayi (reply #3)

Page Directories

On non-PAE x86 systems, each process has a single page directory, a page the memory manager creates to map the location of all page tables for that process. The physical address of the process page directory is stored in the kernel process (KPROCESS) block, but it is also mapped virtually at address 0xC0300000 on x86 non-PAE systems. (For more detailed information about the KPROCESS and other process data structures, refer to Chapter 5, “Processes, Threads, and Jobs” in Part 1.)

The CPU obtains the location of the page directory from a privileged CPU register called CR3. It contains the page frame number of the page directory. (Since the page directory is itself always page-aligned, the low-order 12 bits of its address are always zero, so there is no need for CR3 to supply these.) Each time a context switch occurs to a thread that is in a different process than that of the currently executing thread, the context switch routine in the kernel loads this register from a field in the KPROCESS block of the new process. (Translator's note: this is the DirectoryTableBase field.) Context switches between threads in the same process don’t result in reloading the physical address of the page directory because all threads within the same process share the same process address space and thus use the same page directory and page tables.

The page directory is composed of page directory entries (PDEs), each of which is 4 bytes long. The PDEs in the page directory describe the state and location of all the possible page tables for the process. As described later in the chapter, page tables are created on demand, so the page directory for most processes points only to a small set of page tables. (If a page table does not yet exist, the VAD tree is consulted to determine whether an access should materialize it.) (Translator's question: does “materialize” here mean creating the page table?) The format of a PDE isn’t repeated here because it’s mostly the same as a hardware PTE, which is described shortly.

To describe the full 4-GB virtual address space, 1,024 page tables are required. The process page directory that maps these page tables contains 1,024 PDEs. (Translator's note: each PDE describes one page table.) Therefore, the page directory index needs to be 10 bits wide (2 ^ 10 = 1,024).

EXPERIMENT: Examining the Page Directory and PDEs

You can see the physical address of the currently running process’s page directory by examining the DirBase field in the !process kernel debugger output:

    lkd> !process -1 0
    PROCESS 857b3528 SessionId: 1 Cid: 0f70 Peb: 7ffdf000 ParentCid: 0818
        DirBase: 47c9b000 ObjectTable: b4c56c48 HandleCount: 226.
        Image: windbg.exe

You can see the page directory’s virtual address by examining the kernel debugger output for the PTE of a particular virtual address, as shown here:

    lkd> !pte 10004
                   VA 00010004
    PDE at C0300000          PTE at C0000040
    contains 6F06B867        contains 3EF8C847
    pfn 6f06b ---DA--UWEV    pfn 3ef8c ---D---UWEV

(Translator's note: the first entry of the page directory is at virtual address 0xC0300000, which confirms the text above, and the PTE is at virtual address 0xC0000040. These two entries are what resolve the virtual address 0x00010004 given on the command line. The third line of the output shows the 4-byte contents of the PDE and the PTE. The upper 20 bits are the page frame number of the object the entry describes, a page table or a page, and the lower 12 bits are status flags for that page table or page. For example, the PTE contains 0x3EF8C847, so the page it describes has page frame number 0x3EF8C, that is, physical address 0x3EF8C000. The low 12 bits, 0x847, are displayed as: “V”, the page is valid; “E”, the page is executable; “W”, the page is writable; “U”, the page is accessible from user mode; “D”, the page is dirty.)

The PTE part of the kernel debugger output is defined in the section “Page Tables and Page Table Entries.” We will describe this output further in the section on x86 PAE translation.

(Translator's note: a few more words from me. Since the authors like to keep readers in suspense, let me tie together some of the points covered so far with a hands-on example. As in the accompanying screenshot, the debugger machine runs Windbg.exe and connects through a named pipe to the emulated COM port of the debuggee virtual machine, which was booted in debug mode. The full command line can look like this:

    Windbg.exe -n -v -logo d:\kernel_realtime_debugging.txt -k com:pipe,port=\\.\pipe\com_1,baud=115200,reconnect

Here the “-logo” option is followed by the path of the file that receives the debugging log. The screenshot shows that the debuggee is currently running the System process, whose page directory physical base address is 0x00185000, matching the current value of CR3. The System process’s executive process block, the EPROCESS structure, is at virtual address 0x85534408. Because the first field of EPROCESS, “Pcb”, is an embedded KPROCESS structure, the address of EPROCESS is also the address of the kernel process block, which is why the dt command can dump the DirectoryTableBase field of the kernel process block when given the starting virtual address of EPROCESS. That field also holds 0x00185000. A context switch involves loading the target process’s DirectoryTableBase value into CR3.)

2017-6-3 17:44
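The virtual addresses that `!pte` prints in reply #3 follow a simple pattern, which the sketch below reproduces. It assumes, based only on the output above and on the statement that the page directory is mapped at 0xC0300000, that non-PAE page tables are mapped as one contiguous array of 4-byte PTEs starting at 0xC0000000. The function names are illustrative, not debugger or kernel APIs.

```python
PTE_BASE = 0xC0000000   # assumed start of the page-table mapping (inferred from the output)
PDE_BASE = 0xC0300000   # virtual address of the page directory on non-PAE x86
ENTRY_SIZE = 4          # bytes per PDE/PTE without PAE


def pte_va(va):
    """Virtual address of the PTE that maps `va` (one PTE per 4 KB page)."""
    return PTE_BASE + (va >> 12) * ENTRY_SIZE


def pde_va(va):
    """Virtual address of the PDE that maps `va` (one PDE per 4 MB region)."""
    return PDE_BASE + (va >> 22) * ENTRY_SIZE


print(hex(pte_va(0x00010004)))  # 0xc0000040, matches "PTE at C0000040"
print(hex(pde_va(0x00010004)))  # 0xc0300000, matches "PDE at C0300000"
```

Under this assumption the PDE address is just the PTE address of the PTE address: `pte_va(pte_va(va)) == pde_va(va)`. In other words, the page directory doubles as the page table that maps the page-table region itself.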
shayi (reply #4)

Because Windows provides a private address space for each process, each process has its own page directory and page tables to map that process’s private address space. However, the page tables that describe system space are shared among all processes (and session space is shared only among processes in a session). To avoid having multiple page tables describing the same virtual memory, when a process is created, the page directory entries that describe system space are initialized to point to the existing system page tables. If the process is part of a session, session space page tables are also shared by pointing the session space page directory entries to the existing session page tables.

Page Tables and Page Table Entries

Each page directory entry points to a page table. A page table is a simple array of PTEs. The virtual address’s page table index field (as shown in Figure 10-18) indicates which PTE within the page table corresponds to and describes the data page in question. The page table index is 10 bits wide, allowing you to reference up to 1,024 4-byte PTEs. Of course, because x86 provides a 4-GB virtual address space, more than one page table is needed to map the entire address space. To calculate the number of page tables required to map the entire 4-GB virtual address space, divide 4 GB by the virtual memory mapped by a single page table. Recall that each page table on an x86 system maps 4 MB of data pages. Thus, 1,024 page tables (4 GB / 4 MB) are required to map the full 4-GB address space. This corresponds with the 1,024 entries in the page directory.

You can use the !pte command in the kernel debugger to examine PTEs. (See the experiment “Translating Addresses.”) We’ll discuss valid PTEs here and invalid PTEs in a later section. Valid PTEs have two main fields: the page frame number (PFN) of the physical page containing the data or of the physical address of a page in memory, and some flags that describe the state and protection of the page, as shown in Figure 10-19.

As you’ll see later, the bits labeled “Software field” and “Reserved” in Figure 10-19 are ignored by the MMU, whether or not the PTE is valid. These bits are stored and interpreted by the memory manager. Table 10-11 briefly describes the hardware-defined bits in a valid PTE.

On x86 systems, a hardware PTE contains two bits that can be changed by the MMU, the Dirty bit and the Accessed bit. The MMU sets the Accessed bit whenever the page is read or written (provided it is not already set). The MMU sets the Dirty bit whenever a write operation occurs to the page. The operating system is responsible for clearing these bits at the appropriate times; they are never cleared by the MMU.

The x86 MMU uses a Write bit to provide page protection. When this bit is clear, the page is read-only; when it is set, the page is read/write. If a thread attempts to write to a page with the Write bit clear, a memory management exception occurs, and the memory manager’s access fault handler (described later in the chapter) must determine whether the thread can be allowed to write to the page (for example, if the page was really marked copy-on-write) or whether an access violation should be generated.

2017-6-3 17:50
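To make the PTE layout in reply #4 concrete, here is a small decoder for a non-PAE hardware PTE, plus the page-table count arithmetic. It is a sketch under assumptions: bit 0 (valid), bit 1 (hardware Write) and bit 11 (software Write) are the positions named elsewhere in this thread, while bits 2, 5 and 6 (user, Accessed, Dirty) are the standard x86 positions and are not spelled out in the quoted text. Table 10-11 itself is not reproduced here, so treat the field names as illustrative.

```python
# Bit positions inside the low 12 bits of a 32-bit PTE (see the assumptions above).
FLAG_BITS = {
    "valid": 0,
    "write": 1,
    "user": 2,
    "accessed": 5,
    "dirty": 6,
    "software_write": 11,
}


def decode_pte(pte):
    """Return the PFN and the flag bits of a valid non-PAE PTE as a dict."""
    fields = {name: bool((pte >> bit) & 1) for name, bit in FLAG_BITS.items()}
    fields["pfn"] = pte >> 12          # upper 20 bits
    return fields


# The PTE value from the !pte 10004 output in reply #3.
fields = decode_pte(0x3EF8C847)
print(hex(fields["pfn"]), fields["valid"], fields["write"], fields["dirty"])

# Page-table arithmetic from the text: 1,024 PTEs x 4 KB = 4 MB per page table.
bytes_per_table = 1024 * 4096
tables_needed = (4 * 2**30) // bytes_per_table
print(tables_needed)  # 1024
```

For 0x3EF8C847 the decoder reports valid, writable, user-accessible and dirty, but not accessed, which lines up with the `---D---UWEV` string the debugger printed.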
shayi (reply #5)

Hardware vs. Software Write Bits in Page Table Entries

The additional Write bit implemented in software (as mentioned in Table 10-11) is used to force updating of the Dirty bit to be synchronized with updates to Windows memory management data. In a simple implementation, the memory manager would set the hardware Write bit (bit 1) for any writable page, and a write to any such page will cause the MMU to set the Dirty bit in the page table entry. Later, the Dirty bit will tell the memory manager that the contents of that physical page must be written to backing store before the physical page can be used for something else.

In practice, on multiprocessor systems, this can lead to race conditions that are expensive to resolve. The MMUs of the various processors can, at any time, set the Dirty bit of any PTE that has its hardware Write bit set. The memory manager must, at various times, update the process working set list to reflect the state of the Dirty bit in a PTE. (Translator's note: the process pages referenced by valid PTEs are resident in physical memory, and together they form the working set. A change to a PTE therefore has to be reflected for the page it references, which is what updating the process working set means.) The memory manager uses a pushlock to synchronize access to the working set list. But on a multiprocessor system, even while one processor is holding the lock, the Dirty bit might be changed by MMUs of other CPUs. This raises the possibility of missing an update to a Dirty bit.

To avoid this, the Windows memory manager initializes both read-only and writable pages with the hardware Write bit (bit 1) of their PTEs set to 0 and records the true writable state of the page in the software Write bit (bit 11). On the first write access to such a page, the processor will raise a memory management exception because the hardware Write bit is clear, just as it would be for a true read-only page. In this case, though, the memory manager learns that the page actually is writable (via the software Write bit), acquires the working set pushlock, sets the Dirty bit and the hardware Write bit in the PTE, updates the working set list to note that the page has been changed, releases the working set pushlock, and dismisses the exception. The hardware write operation then proceeds as usual, but the setting of the Dirty bit is made to happen with the working set list pushlock held.

On subsequent writes to the page, no exceptions occur because the hardware Write bit is set. The MMU will redundantly set the Dirty bit, but this is benign because the “written-to” state of the page is already recorded in the working set list. Forcing the first write to a page to go through this exception handling may seem to be excessive overhead. However, it happens only once per writable page as long as the page remains valid. Furthermore, the first access to almost any page already goes through memory management exception handling because pages are usually initialized in the invalid state (PTE bit 0 is clear). If the first access to a page is also the first write access to the page, the Dirty bit handling just described will occur within the handling of the first-access page fault, so the additional overhead is small.

(Translator's note: the whole once-and-for-all flow, summarized: first write access -> page not in physical memory (invalid) -> page fault handled -> page is read-only after being brought in -> write fault handled -> subsequent writes go through smoothly.)

Finally, on both uniprocessor and multiprocessor systems, this implementation allows flushing of the translation look-aside buffer (described later) without holding a lock for each page being flushed.

Byte Within Page

Once the memory manager has determined the physical page number, it must locate the requested data within that page. This is the purpose of the byte offset field. The byte offset from the original virtual address is simply copied to the corresponding field in the physical address. On x86 systems, the byte offset is 12 bits wide, allowing you to reference up to 4,096 bytes of data (the size of a page). Another way to interpret this is that the byte offset from the virtual address is concatenated to the physical page number retrieved from the PTE. This completes the translation of a virtual address to a physical address.

2017-6-3 17:56
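The software Write bit scheme in reply #5 can be illustrated with a small simulation. This is a rough sketch, not how the Windows memory manager is actually written: `ToyMemoryManager`, its `modified_pages` set (standing in for the working set list bookkeeping) and the `threading.Lock` (standing in for the working set pushlock) are all hypothetical, and the copy-on-write case is left out. Bit 6 for Dirty is the standard x86 position; bits 0, 1 and 11 are the ones named in the text.

```python
import threading

VALID = 1 << 0       # PTE bit 0
HW_WRITE = 1 << 1    # hardware Write bit
DIRTY = 1 << 6       # Dirty bit (standard x86 position)
SW_WRITE = 1 << 11   # software Write bit


class AccessViolation(Exception):
    pass


class ToyMemoryManager:
    def __init__(self):
        self.working_set_lock = threading.Lock()  # stand-in for the pushlock
        self.modified_pages = set()               # stand-in for working set list state
        self.faults = 0

    def make_pte(self, pfn, writable):
        """New PTEs always start with the hardware Write bit clear."""
        pte = (pfn << 12) | VALID
        return pte | SW_WRITE if writable else pte

    def handle_write_fault(self, vpn, pte):
        self.faults += 1
        if not pte & SW_WRITE:
            raise AccessViolation(f"write to read-only page {vpn:#x}")
        with self.working_set_lock:               # Dirty is set with the lock held
            pte |= DIRTY | HW_WRITE
            self.modified_pages.add(vpn)
        return pte

    def write(self, vpn, ptes):
        """Simulate a store: fault if the hardware Write bit is clear, then set Dirty."""
        pte = ptes[vpn]
        if not pte & HW_WRITE:
            pte = self.handle_write_fault(vpn, pte)
        ptes[vpn] = pte | DIRTY                   # the MMU's (now redundant) Dirty update


mm = ToyMemoryManager()
ptes = {0x10: mm.make_pte(0x3EF8C, writable=True),
        0x11: mm.make_pte(0x3EF8D, writable=False)}
mm.write(0x10, ptes)
mm.write(0x10, ptes)
print(mm.faults)  # 1: only the first write to the page faults
```

The point of the design is visible in `handle_write_fault`: the first Dirty update and the working set bookkeeping happen together under one lock, so later MMU updates carry no new information.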
shayi (reply #6)

Translation Look-Aside Buffer

As you’ve learned so far, each hardware address translation requires two lookups: one to find the right entry in the page directory (which provides the location of the page table) and one to find the right entry in the page table. Because doing two additional memory lookups for every reference to a virtual address would triple the required bandwidth to memory, resulting in poor performance, all CPUs cache address translations so that repeated accesses to the same addresses don’t have to be repeatedly translated. This cache is an array of associative memory called the translation look-aside buffer, or TLB. Associative memory is a vector whose cells can be read simultaneously and compared to a target value.

(Translator's note: the hardware TLB sits inside the CPU, next to the L1 caches, so do not confuse the associative “memory” the authors mention with main memory. The two are fundamentally different.)

In the case of the TLB, the vector contains the virtual-to-physical page mappings of the most recently used pages, as shown in Figure 10-20, and the type of page protection, size, attributes, and so on applied to each page.

(Translator's note: since the L1 cache is split into an instruction cache and a data cache, each comes with its own TLB, an instruction TLB and a data TLB. They cache the virtual-to-physical page mappings for instructions and for data respectively, which speeds up instruction fetch and data access. Intel’s “Core” architecture is one example.)

Each entry in the TLB is like a cache entry whose tag holds portions of the virtual address and whose data portion holds a physical page number, protection field, valid bit, and usually a dirty bit indicating the condition of the page to which the cached PTE corresponds. If a PTE’s global bit is set (as is done by Windows for system space pages that are visible to all processes), the TLB entry isn’t invalidated on process context switches.

Virtual addresses that are used frequently are likely to have entries in the TLB, which provides extremely fast virtual-to-physical address translation and, therefore, fast memory access. (Translator's note: a single physical memory access is then enough to fetch the instruction or data, which is why both the time and the bandwidth needed are only a third of what a by-the-book translation costs.) If a virtual address isn’t in the TLB, it might still be in memory, but multiple memory accesses are needed to find it, which makes the access time slightly slower. If a virtual page has been paged out of memory or if the memory manager changes the PTE, the memory manager is required to explicitly invalidate the TLB entry. (Translator's note: this is easy to see. Once the physical page described by the PTE has changed, the translation recorded in the TLB is wrong.) If a process accesses it again, a page fault occurs, and the memory manager brings the page back into memory (if needed) and re-creates its PTE entry (which then results in an entry for it in the TLB).

2017-6-3 18:01
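The TLB behavior described in reply #6 (hit, miss, explicit invalidation, and global entries surviving a process context switch) can be modeled very loosely as a dictionary. This is a deliberately simplified sketch: a real TLB is a small fixed-size associative structure with replacement policies, none of which is modeled, and the class and method names are hypothetical.

```python
class ToyTLB:
    """Caches virtual page number -> (physical page number, global flag)."""

    def __init__(self):
        self.entries = {}

    def lookup(self, va):
        """Return the physical address on a hit, or None on a miss."""
        entry = self.entries.get(va >> 12)
        if entry is None:
            return None
        pfn, _is_global = entry
        return (pfn << 12) | (va & 0xFFF)

    def fill(self, va, pfn, is_global=False):
        """Record a translation after a successful page-table walk."""
        self.entries[va >> 12] = (pfn, is_global)

    def invalidate(self, va):
        """Explicit invalidation, e.g. after the PTE changed or the page was paged out."""
        self.entries.pop(va >> 12, None)

    def context_switch(self):
        """Drop per-process entries; entries marked global are kept."""
        self.entries = {vpn: e for vpn, e in self.entries.items() if e[1]}


tlb = ToyTLB()
tlb.fill(0x00010004, 0x3EF8C)                   # user page from the earlier example
tlb.fill(0x80001000, 0x00123, is_global=True)   # hypothetical system-space page
print(hex(tlb.lookup(0x00010ABC)))              # 0x3ef8cabc: same page, different offset
tlb.context_switch()
print(tlb.lookup(0x00010004))                   # None: per-process entry is gone
```

After the simulated context switch only the global entry still hits, which mirrors the statement about PTEs with the global bit set.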
库尔 (reply #7): I already know how memory management works, but I'm reading through to see what I've missed. 2017-6-3 19:37
zgzxp (reply #8): Hats off. Translating a specialist book is a huge undertaking. Thank you. 2017-6-3 21:44
fengyunabc (reply #9): Thanks for the hard work, OP! 2017-6-3 23:06
MaYil (reply #10): Thanks for the translation, much appreciated. 2017-6-4 01:26
ugvjewxf (reply #11): Thanks in advance. I'll read and learn at my own pace, mainly to study the English. 2017-6-4 05:44
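shayi (reply #12)

Let's look at how TLBs are actually laid out inside a real CPU package, taking the Intel Core i7 as an example. As the accompanying figure shows, the TLBs are also split into instruction and data TLBs and, like the hardware caches, they form a hierarchy. Note that the L1 and L2 TLBs are virtually addressed, whereas the L1 to L3 caches are physically addressed.

When the CPU translates a virtual address, it sends the virtual page number (VPN) to the MMU and the virtual page offset (VPO) to the L1 cache. The two are sent in parallel, so the lookup of the physical page number (PPN) cached in the TLB and the lookup by physical page offset (PPO) in the L1 cache happen at the same time. PPN + PPO gives the physical address.

Depending on the situation, it is also possible to proceed as the translated text above describes: first copy the byte offset from the virtual address into the corresponding field of the physical address, then use the resulting physical address to check whether the L1 cache holds a matching entry. This sequential lookup is some clock cycles slower than the parallel one. If every level of the TLB misses, or every cache misses, the lookup has to go to main memory, which is many clock cycles slower.

2017-6-5 20:17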
shayi (reply #13)

Physical Address Extension (PAE)

The Intel x86 Pentium Pro processor introduced a memory-mapping mode called Physical Address Extension (PAE). With the proper chipset, the PAE mode allows 32-bit operating systems access to up to 64 GB of physical memory on current Intel x86 processors (up from 4 GB without PAE) and up to 1,024 GB of physical memory when running on x64 processors in legacy mode (although Windows currently limits this to 64 GB due to the size of the PFN database required to describe so much memory).

(Translator's note: do not confuse this with the native addressing limits of x64 processors, which are not governed by PAE but by the number of address lines actually implemented. See the section “x64 Virtual Addressing Limitations” in part three of my translation.)

When the processor is running in PAE mode, the memory management unit (MMU) divides virtual addresses mapped by normal pages into four fields, as shown in Figure 10-21. The MMU still implements page directories and page tables, but under PAE a third level, the page directory pointer table, exists above them.

One way in which 32-bit applications can take advantage of such large memory configurations is described in the earlier section “Address Windowing Extensions.” However, even if applications are not using such functions, the memory manager will use all available physical memory for multiple processes’ working sets, file cache, and trimmed private data through the use of the system cache, standby, and modified lists (described in the section “Page Frame Number Database”).

PAE mode is selected at boot time and cannot be changed without rebooting. As explained in Chapter 2 in Part 1, there is a special version of the 32-bit Windows kernel with support for PAE called Ntkrnlpa.exe. Thirty-two-bit systems that have hardware support for nonexecutable memory (described earlier, in the section “No Execute Page Protection”) are booted by default using this PAE kernel, because PAE mode is required to implement the no-execute feature. To force the loading of the PAE-enabled kernel, you can set the pae BCD option to ForceEnable.

(Translator's note: the authors are a bit lazy and keep pointing back to earlier sections, so please refer to the relevant material in part one of my translation to get the full picture.)

Note that the PAE kernel is installed on the disk on all 32-bit Windows systems, even systems with small memory and without hardware no-execute support. This is to allow testing of PAE-related code, even on small memory systems, and to avoid the need for reinstalling Windows should more RAM be added later. Another BCD option relevant to PAE is nolowmem, which discards memory below 4 GB (assuming you have at least 5 GB of physical memory) and relocates device drivers above this range. This guarantees that drivers will be presented with physical addresses greater than 32 bits, which makes any possible driver sign extension bugs easier to find.

To understand PAE, it is useful to understand the derivation of the sizes of the various structures and bit fields. Recall that the goal of PAE is to allow addressing of more than 4 GB of RAM. The 4-GB limit for RAM addresses without PAE comes from the 12-bit byte offset and the 20-bit page frame number fields of physical addresses: 12 + 20 = 32 bits of physical address, and 2 ^ 32 bytes = 4 GB. (Note that this is due to a limit of the physical address format and the number of bits allocated for the PFN within a page table entry. The fact that virtual addresses are 32 bits wide on x86, with or without PAE, does not limit the physical address space.) Under PAE, the PFN is expanded to 24 bits. Combined with the 12-bit byte offset, this allows addressing of 2 ^ (24 + 12) bytes, or 64 GB, of memory.

To provide the 24-bit PFN, PAE expands the PFN fields of page table and page directory entries from 20 to 24 bits. To allow room for this expansion, the page table and page directory entries are 8 bytes wide instead of 4. (This would seem to expand the PFN field of the PTE and PDE by 32 bits rather than just 4, but in x86 processors, PFNs are limited to 24 bits. This does leave a large number of bits in the PDE unused, or, rather, available for future expansion.) Since both page tables and page directories have to fit in one page, these tables can then have only 512 entries instead of 1,024. (Translator's note: each entry is now 8 bytes, so one page holds only 512 of them.) So the corresponding index fields of the virtual address are accordingly reduced from 10 to 9 bits.

(Translator's note: see Figure 10-21. The page directory pointer index takes 2 bits, and together with the two 9-bit index fields that makes 20 bits, the same total width as in non-PAE mode.)

This then leaves the two high-order bits of the virtual address unaccounted for. So PAE expands the number of page directories from one to four and adds a third-level address translation table, called the page directory pointer table, or PDPT. This table contains only four entries, 8 bytes each, which provide the PFNs of the four page directories. The two high-order bits of the virtual address are used to index into the PDPT and are called the page directory pointer index.

As before, CR3 provides the location of the top-level table, but that is now the PDPT rather than the page directory. The PDPT must be aligned on a 32-byte boundary and must furthermore reside in the first 4 GB of RAM (because CR3 on x86 is only a 32-bit register, even with PAE enabled). Note that PAE mode can address more memory than the standard translation mode not directly because of the extra level of translation, but because the physical address format has been expanded. The extra level of translation is required to allow processing of all 32 bits of a virtual address.

2017-6-9 17:44
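The size derivations in reply #13 are plain arithmetic, and the sketch below spells them out together with the 2 + 9 + 9 + 12 split of a PAE virtual address. The function name `split_va_pae` is illustrative; the field widths and entry sizes are the ones given in the text.

```python
PAGE_SIZE = 4096

# Entries per table: a table must fit in one page.
entries_non_pae = PAGE_SIZE // 4   # 4-byte entries -> 1,024 entries, 10-bit index
entries_pae = PAGE_SIZE // 8       # 8-byte entries ->   512 entries,  9-bit index

# Addressable physical memory: 2 ^ (PFN bits + byte offset bits).
limit_non_pae = 2 ** (20 + 12)     # 4 GB
limit_pae = 2 ** (24 + 12)         # 64 GB


def split_va_pae(va):
    """Split a 32-bit virtual address into (PDPT index, PD index, PT index, byte offset)."""
    return (va >> 30) & 0x3, (va >> 21) & 0x1FF, (va >> 12) & 0x1FF, va & 0xFFF


print(entries_non_pae, entries_pae)                  # 1024 512
print(limit_non_pae // 2**30, limit_pae // 2**30)    # 4 64
print(split_va_pae(0xFFFFFFFF))                      # (3, 511, 511, 4095)
```

The widths still add up to 32 bits (2 + 9 + 9 + 12), which is why the extra PDPT level is needed: without it, the two high-order bits of the virtual address would go unused.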

shayi 9: 14 楼

EXPERIMENT: Translating Addresses

To clarify how address translation works, this experiment shows a real example of translating a virtual address on an x86 PAE system, using the available tools in the kernel debugger to examine the PDPT, page directories, page tables, and PTEs. (It is common for Windows on today’s x86 processors, even with less than 4 GB of RAM, to run in PAE mode because PAE mode is required to enable no-execute memory access protection.) In this example, we’ll work with a process that has virtual address 0x30004, currently mapped to a valid physical address. In later examples, you’ll see how to follow address translation for invalid addresses with the kernel debugger.

First let’s convert 0x30004 to binary and break it into the three fields that are used to translate an address. In binary, 0x30004 is 11.0000.0000.0000.0100. Breaking it into the component fields yields the following:

实验：地址转译

本实验会演示在一个 x86 PAE 系统上转译虚拟地址的真实案例，以便把地址转译的原理说清楚讲明白，使用内核调试器中可用的工具（命令），来审查 PDPT，页目录，页表，以及页表项。（如前所述——作者比我还罗嗦——在当今启用了 PAE 模式的 x86 处理器上运行 Windows 是很常见的，甚至少于 4 GB 物理内存的系统也运行在 PAE 模式下，因为需要 PAE 模式才能够启用不可执行内存的访问保护。）在这个例子中，我们将要剖析的进程虚拟地址为 0x30004，当前被映射到一个有效的物理地址。而在稍后的例子中，你将看到如何通过内核调试器来为无效地址跟踪它的转换过程。

1。首先，我们把 0x30004 转换为二进制表示，再将其分解成用于翻译地址的三个字段。0x30004 的二进制表示为“11.0000.0000.0000.0100”。分解后的组成字段如下图所示：

To start the translation process, the CPU needs the physical address of the process’s page directory pointer table, found in the CR3 register while a thread in that process is running. You can display this address by looking at the DirBase field in the output of the !process command, as shown here:

为了启动转换过程，CPU 需要该进程的页目录指针表的物理地址，这个进程内的一个线程在运行时，可以从 CR3 寄存器中找到该地址。你可以在“!process”命令的输出中，查看“DirBase”字段，它显示出页目录指针表的物理地址，如下所示：

lkd> !process -1 0
PROCESS 852d1030	SessionId: 1	Cid: 0dec		Peb: 7ffdf000	ParentCid: 05e8
	DirBase: ced25440	ObjectTable: a2014a08	HandleCount: 221.
	Image: windbg.exe

The DirBase field shows that the page directory pointer table is at physical address 0xced25440. As shown in the preceding illustration, the page directory pointer table index field in our example virtual address is 0. Therefore, the PDPT entry that contains the physical address of the relevant page directory is the first entry in the PDPT, at physical address 0xced25440.

As under x86 non-PAE systems, the kernel debugger !pte command displays the PDE and PTE that describe a virtual address, as shown here:

2。上面输出中的 DirBase 字段显示出页目录指针表位于物理地址 0xced25440。如前一张插图所示，虚拟地址中，页目录指针表索引字段的值为 0。因此，含有相关页目录物理地址的 PDPT 项（PDPTE）就是该 PDPT 中的第一个条目，位于物理地址 0xced25440。与在 x86 非 PAE 系统上一样，内核调试器的“!pte”命令能够显示描述虚拟地址的 PDE 和 PTE，如下所示：

lkd> !pte 30004
			VA 00030004
PDE at C0600000			PTE at C0000180
contains 000000002EBF3867	contains 800000005AF4D025
pfn 2ebf3 ---DA--UWEV		pfn 5af4d ----A--UR-V

The debugger does not show the page directory pointer table, but it is easy to display given its physical address:

虽然上面输出中没有显示出页目录指针表的内容，但是通过调试器扩展命令“!dq”，后接上“表格”的物理地址，就能够对其进行查看：

lkd> !dq ced25440 L 4
#ced25440 00000000`2e8ff801 00000000`2c9d8801
#ced25450 00000000`2e6b1801 00000000`2e73a801

Here we have used the debugger extension command !dq. This is similar to the dq command (display as quadwords—“quadwords” being a name for a 64-bit field; this came from the day when “words” were often 16 bits), but it lets us examine memory by physical rather than virtual address. Since we know that the PDPT is only four entries long, we added the L 4 length argument to keep the output uncluttered.

“!dq”命令与“dq”命令类似（按照“四字”的格式输出——“四字”已演变为 64 位字段的一个别名；这可以追溯到当“字”成为一个 16 位值同义词的那天起），但是“!dq”命令允许我们查看物理内存地址，而不是虚拟内存地址。既然我们已知 PDPT 内只有四个条目，就可以在命令后添加“L 4”长度参数，让输出保持整洁易读。

As illustrated previously, the PDPT index (the two most significant bits) from our example virtual address equal 0, so the PDPT entry we want is the first displayed quadword. PDPT entries have a format similar to PD entries and PT entries, so we can see by inspection that this one contains a PFN of 0x2e8ff, for a physical address of 2e8ff000. That’s the physical address of the page directory.

3。如前面那张插图所示，虚拟地址中的 PDPT 索引（两个最高有效位）等于 0，所以第一个显示的“四字”就是我们要找的 PDPTE。 PDPTE 的格式与 PDE 和 PTE 很相似，因此检查后我们知道，首个 PDPTE 的 PFN 为 0x2e8ff，它描述的物理地址从 0x2e8ff000 开始——亦即该页目录的物理地址。（0x801 是所在物理页面的状态标志）

The !pte output shows the PDE address as a virtual address, not physical. On x86 systems with PAE, the first process page directory starts at virtual address 0xC0600000. The page directory index field of our example virtual address is 0, so we’re looking at the first PDE in the page directory. Therefore, in this case, the PDE address is the same as the page directory address.

4。前面“!pte”命令输出 PDE 的虚拟地址，而非物理地址。在开启 PAE 的 x86 系统上，该进程的第一个页目录从虚拟地址 0xC0600000 开始。如前面那张插图所示，虚拟地址中的页目录索引字段为 0，因此我们需要寻找“第一个页目录中的第一个 PDE”。既然这样，首个 PDE 的地址就与首个页目录地址相同。

As with non-PAE, the page directory entry provides the PFN of the needed page table; in this example, the PFN is 0x2ebf3. So the page table starts at physical address 0x2ebf3000. To this the MMU will add the page table index field (0x30) from the virtual address, multiplied by 8 (the size of a PTE in bytes; this would be 4 on a non-PAE system). The resulting physical address of the PTE is then 0x2ebf3180.

5。与非 PAE 下的情况相同，PDE 提供了所需页表的 PFN；在本例中是 0x2ebf3。因此，对应的页表从物理地址 0x2ebf3000 处开始。

6。接下来，MMU 会把（如前面那张插图所示）虚拟地址中的页表索引字段值（0x30）乘以 8（PTE 的大小，以字节为单位，在非 PAE 系统上是 4 字节），得出 0x180，因此“第一个页目录中，第一张页表中，第 0x30 号 PTE 的物理地址为 0x2ebf3180”。

The debugger shows that this PTE is at virtual address 0xC0000180. Notice that the byte offset portion (0x180) is the same as that from the physical address, as is always the case in address translation. Because the memory manager maps page tables starting at 0xC0000000, adding 0x180 to 0xC0000000 yields the virtual address shown in the kernel debugger output: 0xC0000180. The debugger shows that the PFN field of the PTE is 0x5af4d.

前面调试器的“!pte”命令输出显示，第 0x30 号 PTE 被映射到虚拟地址 0xC0000180。注意，该虚拟地址的字节偏移部分（0x180）与物理地址（0x2ebf3180）中的一致，这是地址翻译秉持的一贯做法。因为内存管理器把“第一个页目录中的第一张页表”映射到从虚拟地址 0xC0000000 开始，把它加上字节偏移 0x180，就产出了内核调试器输出中的虚拟地址 0xC0000180。

7。根据前面调试器的“!pte”命令输出得知，第 0x30 号 PTE 描述的物理页的 PFN 为 0x5af4d。（该页面从 0x5af4d000 开始）

Finally, we can consider the byte offset from the original address. As described previously, the MMU will concatenate the byte offset to the PFN from the PTE, giving a physical address of 0x5af4d004. This is the physical address that corresponds to the original virtual address of 0x30004—at the moment.

8。最后，我们就可以来考虑原始地址（如前面那张插图所示）中的字节偏移。如前所述，MMU 会把字节偏移串联在 PTE 中的 PFN 字段之后，从而给出物理地址 0x5af4d004。这就是“眼前”对应于原始虚拟地址 0x30004 的物理地址。

The flags bits from the PTE are interpreted to the right of the PFN number. For example, the PTE that describes the page being referenced has flags of --A--UR-V. Here, A stands for accessed (the page has been read), U for user-mode accessible (as opposed to kernel-mode accessible only), R for read-only page (rather than writable), and V for valid (the PTE represents a valid page in physical memory).

PFN 号的右侧被解释成该 PTE 中的标志位（译注：即 0x025）。例如，描述被引用页面的 PTE 标志为 --A--UR-V。此处，A 代表已访问（该页已被读取过），U 代表用户模式可访问（而非仅在内核模式下可访问），R 代表只读页（而非可写），V 代表有效（即，该 PTE 表示物理内存中的一个有效页）

To confirm our calculation of the physical address, we can look at the memory in question via both its virtual and its physical addresses. First, using the debugger’s dd command (display dwords) on the virtual address, we see the following:

为了证实我们的“手动”物理地址计算结果，我们可以同时检查该虚拟地址和物理地址处的相关内存内容，如果一致，表明翻译结果是正确的。首先，对该虚拟地址使用调试器的“dd”命令（显示“双字”），如下所示：

lkd> dd 30004
00030004 00000020 00000001 00003020 000000dc
00030014 00000000 00000020 00000000 00000014
00030024 00000001 00000007 00000034 0000017c
00030034 00000001 00000000 00000000 00000000
00030044 00000000 00000000 00000002 1a26ef4e
00030054 00000298 00000044 000002e0 00000260
00030064 00000000 f33271ba 00000540 0000004a
00030074 0000058c 0000031e 00000000 2d59495b

And with the !dd command on the physical address just computed, we see the same contents:

接着对我们刚才计算出的物理地址使用“!dd”命令，得出下面内容：

lkd> !dd 5af4d004
#5af4d004 00000020 00000001 00003020 000000dc
#5af4d014 00000000 00000020 00000000 00000014
#5af4d024 00000001 00000007 00000034 0000017c
#5af4d034 00000001 00000000 00000000 00000000
#5af4d044 00000000 00000000 00000002 1a26ef4e
#5af4d054 00000298 00000044 000002e0 00000260
#5af4d064 00000000 f33271ba 00000540 0000004a
#5af4d074 0000058c 0000031e 00000000 2d59495b

从这两个地址开始后的一段内存区域都存储着相同的内容，表明两者间的映射关系正确。

We could similarly compare the displays from the virtual and physical addresses of the PTE and PDE.

我们也可以分别把 PTE 和 PDE 的虚拟地址与物理地址内容输出，作类似的比较。（译注：以 PDE 为例子，就是对比物理地址 0x2e8ff000 处的内容，是否与虚拟地址 0xC0600000 处的内容一致）

2017-6-9 18:03
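
The arithmetic of this experiment can be replayed in a few lines. The sketch below is only an illustration: the function names and the `PFN_MASK` constant are made up for this example, and the entry values are the ones printed by `!dq` and `!pte` above. It assumes the PAE field layout described in the text (2-bit PDPT index, 9-bit directory and table indexes, 12-bit byte offset, 8-byte entries).

```python
PTE_SIZE = 8                    # bytes per PDPTE/PDE/PTE under PAE
PTE_BASE = 0xC0000000           # page tables are mapped starting here
# Illustrative mask: keeps the frame address, drops the low flag bits
# and bit 63 (the no-execute bit set in the PTE shown by !pte).
PFN_MASK = 0x000FFFFFFFFFF000


def frame_base(entry):
    """Physical address of the page that a PDPTE, PDE, or PTE points to."""
    return entry & PFN_MASK


def walk(va, pdpt_base, pdpte, pde, pte):
    """Recompute the addresses that the experiment derives by hand."""
    pdpt_index = (va >> 30) & 0x3
    pd_index = (va >> 21) & 0x1FF
    pt_index = (va >> 12) & 0x1FF
    offset = va & 0xFFF
    return {
        "pdpte_phys": pdpt_base + pdpt_index * PTE_SIZE,
        "pde_phys": frame_base(pdpte) + pd_index * PTE_SIZE,
        "pte_phys": frame_base(pde) + pt_index * PTE_SIZE,
        "pte_virt": PTE_BASE + (va >> 12) * PTE_SIZE,
        "physical": frame_base(pte) | offset,
    }


result = walk(0x30004, 0xCED25440,
              pdpte=0x000000002E8FF801,
              pde=0x000000002EBF3867,
              pte=0x800000005AF4D025)
for name, value in result.items():
    print(f"{name:11} {value:#x}")
```

The values it computes match the ones quoted in the text: the PTE sits at physical 0x2ebf3180 and virtual 0xC0000180, and the final physical address is 0x5af4d004.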

yangya: 15 楼

还没完？

2017-11-8 23:06

cqzhou: 16 楼

英语渣路过

2017-11-9 08:44

ntDownload: 17 楼

支持！

2017-11-9 09:33

hpsales: 18 楼

这样最好，双语一起的

2017-11-18 09:54

shayi 9: 19 楼

yangya：还没完？

您好，最近因为工作关系加之年假期间可能无法按时按量翻译，以原文的分量来看，可能需要开到第七部分才能把内存管理章节翻译完

2018-2-14 15:04

挽梦雪舞: 20 楼

引用失败，顶楼！

2018-3-1 02:59

挽梦雪舞: 21 楼

shayi：您好，最近因为工作关系加之年假期间可能无法按时按量翻译，以原文的分量来看，可能需要开到第七部分才能把内存管理章节翻译完

看了你的双语翻译，我已经对此有所了解，同时关于win操作系统我也有过一些了解，看过一些英文原著，不知可否将后面的部分未翻译文章发给我，让我尝试翻译一番？！还有一个问题就是为什么不直接在原文进行编辑补充呢？

2018-3-1 03:02

Ox9A82 3: 22 楼

功德无量

2018-3-13 00:51

玉涵 5: 23 楼

这本书的下册翻译版不清楚为啥夭折了，下册的含金量可以说是非常之高，感谢译者，呕心沥血。

2018-3-15 20:49

geoh: 24 楼

打包成PDF，方便大侠们下载

上传的附件：深入解析windows操作系统第6版下册第10章-内存管理第四部分.pdf （1.54MB）

2018-4-9 13:50

shayi 9: 25 楼

geoh：打包成PDF，方便大侠们下载

感谢您的帮忙

2018-4-12 20:35

shayi 9: 2 楼

The sizes of these bit fields are dictated by the structures they reference. For example, the byte
offset is 12 bits because it denotes a byte within a page, and pages are 4,096 bytes (2 ^ 12 = 4,096). The
other indexes are 10 bits because the structures they index have 1,024 entries (2 ^ 10 = 1,024).
这些位域的大小取决于它们所引用的结构。比如，“字节偏移”域是 12 位，因为该字段需要指示出页面内的一个字节，而页大小
一般为 4,096 字节（2 ^ 12 = 4,096）。（译注：按作者意思，12 位才能够涵盖一个页中所有的字节）
其它的索引字段则是 10 位，因为它们索引的结构中，有 1,024 个条目（2 ^ 10 = 1,024）
（译注：即，一个 PD[页目录] 中，有 1,024 个 PDE[页目录项]，每个 PDE 索引一个 PT[页表]；
一个 PT 中，有 1,024 个 PTE[页表项]，每个 PTE 索引一个范围 4,096 字节的页面）
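
A short sketch may make the 10/10/12 split concrete. It simply masks and shifts a 32-bit non-PAE virtual address; the names `split_va`, `PAGE_SHIFT`, and `INDEX_BITS` are chosen here for illustration and do not come from Windows.

```python
PAGE_SHIFT = 12   # byte offset: 2**12 = 4,096 bytes per page
INDEX_BITS = 10   # each index: 2**10 = 1,024 entries per table


def split_va(va):
    """Return (page directory index, page table index, byte offset)."""
    index_mask = (1 << INDEX_BITS) - 1
    offset = va & ((1 << PAGE_SHIFT) - 1)
    pt_index = (va >> PAGE_SHIFT) & index_mask
    pd_index = (va >> (PAGE_SHIFT + INDEX_BITS)) & index_mask
    return pd_index, pt_index, offset


pd_index, pt_index, offset = split_va(0x00010004)
print(pd_index, hex(pt_index), offset)
```

The three widths add up to 10 + 10 + 12 = 32 bits, so every bit of the virtual address belongs to exactly one field.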

The job of virtual address translation is to convert these virtual addresses into physical addresses—
that is, addresses of locations in RAM. The format of a physical address on an x86 non-PAE system is
shown in Figure 10-17.
虚拟地址翻译，这项工作就是把虚拟地址转换成物理地址——亦即，RAM（随机访问存储器）中位置的地址。在非 PAE 的 x86 系统上，物理地址的格式如图 10-17 所示。

As you can see, the format is very similar to that of a virtual address. Furthermore, the byte offset
value from a virtual address will be the same in the resulting physical address. We can say, then, that
address translation involves converting virtual page numbers to physical page numbers (also referred
to as page frame numbers, or PFNs). The byte offset does not participate in, and does not change as a
result of, address translation. It is simply copied from the virtual address to the physical address.
如你所见，它的格式与虚拟地址非常相似。此外，转换产生的物理地址中的“字节偏移”值，将与原虚拟地址中的相同。
如此一来，我们可以说，地址翻译涉及把虚拟页号转换成物理页号（亦称为“页框号”，或 PFN）。事实上，“字节偏移”并不参与转换过程，也不因地址转换的结果而改变——只是简单地把它从虚拟地址复制到物理地址中。

Figure 10-18 shows the relationship of these three values and how they are used to perform address
translation.
图 10-18 描绘了这三个值（字段）之间的关系，以及它们如何被用来执行地址翻译。

The following basic steps are involved in translating a virtual address:
1. The memory management unit (MMU) uses a privileged CPU register, CR3, to obtain the
physical address of the page directory.
2. The page directory index portion of the virtual address is used as an index into the page
directory.
This locates the page directory entry (PDE) that contains the location of the page table needed to map the virtual
address.
The PDE in turn contains the physical page number, also called the page frame number, or PFN, of the desired page table,
provided the page table is resident—page tables can be paged out or not yet created, and in those cases, the page
table is first made resident before proceeding.
If a flag in the PDE indicates that it describes a large page, then it simply contains the PFN of the target large page,
and the rest of the virtual address is treated as the byte offset within the large page.　
3. The page table index is used as an index into the page table to locate the PTE that describes
the virtual page in question.
4. If the PTE’s valid bit is clear, this triggers a page fault (memory management fault). The operating
system’s memory management fault handler (pager) locates the page and tries to make
it valid; after doing so, this sequence continues at step 5. (See the section “Page Fault Handling.”)
If the page cannot or should not be made valid (for example, because of a protection
fault), the fault handler generates an access violation or a bug check.
5. When the PTE describes a valid page (whether immediately or after page fault resolution), the
desired physical address is constructed from the PFN field of the PTE, followed by the byte
offset field from the original virtual address.
Now that you have the overall picture, let’s look at the detailed structure of page directories, page
tables, and PTEs.

翻译一个虚拟地址涉及到以下基本步骤：
1。内存管理单元使用一个叫做 CR3 的特权 CPU 寄存器，来获取页目录的物理地址。
2。虚拟地址中的“页目录索引”部分被用于索引页目录中的“项”（PDE）。这就能够找到特定的 PDE ，它含有映射虚拟地址必需
的页表位置。换言之，PDE 中包含：所期望页表的物理页号，或称页框号，或 PFN——在该页表驻留物理内存的条件下——
页表可以被换出物理内存，或根本尚未创建，在后两种情况中，首先要让该页表驻留在物理内存中，才能进行后续处理。
如果 PDE 中的一个标志指出它描述的是一个大页，那它就包含目标大页的 PFN，而虚拟地址的其余部分被视为该大页内的字节偏移。
3。“页表索引”被用于在页表中，找出负责描述相关虚拟页的 PTE。（译注：如图所示，PTE 包含所期望“页面”的 PFN）
4。如果该 PTE 的有效位被清除，这将触发一次页错误（内存管理故障）。操作系统的内存管理故障处理程序（换页器）会找到
该页面，并试图让它有效（请参考“页错误处理”一节）；在此之后，继续步骤 5 中的操作。
如果无法或者不应该将此页面设置为有效（比如，由于页保护导致的错误），错误处理程序生成一个非法访问异常，或者 bug check
蓝屏。（译注：即，崩溃在内核模式下）
5。当该 PTE 能够描述一个有效的页面时（无论在当下，还是在分页异常得到解决后），就从该 PTE 的 PFN 字段构建出期望的物理地址，
后面紧跟着原始虚拟地址中的“字节偏移”字段。
现在你已经掌握了地址翻译的大局，让我们再来研究页目录，页表，以及页表项的细部结构。
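
The five steps can be sketched as a small simulation. This is not how the MMU or Windows is implemented: physical memory is modelled as a Python dict, a missing page table simply raises `PageFault` instead of being made resident, and the bit positions used (valid = bit 0, large page = bit 7) are the standard x86 ones, stated here as assumptions of the sketch. The sample entries are taken from the `!process` and `!pte 10004` outputs quoted later in this thread.

```python
VALID = 0x001   # bit 0: entry is valid
LARGE = 0x080   # bit 7 of a PDE: it maps a large (4-MB) page


class PageFault(Exception):
    """Raised where the real system would invoke the fault handler."""


def translate(va, cr3, phys_mem):
    """Walk a simulated non-PAE page directory and page table.

    phys_mem maps a physical address to the 32-bit entry stored there.
    """
    pd_index = (va >> 22) & 0x3FF
    pt_index = (va >> 12) & 0x3FF
    offset = va & 0xFFF

    # Steps 1-2: CR3 locates the page directory; index it to get the PDE.
    pde = phys_mem.get(cr3 + pd_index * 4, 0)
    if not pde & VALID:
        raise PageFault(hex(va))          # page table not resident
    if pde & LARGE:
        # Large page: the rest of the address is the byte offset.
        return (pde & 0xFFC00000) | (va & 0x003FFFFF)

    # Step 3: the PDE's PFN locates the page table; index it for the PTE.
    pte = phys_mem.get((pde & 0xFFFFF000) + pt_index * 4, 0)

    # Step 4: a clear valid bit triggers a page fault.
    if not pte & VALID:
        raise PageFault(hex(va))

    # Step 5: PFN from the PTE, followed by the original byte offset.
    return (pte & 0xFFFFF000) | offset


memory = {0x47C9B000: 0x6F06B867,   # PDE 0 of the page directory
          0x6F06B040: 0x3EF8C847}   # PTE 0x10 of the page table
print(hex(translate(0x10004, 0x47C9B000, memory)))
```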

2017-6-3 17:35

shayi 9: 3 楼

Page Directories

On non-PAE x86 systems, each process has a single page directory, a page the memory manager creates
to map the location of all page tables for that process.
The physical address of the process page
directory is stored in the kernel process (KPROCESS) block, but it is also mapped virtually at address
0xC0300000 on x86 non-PAE systems. (For more detailed information about the KPROCESS and other
process data structures, refer to Chapter 5, “Processes, Threads, and Jobs” in Part 1.)

页目录

在非 PAE 的 x86 系统上，每个进程都有单一的页目录，这是内存管理器创建来映射该进程所有页表位置的页面。
进程页目录的物理地址存储在“内核进程块”（KPROCESS）中，但在 x86 非 PAE 系统上，它也虚拟地映射到地址 0xC0300000 处。
（更多有关 KPROCESS 和其它进程数据结构的信息，请参考本书上册第五章“Processes, Threads, and Jobs”）

The CPU obtains the location of the page directory from a privileged CPU register called CR3.
It contains the page frame number of the page directory. (Since the page directory is itself always
page-aligned, the low-order 12 bits of its address are always zero, so there is no need for CR3 to supply
these.)
Each time a context switch occurs to a thread that is in a different process than that of the
currently executing thread, the context switch routine in the kernel loads this register from a field in
the KPROCESS block of the new process. Context switches between threads in the same process don’t
result in reloading the physical address of the page directory because all threads within the same
process share the same process address space and thus use the same page directory and page tables.

CPU 从 CR3 寄存器获取页目录的位置。CR3 包含页目录的页框号。（由于页目录自身总是按页对齐的，页目录地址中的低 12 位总是
为零，因此 CR3 无需提供“字节偏移”信息。）
每当上下文切换的目标线程与当前执行线程分属不同进程时，内核中的上下文切换例程就会从新进程 KPROCESS 块的一个字段
（译注：即 DirectoryTableBase 字段）中取值，载入到 CR3 寄存器。同一个进程内的线程间上下文切换，并不会导致重载页目录的物理地址，
因为相同进程内的所有线程共享同一个进程地址空间，因而使用相同的页目录和页表。
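
The rule "reload CR3 only when the process changes" can be shown with a toy model. The classes below are hypothetical stand-ins, not kernel structures: `directory_table_base` plays the role of the KPROCESS field, and the two base addresses are the DirBase values that appear in this thread (windbg.exe and System).

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    directory_table_base: int   # stand-in for the KPROCESS field


@dataclass
class Thread:
    process: Process


class Cpu:
    def __init__(self):
        self.cr3 = None
        self.cr3_loads = 0

    def switch_to(self, current, new):
        """Switch threads; reload CR3 only if the process differs."""
        if current is None or current.process is not new.process:
            self.cr3 = new.process.directory_table_base
            self.cr3_loads += 1


system = Process("System", 0x00185000)
windbg = Process("windbg.exe", 0x47C9B000)
t1, t2, t3 = Thread(system), Thread(windbg), Thread(windbg)

cpu = Cpu()
cpu.switch_to(None, t1)   # first thread: CR3 loaded
cpu.switch_to(t1, t2)     # different process: CR3 reloaded
cpu.switch_to(t2, t3)     # same process: CR3 left alone
print(hex(cpu.cr3), cpu.cr3_loads)
```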

The page directory is composed of page directory entries (PDEs), each of which is 4 bytes long.
The PDEs in the page directory describe the state and location of all the possible page tables for the
process. As described later in the chapter, page tables are created on demand, so the page directory
for most processes points only to a small set of page tables.
(If a page table does not yet exist, the VAD tree is consulted to determine whether an access should materialize it.)
The format of a PDE isn’t repeated here because it’s mostly the same as a hardware PTE, which is described shortly.
页目录由页目录项（PDE）组成，每个 PDE 大小为四字节。页目录中的 PDE 描述所有可能用于该进程页表的状态与位置。
正如本章稍后所述，页表是按需创建的，因此多数进程的页目录仅指向很小的一组页表。
（如果一张页表尚不存在，就会查阅 VAD 树，以确定此次访问是否应当促使该页表被创建出来。）
此处不会重复介绍 PDE 的格式，因为它与硬件 PTE 大致相同，稍后会对其进行讨论。

To describe the full 4-GB virtual address space, 1,024 page tables are required. The process page
directory that maps these page tables contains 1,024 PDEs. Therefore, the page directory index needs
to be 10 bits wide (2 ^ 10 = 1,024).
为了能够描述整个 4GB 的虚拟地址空间，需要 1,024 张页表。所以，映射这些页表的进程页目录就包含 1,024 个 PDE（译注：
每个 PDE 描述一张页表）。因此，页目录索引的宽度必须是 10 位（2 ^ 10 = 1,024）。

EXPERIMENT: Examining the Page Directory and PDEs
You can see the physical address of the currently running process’s page directory by examining
the DirBase field in the !process kernel debugger output:

实验：查看页目录和 PDE
你可以通过查看内核调试器的“!process”命令输出中的“DirBase”字段，来确认当前运行进程的页目录物理地址：

lkd> !process -1 0
PROCESS 857b3528	SessionId: 1	Cid: 0f70		Peb: 7ffdf000	ParentCid: 0818
	DirBase: 47c9b000	ObjectTable: b4c56c48	HandleCount: 226.
	Image: windbg.exe

You can see the page directory’s virtual address by examining the kernel debugger output
for the PTE of a particular virtual address, as shown here:

你可以通过考察内核调试器为特定虚拟地址的 PTE 输出（执行“!pte”命令），来确认页目录的虚拟地址，如下所示：

lkd> !pte 10004
			VA 00010004
PDE at C0300000		PTE at C0000040
contains 6F06B867		contains 3EF8C847
pfn 6f06b ---DA--UWEV	pfn 3ef8c ---D---UWEV

（译注：虚拟地址 0x00010004 的页目录索引为 0，因此对应的 PDE 就是页目录中的第一个“项”，其虚拟地址为 0xC0300000，验证了前文的内容；对应 PTE 的虚拟地址为 0xC0000040。
这两个虚拟地址上的 PDE 与 PTE 共同用来解析命令中指定的虚拟地址 0x00010004。输出中的第三行是 4 字节的 PDE 与 PTE 内容：
高 20 位是它们所描述对象（页表或页面）的页框号，低 12 位则是目标页表或页面的状态标志。
例如，该 PTE 的内容为 0x3EF8C847，这表明它描述的页面的页框号为 0x3EF8C，即该页面从物理地址 0x3EF8C000 开始；低 12 位 0x847 被解释为：
“V”置位，表示该页面有效；“E”表示该页面可执行；“W”置位，表示该页面可写；“U”置位，表示该页面可由用户模式访问；
“D”置位，表示该页面是“脏页”，即已被写入过）
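
The addresses and flags in the `!pte 10004` output can be recomputed by hand. The sketch below assumes the non-PAE layout given in the text (page tables mapped from 0xC0000000, page directory at 0xC0300000, 4-byte entries) and the standard x86 flag positions for V, W, U, A, and D; the helper names are invented for this example, and the debugger's E flag is not decoded.

```python
PTE_BASE = 0xC0000000   # page tables are mapped starting here (non-PAE)
PDE_BASE = 0xC0300000   # the page directory is mapped here (non-PAE)
ENTRY_SIZE = 4          # bytes per PDE/PTE without PAE

# Flag letters as printed by !pte, with their x86 bit positions.
FLAG_BITS = [(6, "D"), (5, "A"), (2, "U"), (1, "W"), (0, "V")]


def pte_va(va):
    """Virtual address at which the PTE for va is mapped."""
    return PTE_BASE + (va >> 12) * ENTRY_SIZE


def pde_va(va):
    """Virtual address at which the PDE for va is mapped."""
    return PDE_BASE + (va >> 22) * ENTRY_SIZE


def decode(entry):
    """Split a 32-bit PDE or PTE into (PFN, string of set flags)."""
    flags = "".join(name for bit, name in FLAG_BITS if entry & (1 << bit))
    return entry >> 12, flags


print(hex(pde_va(0x10004)), hex(pte_va(0x10004)))
print(decode(0x6F06B867), decode(0x3EF8C847))
```

The only difference between the two entries' flags is the Accessed bit, which matches the `---DA--UWEV` and `---D---UWEV` strings in the debugger output.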

The PTE part of the kernel debugger output is defined in the section “Page Tables and Page
Table Entries.” We will describe this output further in the section on x86 PAE translation.

上面内核调试器输出中的 PTE 部分，在“页表和页表条目”一节中解释。我们将在“x86 上的 PAE 地址转换”一节中，
进一步讨论这个输出的含义。
（译注：我又来罗嗦几句了——既然原书作者喜欢卖关子，那我就要把前面讲过的一些知识点联系在一起，并且通过实战来总结，如下图所示
，调试机上运行 Windbg.exe ，通过命名管道模拟连接至被调试虚拟机上的 COM 口，后者以调试模式启动，整个命令带上参数可以是这样：

Windbg.exe -n -v -logo d:\kernel_realtime_debugging.txt -k com:pipe,port=\\.\pipe\com_1,baud=115200,reconnect

其中，“-logo”选项后接用来记录调试日志的文件路径。

下图显示出，被调试机上当前运行着 System 进程，它的页目录物理基地址是 0x00185000，这与 CR3 的当前值一致；
System 进程的“执行体进程块”——EPROCESS 结构——的虚拟地址为 0x85534408 ；由于 EPROCESS 结构中的第一个字段“Pcb”就是
一个内嵌的 KPROCESS 结构，因此它的地址就是内核进程块的地址，这就是为啥用 dt 命令转储内核进程块的 DirectoryTableBase 字段值时，
可以接上 EPROCESS 结构起始虚拟地址的原因。这个字段也指向 0x00185000，上下文切换就涉及把目标进程的 DirectoryTableBase 值
装载到 CR3 中）

2017-6-3 17:44

shayi 9: 4 楼

Because Windows provides a private address space for each process, each process has its own
page directory and page tables to map that process’s private address space.
However, the page tables
that describe system space are shared among all processes (and session space is shared only among
processes in a session).
To avoid having multiple page tables describing the same virtual memory,
when a process is created, the page directory entries that describe system space are initialized to
point to the existing system page tables.
If the process is part of a session, session space page tables
are also shared by pointing the session space page directory entries to the existing session page
tables.
由于 Windows 为每个进程提供了各自私有的地址空间，每个进程就有自己的页目录和页表来映射自身的私有地址空间。

然而，描述系统空间的页表在所有进程间共享（而“会话空间”仅在属于该会话的进程间共享）。
为了避免有多个页表描述同一个虚拟内存区域，创建进程时，描述系统空间的页目录项被初始化为指向现有的系统页表。
如果进程是会话的一部分，也通过把会话空间页目录项指向现有的会话页表，来共享它们。

Page Tables and Page Table Entries

Each page directory entry points to a page table. A page table is a simple array of PTEs. The virtual
address’s page table index field (as shown in Figure 10-18) indicates which PTE within the page table
corresponds to and describes the data page in question.
The page table index is 10 bits wide, allowing you to reference up to 1,024 4-byte PTEs.

Of course, because x86 provides a 4-GB virtual address
space, more than one page table is needed to map the entire address space.
To calculate the number of page tables required to map the entire 4-GB virtual address space, divide 4 GB by the
virtual memory mapped by a single page table.
Recall that each page table on an x86 system maps 4 MB of
data pages. Thus, 1,024 page tables (4 GB / 4 MB) are required to map the full 4-GB address space.
This corresponds with the 1,024 entries in the page directory.

页表与页表项

每个页目录项都指向一张页表。页表就是一个由 PTE 构成的数组。虚拟地址中的“页表索引”字段（请回顾图 10-18）指出它对应于
页表内的哪一个 PTE，而后者描述了与其相关的数据页。“页表索引”的宽度是 10 位，让它最多能够引用 1,024 个四字节长度的 PTE。
当然，由于 x86 提供了一个 4GB 的虚拟地址空间，这就需要不止一张页表，才能够映射完整个地址空间。
为了计算映射整个 4GB 虚拟地址空间所需的页表数量，可以把 4GB 除以由单张页表映射的虚拟内存范围。
回想一下：x86 系统上的每张页表映射 4 MB 范围的数据页。因此，需要 1,024 张页表（4 GB / 4 MB）来映射整个 4GB 地址空间。
这就对应了页目录中的 1,024 个“项”（即 PDE）。
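
The division above is easy to check mechanically. The snippet restates the non-PAE x86 figures from the text as constants (the constant names are chosen for this example) and also shows a side fact that follows from them: a full page table occupies exactly one page.

```python
PAGE_SIZE = 4096            # bytes per page
PTE_SIZE = 4                # bytes per PTE on non-PAE x86
PTES_PER_TABLE = 1 << 10    # 10-bit page table index
ADDRESS_SPACE = 1 << 32     # 4 GB of virtual address space

bytes_mapped_per_table = PTES_PER_TABLE * PAGE_SIZE      # 4 MB
tables_needed = ADDRESS_SPACE // bytes_mapped_per_table  # 4 GB / 4 MB
table_size = PTES_PER_TABLE * PTE_SIZE                   # bytes per table

print(bytes_mapped_per_table // (1 << 20), "MB per page table")
print(tables_needed, "page tables,", table_size, "bytes each")
```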

You can use the !pte command in the kernel debugger to examine PTEs. (See the experiment
“Translating Addresses.”) We’ll discuss valid PTEs here and invalid PTEs in a later section. Valid PTEs
have two main fields: the page frame number (PFN) of the physical page containing the data or of the
physical address of a page in memory, and some flags that describe the state and protection of the
page, as shown in Figure 10-19.
你可以使用内核调试器的“!pte”命令查看 PTE。（参考实验：“地址翻译”）
此处我们将讨论有效的 PTE，在后续部分讨论无效的 PTE。有效 PTE 有两个主要字段：一个是包含数据的物理页的页框号（PFN），
亦即内存中某个页面的物理地址；另一个则是描述该页面状态和保护信息的一些标志位，如图 10-19 所示。

As you’ll see later, the bits labeled “Software field” and “Reserved” in Figure 10-19 are ignored by
the MMU, whether or not the PTE is valid. These bits are stored and interpreted by the memory manager.
Table 10-11 briefly describes the hardware-defined bits in a valid PTE.
正如您稍后将要看到的，CPU 的内存管理单元（MMU）会忽略掉图 10-19 中，被标记为“Software field”和 “Reserved”的位，
无论这个 PTE 是否有效。图 10-19 中的这些位由内存管理器存储并解释。表 10-11 将一个有效 PTE 中，那些硬件定义的位作了简短
介绍。

On x86 systems, a hardware PTE contains two bits that can be changed by the MMU, the Dirty bit
and the Accessed bit. The MMU sets the Accessed bit whenever the page is read or written (provided
it is not already set).
The MMU sets the Dirty bit whenever a write operation occurs to the page.
The operating system is responsible for clearing these bits at the appropriate times; they are never
cleared by the MMU.
在 x86 系统上，硬件 PTE 包含两个可由 MMU 更改的比特位，即“脏位”和“访问位”。每当该页面被读取或写入时，MMU 就会设置
访问位（只要它未被设置）。
每当该页面发生一次写操作时，MMU 就会设置脏位。操作系统负责在适当的时间清除这些位；MMU 从不清除它们。

The x86 MMU uses a Write bit to provide page protection. When this bit is clear, the page is read-only;
when it is set, the page is read/write.
If a thread attempts to write to a page with the Write bit
clear, a memory management exception occurs, and the memory manager’s access fault handler (described
later in the chapter) must determine whether the thread can be allowed to write to the page
(for example, if the page was really marked copy-on-write) or whether an access violation should be
generated.
x86 MMU 借助“Write bit”来提供页面保护机制。当此位清 0 时，对应的页面是只读的；当此位置 1 时，对应的页面可读写。
如果一个线程试图向“Write bit”清 0 的页面写入，就会引发一次内存管理异常，而内存管理器的“访问错误处理程序”（
本章稍后会讲到）必须确定：该线程是否被允许写入这个页面（比如，目标页面确实标记了“写时复制”），或者是否应该产生
一个“非法访问”异常。
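
The behaviour of the Accessed, Dirty, and Write bits can be mimicked in a few lines. This is a simplified model rather than real MMU logic: it only covers accesses to a valid page, the bit positions are the standard x86 ones (Write = bit 1, Accessed = bit 5, Dirty = bit 6), and `WriteFault` stands in for the memory management exception that the access fault handler would then examine.

```python
VALID = 1 << 0
WRITE = 1 << 1
ACCESSED = 1 << 5
DIRTY = 1 << 6


class WriteFault(Exception):
    """Write attempted while the hardware Write bit is clear."""


def mmu_access(pte, is_write):
    """Return the PTE as the MMU would leave it after one access."""
    if is_write and not pte & WRITE:
        # The handler must decide: copy-on-write, or access violation.
        raise WriteFault()
    pte |= ACCESSED            # set on any read or write
    if is_write:
        pte |= DIRTY           # set on writes only
    return pte                 # the MMU never clears these bits


writable = (0x3EF8C << 12) | VALID | WRITE
after_read = mmu_access(writable, is_write=False)
after_write = mmu_access(after_read, is_write=True)
print(hex(after_read), hex(after_write))
```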

2017-6-3 17:50

shayi 9: 5 楼

Hardware vs. Software Write Bits in Page Table Entries

The additional Write bit implemented in software (as mentioned in Table 10-11) is used to force
updating of the Dirty bit to be synchronized with updates to Windows memory management data.
In a simple implementation, the memory manager would set the hardware Write bit (bit 1) for any
writable page, and a write to any such page will cause the MMU to set the Dirty bit in the page table
entry. Later, the Dirty bit will tell the memory manager that the contents of that physical page must
be written to backing store before the physical page can be used for something else.

PTE 中的硬写位（Hardware Write Bits）与软写位（Software Write Bits）

在软件中实现的附加“Write bit”（如表 10-11 所示）用来强制更新“脏位”，使它能与 Windows 内存管理器数据的更新同步。
在一个简单的实现逻辑中，内存管理器将会给任何可写页面设置硬写位（PTE 中的位 1），而向任何此类页面的写入将导致 MMU 设置
此 PTE 中的“脏位”。
随后，置 1 的“脏位”将告知内存管理器：必须把该物理页面的内容写入后备存储（在此页面可用于记录其它数据之前）。

In practice, on multiprocessor systems, this can lead to race conditions that are expensive to
resolve. The MMUs of the various processors can, at any time, set the Dirty bit of any PTE that has its
hardware Write bit set. The memory manager must, at various times, update the process working set
list to reflect the state of the Dirty bit in a PTE. The memory manager uses a pushlock to synchronize
access to the working set list. But on a multiprocessor system, even while one processor is holding the
lock, the Dirty bit might be changed by MMUs of other CPUs. This raises the possibility of missing an
update to a Dirty bit.
在实践中，对于多处理系统，这可能导致竞争条件，解决它的代价很昂贵。
各处理器的 MMU 们，都可以在任何时候设置任何 PTE 的“脏位”（它们的硬写位已置 1）。相应地，内存管理器就必须在不同时间
更新进程的工作集列表，以反映该 PTE 中“脏位”的状态。
（译注：有效 PTE 引用的进程页面驻留在物理内存中，它们合称工作集。工作集列表是内存管理器用来跟踪这些页面的数据结构，
因此 PTE 中“脏位”的变化需要反映到该列表中对应的条目上）
内存管理器使用“推锁”来同步对工作集列表的访问。然而在多处理器系统上，即便当一个处理器持有该锁，其它 CPU 的 MMU 们，
也可以修改同一个 PTE 的“脏位”。这就可能会错过对“脏位”的相应更新。

To avoid this, the Windows memory manager initializes both read-only and writable pages with
the hardware Write bit (bit 1) of their PTEs set to 0 and records the true writable state of the page
in the software Write bit (bit 11).
On the first write access to such a page, the processor will raise a
memory management exception because the hardware Write bit is clear, just as it would be for a true
read-only page.
In this case, though, the memory manager learns that the page actually is writable
(via the software Write bit), acquires the working set pushlock, sets the Dirty bit and the hardware
Write bit in the PTE, updates the working set list to note that the page has been changed, releases the
working set pushlock, and dismisses the exception.
The hardware write operation then proceeds as
usual, but the setting of the Dirty bit is made to happen with the working set list pushlock held.
为了避免这种情况，Windows 内存管理器就把描述只读和可写页的那些 PTE 中的硬写位（位 1），都初始化成 0，然后在软写位（
位 11）中，记录该页面的实际可写状态。
首次向此类页面执行写访问时，处理器将会引发一次内存管理异常，因为硬写位是清除的，仿佛一个真正的只读页那样。
然而在这种情况下，内存管理器通过软写位获悉该页面实际上是可写的，它就会取得工作集推锁，设置 PTE 中的“脏位”和硬写位，
更新工作集列表以记录该页面已被修改，释放工作集推锁，并解除该异常。
接着硬件写操作照常进行，而“脏位”的设置是在持有工作集列表推锁的情况下完成的。
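
The first-write sequence described above can be sketched as follows. Everything here is a simplified stand-in: a `threading.Lock` plays the working set pushlock, a Python set plays the working set list, and a truly read-only page simply raises `AccessViolation` (copy-on-write is left out). The bit numbers for the hardware and software Write bits (1 and 11) come from the text.

```python
import threading

VALID = 1 << 0
HW_WRITE = 1 << 1     # hardware Write bit (bit 1)
DIRTY = 1 << 6
SW_WRITE = 1 << 11    # software Write bit (bit 11)

working_set_lock = threading.Lock()   # stand-in for the pushlock
modified_pages = set()                # stand-in for working set list state


class AccessViolation(Exception):
    pass


def make_pte(pfn, writable):
    """New PTEs start with the hardware Write bit clear."""
    pte = (pfn << 12) | VALID
    if writable:
        pte |= SW_WRITE               # true writability is recorded here
    return pte


def handle_write_fault(pte):
    """Fault handler for a write to a page whose hardware Write bit is 0."""
    if not pte & SW_WRITE:
        raise AccessViolation()       # genuinely read-only page
    with working_set_lock:
        # Dirty is set while the lock is held, so the update to the
        # working set list cannot miss it.
        pte |= DIRTY | HW_WRITE
        modified_pages.add(pte >> 12)
    return pte


pte = handle_write_fault(make_pte(0x3EF8C, writable=True))
print(hex(pte))
```

With the User bit (bit 2) added, the result is 0x3EF8C847, the same value the `!pte 10004` output earlier in the thread shows for a page that has already been written to.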

On subsequent writes to the page, no exceptions occur because the hardware Write bit is set.
The MMU will redundantly set the Dirty bit, but this is benign because the “written-to” state of the page is
already recorded in the working set list.
Forcing the first write to a page to go through this exception handling may seem to be excessive overhead.
However, it happens only once per writable page as long as the page remains valid.
Furthermore, the first access to almost any page already goes through
memory management exception handling because pages are usually initialized in the invalid state
(PTE bit 0 is clear).
If the first access to a page is also the first write access to the page, the Dirty bit
handling just described will occur within the handling of the first-access page fault, so the additional
overhead is small.
Finally, on both uniprocessor and multiprocessor systems, this implementation allows
flushing of the translation look-aside buffer (described later) without holding a lock for each page
being flushed.

后续写入该页面时，就不会出现异常，因为设置了硬写位。
而 MMU 还是会再次设置“脏位”，尽管这是多余的，但也无伤大雅，因为该页面的“已写入”状态已经记录在工作集列表中。
强制让首次写入页面就经历这个异常处理，看似开销过大。然而，只要页面仍旧有效，它对于每个可写页面只发生一次。
此外，对几乎任何页面的首次访问都已经历了内存管理异常处理，因为页面通常初始化为无效状态（PTE 中的位 0 被清除）。
如果对某个页面的首次访问同时也是首次写访问，前述的“脏位”处理就会在处理首次访问页错误的过程中一并完成，由此引入的额外开销很小。
（译注：这种情况下的流程总结：首次写访问 -> 页无效 -> 处理页错误，同时完成“脏位”和硬写位的设置 -> 后续顺畅写入）
最后，在单处理器和多处理器系统上，此一实现逻辑都允许在冲刷转换后备缓冲区（即 TLB，稍后讨论）时，
无需为每个被冲洗的页面都持有一把锁。

Byte Within Page

Once the memory manager has determined the physical page number, it must locate the requested
data within that page. This is the purpose of the byte offset field. The byte offset from the original
virtual address is simply copied to the corresponding field in the physical address. On x86 systems,
the byte offset is 12 bits wide, allowing you to reference up to 4,096 bytes of data (the size of a page).
Another way to interpret this is that the byte offset from the virtual address is concatenated to the
physical page number retrieved from the PTE. This completes the translation of a virtual address to a
physical address.

页内字节

一旦内存管理器确定出物理页号，它必须在该页内找到所请求的数据。这就是“字节偏移”域的作用。
就是简单地把原始虚拟地址中的“字节偏移”复制到物理地址中的对应字段。
在 x86 系统上，“字节偏移”为 12 位宽，这允许你引用最多 4,096 字节的数据（即一张页的大小）。
另一种解释为：从该 PTE 中检索出物理页号，接着把虚拟地址中的“字节偏移”字段串联其后。
这就完成了从虚拟地址到物理地址的转换。
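
The two readings of this last step (copy the offset, or concatenate it to the PFN) give the same number, as this small check shows. The function name is made up for the example; the PFN 0x3ef8c and the address 0x10004 come from the `!pte 10004` output earlier in the thread.

```python
PAGE_SHIFT = 12
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def physical_address(pfn, va):
    """Concatenate the PFN with the byte offset of the virtual address."""
    return (pfn << PAGE_SHIFT) | (va & OFFSET_MASK)


pa = physical_address(0x3EF8C, 0x00010004)
print(hex(pa))

# The low 12 bits of (pfn << 12) are zero, so OR and addition agree.
same_by_addition = (0x3EF8C << PAGE_SHIFT) + (0x00010004 & OFFSET_MASK)
```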

2017-6-3 17:56

shayi 9: 6 楼

Translation Look-Aside Buffer

As you’ve learned so far, each hardware address translation requires two lookups: one to find the right
entry in the page directory (which provides the location of the page table) and one to find the right
entry in the page table.
Because doing two additional memory lookups for every reference to a virtual
address would triple the required bandwidth to memory, resulting in poor performance, all CPUs
cache address translations so that repeated accesses to the same addresses don’t have to be repeatedly
translated.
This cache is an array of associative memory called the translation look-aside buffer, or
TLB. Associative memory is a vector whose cells can be read simultaneously and compared to a target
value.
In the case of the TLB, the vector contains the virtual-to-physical page mappings of the most
recently used pages, as shown in Figure 10-20, and the type of page protection, size, attributes, and
so on applied to each page.
Each entry in the TLB is like a cache entry whose tag holds portions of the
virtual address and whose data portion holds a physical page number, protection field, valid bit, and
usually a dirty bit indicating the condition of the page to which the cached PTE corresponds.
If a PTE’s global bit is set (as is done by Windows for system space pages that are visible to all processes), the
TLB entry isn’t invalidated on process context switches.

转换后援缓冲区

正如你迄今所了解的，每个硬件地址翻译要求两次查询：第一次在页目录中找出正确的项（它给出页表的位置），第二次在页表
中找到正确的项。
由于为每次虚拟地址引用都执行两次额外的内存查找，会使所需的内存带宽增至原来的三倍，导致性能低下，所有 CPU 都会缓存地址翻译
的结果，使得重复访问相同地址无需屡次转换。
这个缓存是由“相联存储器”构成的数组，它叫做“转换后援缓冲区”，或简称 TLB。相联存储器是一个“向量”，它内部的单元可被
同时地读取，并与目标值进行比较。（译注：硬件 TLB 通常集成在 CPU 的一级高速缓存内部，因此不要把作者提到的相联“存储器” 与主存储器/内存
混淆，它们有着本质上的区别。）
就 TLB 而言，该向量包含最近使用过页面的虚拟-物理页映射，以及应用至每个页面的页保护类型、大小、属性等信息，
如图 10-20 所示。

（译注：我说一些废话：既然有区分指令 L1 cache 与数据 L1 cache，它们各自就集成了指令 TLB 与数据 TLB，分别用来缓存
某指令所在的虚拟-物理页映射，以及某数据所在的虚拟-物理页映射，从而加速 CPU 取出指令与数据——以 Intel 的“Core”核心架构为例）
TLB 中的每一项类似于一个缓存条目，其“标签”部分持有虚拟地址的一部分，其“数据”部分持有物理页号，保护字段，有效位，通常
还有“脏位”，指出与缓存的 PTE 对应的页面处于何种状况。
假设一个 PTE 中的“全局”位被设置了（就像 Windows 为所有进程可见/共享的系统空间页面所设定的那样），相应的 TLB 条目
就不会在进程上下文切换期间（译注：根据前文，还有“冲刷”TLB 时）失效。
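
A toy model can show how a TLB lookup works and why global entries survive a process switch. A real TLB is associative hardware that compares all tags at once; the dict below only imitates the result. The class and method names are invented for this sketch, and the system-space address and PFN (0x80001000, 0x123) are made-up sample values.

```python
class Tlb:
    """Maps a virtual page number to (PFN, is_global)."""

    def __init__(self):
        self.entries = {}

    def fill(self, va, pfn, is_global=False):
        """Cache a translation after a page table walk."""
        self.entries[va >> 12] = (pfn, is_global)

    def lookup(self, va):
        """Return the physical address on a hit, or None on a miss."""
        hit = self.entries.get(va >> 12)
        if hit is None:
            return None
        pfn, _ = hit
        return (pfn << 12) | (va & 0xFFF)

    def context_switch(self):
        """Process switch: drop every entry except the global ones."""
        self.entries = {vpn: entry for vpn, entry in self.entries.items()
                        if entry[1]}


tlb = Tlb()
tlb.fill(0x00010004, 0x3EF8C)                  # private user page
tlb.fill(0x80001000, 0x123, is_global=True)    # system-space page
before = tlb.lookup(0x00010004)
tlb.context_switch()
print(before, tlb.lookup(0x00010004), tlb.lookup(0x80001000))
```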

Virtual addresses that are used frequently are likely to have entries in the TLB, which provides
extremely fast virtual-to-physical address translation and, therefore, fast memory access.
If a virtual address isn’t in the TLB, it might still be in memory, but multiple memory accesses are needed to find
it, which makes the access time slightly slower.
If a virtual page has been paged out of memory or if
the memory manager changes the PTE, the memory manager is required to explicitly invalidate the
TLB entry.
If a process accesses it again, a page fault occurs, and the memory manager brings the
page back into memory (if needed) and re-creates its PTE entry (which then results in an entry for it
in the TLB).
频繁用到的虚拟地址很可能在 TLB 中有对应的条目，这就提供了极快的虚拟-物理地址转换过程，故而快速的内存访问（译注：只需
一次物理内存访问即可取出指令或数据，这就是为啥无论在所需时间和带宽上，都只有“循规蹈矩”地址翻译的三分之一。）
如果某个虚拟地址在 TLB 中没有对应的项，它可能仍在内存中，但需要多次内存访问才能找到，使得访问时间稍慢一些（与
TLB 命中相比）。
如果虚拟页已经被换出主存储器，或者内存管理器变更了 PTE，它就需要显式地让相应的 TLB 条目失效。（译注：这很好理解：
PTE 描述的物理页变了，原 TLB 中记录的当然是错误的转换结果）
当一个进程再次访问已经被换出的虚拟页时，就会发生一次页错误，然后内存管理器把该页面换回内存（如有必要），并重建它的
PTE 项（而后导致在 TLB 中缓存相应的条目）。
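
The need for explicit invalidation can be illustrated with two dicts, one standing in for the page table and one for the TLB. This is a sketch under simplifying assumptions: pages are keyed by virtual page number, and the fault is raised rather than resolved. The point is the `tlb.pop` call in `page_out`; without it, a stale translation would keep answering for a page that is no longer resident.

```python
page_table = {0x10: 0x3EF8C}   # virtual page number -> PFN (valid pages)
tlb = {}                       # cached copies of those translations


class PageFault(Exception):
    pass


def access(va):
    """Translate va, filling the TLB on a miss."""
    vpn = va >> 12
    if vpn not in tlb:
        if vpn not in page_table:
            raise PageFault(hex(va))   # memory manager must bring it back
        tlb[vpn] = page_table[vpn]     # slower path: walk, then cache
    return (tlb[vpn] << 12) | (va & 0xFFF)


def page_out(vpn):
    """Remove a page from memory and invalidate its TLB entry."""
    page_table.pop(vpn, None)
    tlb.pop(vpn, None)                 # explicit invalidation


first = access(0x10004)    # miss, then cached
second = access(0x10004)   # served from the TLB
page_out(0x10)
```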

2017-6-3 18:01

库尔: 7 楼

虽然已经知道内存管理原理，不过还是来看看漏掉什么

2017-6-3 19:37

zgzxp: 8 楼

膜拜中，翻译专业书籍是个巨大的工程啊，谢谢

2017-6-3 21:44

fengyunabc 1: 9 楼

楼主辛苦了！

2017-6-3 23:06

MaYil: 10 楼

感谢翻译, 辛苦了

2017-6-4 01:26

ugvjewxf: 11 楼

先感谢下，慢慢看，慢慢学，主要学习英文，

2017-6-4 05:44

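shayi 9: #12

Let's look at how the TLB is laid out inside a real CPU package, taking the Intel Core i7 as an example. As the figure shows, the TLB is split into instruction and data parts and, like the hardware caches, is hierarchical. Note that the L1 and L2 TLBs are virtually addressed, whereas the L1 to L3 caches are physically addressed. To translate a virtual address, the CPU sends the virtual page number (VPN) to the MMU and the virtual page offset (VPO) to the L1 cache. The two are sent in parallel, so the TLB lookup for the physical page number (PPN) overlaps with the L1 cache lookup keyed by the offset, which is identical to the physical page offset (PPO). PPN + PPO gives the physical address. Depending on the case, the hardware may instead proceed as the translated text above describes: first copy the byte offset from the virtual address into the corresponding field of the physical address, then use the resulting physical address to look for a matching entry in the L1 cache. This sequential lookup is "a few" clock cycles slower than the parallel one. If every level of the TLB misses, or every cache level misses, the lookup has to go to memory, which is "many" clock cycles slower.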

2017-6-5 20:17

shayi 9: #13

Physical Address Extension (PAE)

The Intel x86 Pentium Pro processor introduced a memory-mapping mode called Physical Address
Extension (PAE). With the proper chipset, the PAE mode allows 32-bit operating systems access to up
to 64 GB of physical memory on current Intel x86 processors (up from 4 GB without PAE) and up to
1,024 GB of physical memory when running on x64 processors in legacy mode (although Windows
currently limits this to 64 GB due to the size of the PFN database required to describe so much
memory). When the processor is running in PAE mode, the memory management unit (MMU) divides
virtual addresses mapped by normal pages into four fields, as shown in Figure 10-21. The MMU still
implements page directories and page tables, but under PAE a third level, the page directory pointer
table, exists above them.

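A memory-mapping mode called Physical Address Extension (PAE) was introduced with the Intel x86 Pentium Pro processor.
With the proper chipset, PAE mode lets a 32-bit operating system access up to 64 GB of physical memory on current Intel x86 processors (4 GB without PAE),
or up to 1,024 GB when running on x64 processors in legacy mode (although Windows currently caps this at 64 GB, because the PFN database needed to describe that much memory would itself become too large).
(Translator's note: do not confuse this with the native addressing limits of x64 processors, which are not governed by PAE but by the number of available address bus bits. See the section "x64 Virtual Addressing Limitations" in part 3 of my translation.)
When the processor runs in PAE mode, its memory management unit (MMU) divides virtual addresses mapped by normal pages into four fields, as shown in Figure 10-21.
The MMU still implements page directories and page tables, but under PAE the structure has three levels, with the page directory pointer table (PDPT) at the top.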

One way in which 32-bit applications can take advantage of such large memory configurations is
described in the earlier section “Address Windowing Extensions.” However, even if applications are
not using such functions, the memory manager will use all available physical memory for multiple
processes’ working sets, file cache, and trimmed private data through the use of the system cache,
standby, and modified lists (described in the section “Page Frame Number Database”).

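As the earlier section "Address Windowing Extensions" explained, AWE is one way for 32-bit applications to take advantage of such large memory configurations. However, even if applications do not use such functions, the memory manager will use all available physical memory for multiple processes' working sets, file cache, and trimmed private data, by way of the system cache, standby, and modified lists.
(These lists are described in the section "Page Frame Number Database.")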

PAE mode is selected at boot time and cannot be changed without rebooting. As explained in
Chapter 2 in Part 1, there is a special version of the 32-bit Windows kernel with support for PAE
called Ntkrnlpa.exe. Thirty-two-bit systems that have hardware support for nonexecutable memory
(described earlier, in the section “No Execute Page Protection”) are booted by default using this PAE
kernel, because PAE mode is required to implement the no-execute feature. To force the loading of
the PAE-enabled kernel, you can set the pae BCD option to ForceEnable.

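PAE mode is selected at boot time and cannot be changed without rebooting.
As explained in Chapter 2 of Part 1, there is a special version of the 32-bit Windows kernel image with PAE support, Ntkrnlpa.exe.
32-bit systems with hardware support for nonexecutable memory (described earlier, in the section "No Execute Page Protection") boot with this PAE kernel by default, because PAE mode is required to implement the no-execute feature.
To force the PAE-enabled kernel to load, set the pae BCD option to ForceEnable. (Translator's note: the authors lean heavily on cross-references to earlier material, so see the relevant sections in part 1 of my translation for the full picture.)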

Note that the PAE kernel is installed on the disk on all 32-bit Windows systems, even systems with
small memory and without hardware no-execute support. This is to allow testing of PAE-related code,
even on small memory systems, and to avoid the need for reinstalling Windows should more RAM be
added later. Another BCD option relevant to PAE is nolowmem, which discards memory below 4 GB
(assuming you have at least 5 GB of physical memory) and relocates device drivers above this range.
This guarantees that drivers will be presented with physical addresses greater than 32 bits, which
makes any possible driver sign extension bugs easier to find.

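Note that the PAE kernel is installed on disk on all 32-bit Windows systems, even those with little memory and no hardware no-execute support.
This allows PAE-related code to be tested even on small-memory systems, and avoids having to reinstall Windows if more RAM (more than 4 GB in total) is added later.
Another BCD option relevant to PAE is nolowmem, which discards memory below 4 GB (assuming you have at least 5 GB of physical memory) and relocates device drivers above that range. This guarantees that drivers are presented with physical addresses wider than 32 bits, which makes any driver sign-extension bugs easier to find.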

To understand PAE, it is useful to understand the derivation of the sizes of the various structures
and bit fields. Recall that the goal of PAE is to allow addressing of more than 4 GB of RAM. The 4-GB
limit for RAM addresses without PAE comes from the 12-bit byte offset and the 20-bit page frame
number fields of physical addresses: 12 + 20 = 32 bits of physical address, and 2 ^ 32 bytes = 4 GB. (Note
that this is due to a limit of the physical address format and the number of bits allocated for the PFN
within a page table entry. The fact that virtual addresses are 32 bits wide on x86, with or without PAE,
does not limit the physical address space.)
Under PAE, the PFN is expanded to 24 bits. Combined with the 12-bit byte offset, this allows addressing
of 2 ^ (24 + 12) bytes, or 64 GB, of memory.

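Understanding how the sizes of the various structures and bit fields are derived helps in understanding PAE.
Recall that the goal of PAE is to allow addressing of more than 4 GB of RAM. Without PAE, the 4 GB limit on RAM addresses comes from the 12-bit byte offset and the 20-bit PFN field of a physical address: 12 + 20 = 32 bits of physical address, and 2 ^ 32 bytes = 4 GB.
(Note that this limit comes from the physical address format and the number of bits allocated to the PFN within a page table entry. The fact that virtual addresses are 32 bits wide on x86, with or without PAE, does not limit the physical address space.)
Under PAE, the PFN is expanded to 24 bits. Combined with the 12-bit byte offset, this allows addressing of 2 ^ (24 + 12) bytes, or 64 GB, of physical memory.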
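
The arithmetic above is easy to check. A minimal sketch, where `addressable_gb` is an illustrative helper name; the 28-bit case is only my inference from the 1,024 GB figure quoted earlier for x64 legacy mode, not a width the text states.

```python
PAGE_OFFSET_BITS = 12  # 4,096-byte pages


def addressable_gb(pfn_bits):
    """Physical memory reachable with a PFN of the given width, in GB (2 ^ 30 bytes)."""
    return 2 ** (pfn_bits + PAGE_OFFSET_BITS) // 2 ** 30


non_pae_gb = addressable_gb(20)      # 20-bit PFN, non-PAE
pae_gb = addressable_gb(24)          # 24-bit PFN, PAE on x86
legacy_x64_gb = addressable_gb(28)   # assumed 28-bit PFN (inferred, see above)
print(non_pae_gb, pae_gb, legacy_x64_gb)
```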

To provide the 24-bit PFN, PAE expands the PFN fields of page table and page directory entries
from 20 to 24 bits. To allow room for this expansion, the page table and page directory entries are
8 bytes wide instead of 4. (This would seem to expand the PFN field of the PTE and PDE by 32 bits
rather than just 4, but in x86 processors, PFNs are limited to 24 bits. This does leave a large number of
bits in the PDE unused—or, rather, available for future expansion.)
Since both page tables and page directories have to fit in one page, these tables can then have
only 512 entries instead of 1,024. So the corresponding index fields of the virtual address are accordingly
reduced from 10 to 9 bits.

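To provide the 24-bit PFN, PAE expands the PFN fields of page table and page directory entries from 20 to 24 bits.
To make room for this expansion, page table and page directory entries are 8 bytes wide instead of 4.
(This might seem to allow the PFN field of the PTE and PDE to grow by 32 bits rather than just 4, but on x86 processors PFNs are limited to 24 bits. This leaves a large number of bits in the PDE unused or, rather, available for future expansion.)
Since a page table and a page directory each have to fit in one page, these tables can hold only 512 entries instead of 1,024.
(Translator's note: each entry is now 8 bytes, so one 4,096-byte page holds only 512 of them.)
So the page directory index and page table index fields of the virtual address shrink from 10 to 9 bits (2 ^ 9 = 512).
(Translator's note: see Figure 10-21. The page directory pointer index takes 2 bits; together with the two 9-bit indexes that makes 20 bits, the same width as in non-PAE mode.)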
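
The entry counts and index widths follow mechanically from the entry size. A small sketch; the helper name `table_geometry` is made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; each table must fit in one page


def table_geometry(entry_bytes):
    """Return (entries per table, index bits needed) for a given entry size."""
    entries = PAGE_SIZE // entry_bytes
    index_bits = entries.bit_length() - 1   # log2, valid because entries is a power of two
    return entries, index_bits


non_pae = table_geometry(4)   # 4-byte entries
pae = table_geometry(8)       # 8-byte entries

# Bits of a 32-bit virtual address left over after the byte offset
# and the two PAE table indexes.
leftover_bits = 32 - 12 - 2 * pae[1]
print(non_pae, pae, leftover_bits)
```

The two leftover bits are exactly the ones the page directory pointer index consumes.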

This then leaves the two high-order bits of the virtual address unaccounted for. So PAE expands
the number of page directories from one to four and adds a third-level address translation table,
called the page directory pointer table, or PDPT. This table contains only four entries, 8 bytes each,
which provide the PFNs of the four page directories. The two high-order bits of the virtual address are
used to index into the PDPT and are called the page directory pointer index.

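That leaves the two high-order bits of the virtual address unaccounted for.
So PAE expands the number of page directories from one to four and adds a third-level address translation table, the page directory pointer table, or PDPT.
It contains only four entries of 8 bytes each, which supply the PFNs of the four page directories.
The two high-order bits of the virtual address index into the PDPT and are called the page directory pointer index.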
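
As a sanity check, the three levels together should still cover exactly the 4 GB virtual address space. A sketch using only the sizes given above; the constant names are mine.

```python
PAGE_SIZE = 4096
ENTRIES_PER_TABLE = 512   # page tables and page directories under PAE
PDPT_ENTRIES = 4
ENTRY_BYTES = 8

bytes_per_page_table = ENTRIES_PER_TABLE * PAGE_SIZE                  # one PDE maps 2 MB
bytes_per_page_directory = ENTRIES_PER_TABLE * bytes_per_page_table   # one PDPT entry maps 1 GB
total_virtual_bytes = PDPT_ENTRIES * bytes_per_page_directory         # whole PDPT maps 4 GB
pdpt_bytes = PDPT_ENTRIES * ENTRY_BYTES                               # size of the PDPT itself

print(bytes_per_page_table, bytes_per_page_directory, total_virtual_bytes, pdpt_bytes)
```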

As before, CR3 provides the location of the top-level table, but that is now the PDPT rather than
the page directory. The PDPT must be aligned on a 32-byte boundary and must furthermore reside in
the first 4 GB of RAM (because CR3 on x86 is only a 32-bit register, even with PAE enabled).
Note that PAE mode can address more memory than the standard translation mode not directly
because of the extra level of translation, but because the physical address format has been expanded.
The extra level of translation is required to allow processing of all 32 bits of a virtual address.

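CR3 still gives the location of the top-level table, which is now the PDPT rather than the page directory.
The PDPT must be aligned on a 32-byte boundary and must also reside in the first 4 GB of physical memory.
(This is because CR3 on x86 is only a 32-bit register, whether or not PAE is enabled.)
Note that PAE mode can address more memory than the standard translation mode not directly because of the extra level of translation, but because the physical address format has been expanded.
The extra level of translation is needed so that all 32 bits of a virtual address can be processed.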
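
The two constraints on the PDPT location can be written as a simple predicate. This is an illustrative check, not anything Windows exposes; `is_valid_pdpt_address` is a hypothetical name, and the first sample value is the DirBase shown in the experiment in this thread.

```python
def is_valid_pdpt_address(phys):
    """True if phys is 32-byte aligned and lies below 4 GB."""
    return phys % 32 == 0 and phys < 2 ** 32


ok = is_valid_pdpt_address(0xCED25440)            # DirBase from the experiment
misaligned = is_valid_pdpt_address(0xCED25448)    # only 8-byte aligned
too_high = is_valid_pdpt_address(0x100000000)     # at the 4 GB boundary
print(ok, misaligned, too_high)
```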

2017-6-9 17:44

shayi 9: #14

EXPERIMENT: Translating Addresses

To clarify how address translation works, this experiment shows a real example of translating
a virtual address on an x86 PAE system, using the available tools in the kernel debugger
to examine the PDPT, page directories, page tables, and PTEs. (It is common for Windows on
today’s x86 processors, even with less than 4 GB of RAM, to run in PAE mode because PAE
mode is required to enable no-execute memory access protection.) In this example, we’ll work
with a process that has virtual address 0x30004, currently mapped to a valid physical address. In
later examples, you’ll see how to follow address translation for invalid addresses with the kernel
debugger.
First let’s convert 0x30004 to binary and break it into the three fields that are used to translate
an address. In binary, 0x30004 is 11.0000.0000.0000.0100. Breaking it into the component
fields yields the following:

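This experiment walks through a real example of translating a virtual address on an x86 PAE system, to make clear how address translation works, using the tools (commands) available in the kernel debugger to examine the PDPT, page directories, page tables, and PTEs.
(As noted earlier, and the authors repeat themselves even more than I do, it is common for Windows on today's x86 processors to run in PAE mode even with less than 4 GB of RAM, because PAE mode is required to enable no-execute memory access protection.)
In this example we work with a process whose virtual address 0x30004 is currently mapped to a valid physical address. Later examples show how to follow address translation for invalid addresses with the kernel debugger.
1. First, convert 0x30004 to binary and break it into the fields used to translate an address. In binary, 0x30004 is 11.0000.0000.0000.0100. Breaking it into its component fields gives the following: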
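
The same breakdown can be computed directly. A sketch assuming the 2 + 9 + 9 + 12 bit layout described in the PAE section; `split_pae_va` is a hypothetical helper, not a debugger command.

```python
def split_pae_va(va):
    """Split a 32-bit virtual address into its PAE fields (2 + 9 + 9 + 12 bits)."""
    return {
        "pdpt_index": (va >> 30) & 0x3,
        "pd_index": (va >> 21) & 0x1FF,
        "pt_index": (va >> 12) & 0x1FF,
        "byte_offset": va & 0xFFF,
    }


fields = split_pae_va(0x30004)
print({name: hex(value) for name, value in fields.items()})
```

For 0x30004 the two upper indexes are 0, the page table index is 0x30, and the byte offset is 4.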

To start the translation process, the CPU needs the physical address of the process’s page
directory pointer table, found in the CR3 register while a thread in that process is running. You
can display this address by looking at the DirBase field in the output of the !process command,
as shown here:

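To start the translation, the CPU needs the physical address of the process's page directory pointer table, which is held in the CR3 register while a thread of that process is running. You can see this address in the DirBase field of the !process command output, as shown here: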

lkd> !process -1 0
PROCESS 852d1030	SessionId: 1	Cid: 0dec		Peb: 7ffdf000	ParentCid: 05e8
	DirBase: ced25440	ObjectTable: a2014a08	HandleCount: 221.
	Image: windbg.exe

The DirBase field shows that the page directory pointer table is at physical address
0xced25440. As shown in the preceding illustration, the page directory pointer table index field
in our example virtual address is 0. Therefore, the PDPT entry that contains the physical address
of the relevant page directory is the first entry in the PDPT, at physical address 0xced25440.
As under x86 non-PAE systems, the kernel debugger !pte command displays the PDE and
PTE that describe a virtual address, as shown here:

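2. The DirBase field in the output above shows that the page directory pointer table is at physical address 0xced25440.
As the preceding illustration shows, the page directory pointer table index field of the virtual address is 0. Therefore, the PDPT entry (PDPTE) holding the physical address of the relevant page directory is the first entry in the PDPT, at physical address 0xced25440.
As on x86 non-PAE systems, the kernel debugger !pte command displays the PDE and PTE that describe a virtual address, as shown here: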

lkd> !pte 30004
				VA 00030004
PDE at C0600000				PTE at C0000180
contains 000000002EBF3867		contains 800000005AF4D025
pfn 2ebf3 ---DA--UWEV			pfn 5af4d ----A--UR-V

The debugger does not show the page directory pointer table, but it is easy to display given
its physical address:

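The output above does not show the contents of the page directory pointer table, but the debugger extension command !dq, followed by the table's physical address, displays it: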

lkd> !dq ced25440 L 4
#ced25440 00000000`2e8ff801 00000000`2c9d8801
#ced25450 00000000`2e6b1801 00000000`2e73a801

Here we have used the debugger extension command !dq. This is similar to the dq command
(display as quadwords—“quadwords” being a name for a 64-bit field; this came from the day
when “words” were often 16 bits), but it lets us examine memory by physical rather than virtual
address. Since we know that the PDPT is only four entries long, we added the L 4 length argument
to keep the output uncluttered.

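The !dq command is similar to dq (display as quadwords; "quadword" is a name for a 64-bit field, dating from the days when a "word" was often 16 bits), but it lets us examine memory by physical rather than virtual address.
Since we know the PDPT has only four entries, we added the L 4 length argument to keep the output uncluttered.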

As illustrated previously, the PDPT index (the two most significant bits) from our example
virtual address equal 0, so the PDPT entry we want is the first displayed quadword. PDPT entries
have a format similar to PD entries and PT entries, so we can see by inspection that this one
contains a PFN of 0x2e8ff, for a physical address of 2e8ff000. That’s the physical address of the
page directory.

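3. As the earlier illustration shows, the PDPT index (the two most significant bits) of the virtual address is 0, so the first quadword displayed is the PDPTE we want.
A PDPTE has a format similar to a PDE or PTE, so by inspection this one contains a PFN of 0x2e8ff, for a physical address of 0x2e8ff000, which is the physical address of the page directory. (Translator's note: the low 12 bits, 0x801, are the entry's flag bits.)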
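
Reading the PFN out of an entry "by inspection" amounts to a shift and a mask. A sketch assuming the 24-bit PFN and 12 low flag bits described earlier; `decode_entry` is an illustrative name.

```python
PFN_MASK = 0xFFFFFF   # 24-bit PFN, as described in the PAE section
ENTRY_BYTES = 8


def decode_entry(entry):
    """Return (pfn, flags, physical base address) for a PAE table entry."""
    pfn = (entry >> 12) & PFN_MASK
    flags = entry & 0xFFF
    return pfn, flags, pfn << 12


dirbase = 0xCED25440                              # from !process
pdpt_index = 0                                    # top two bits of 0x30004
pdpte_address = dirbase + pdpt_index * ENTRY_BYTES
pfn, flags, page_directory_base = decode_entry(0x000000002E8FF801)   # first quadword from !dq
print(hex(pdpte_address), hex(pfn), hex(flags), hex(page_directory_base))
```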

The !pte output shows the PDE address as a virtual address, not physical. On x86 systems
with PAE, the first process page directory starts at virtual address 0xC0600000. The page directory
index field of our example virtual address is 0, so we’re looking at the first PDE in the page
directory. Therefore, in this case, the PDE address is the same as the page directory address.

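4. The !pte output above shows the PDE's virtual address, not its physical address.
On x86 systems with PAE, the first page directory of the process starts at virtual address 0xC0600000. As the earlier illustration shows, the page directory index field of the virtual address is 0, so we are looking at the first PDE in the first page directory. In this case, therefore, the PDE address is the same as the page directory address.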

As with non-PAE, the page directory entry provides the PFN of the needed page table; in
this example, the PFN is 0x2ebf3. So the page table starts at physical address 0x2ebf3000. To
this the MMU will add the page table index field (0x30) from the virtual address, multiplied by 8
(the size of a PTE in bytes; this would be 4 on a non-PAE system). The resulting physical address
of the PTE is then 0x2ebf3180.

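5. As with non-PAE, the PDE supplies the PFN of the needed page table; here it is 0x2ebf3. So the page table starts at physical address 0x2ebf3000.
6. Next, the MMU multiplies the page table index field of the virtual address (0x30) by 8 (the size of a PTE in bytes; it would be 4 on a non-PAE system), giving 0x180, and adds that to the page table base. The physical address of PTE number 0x30 in the first page table of the first page directory is therefore 0x2ebf3180.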
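
The index-times-entry-size step is the same at every level. A short sketch with a hypothetical helper, `entry_address`, that also shows the offset the same index would produce with 4-byte non-PAE entries.

```python
def entry_address(table_base, index, entry_bytes=8):
    """Physical address of entry number `index` in a table starting at `table_base`."""
    return table_base + index * entry_bytes


pte_phys = entry_address(0x2EBF3000, 0x30)                 # PAE: 8-byte entries
non_pae_offset = entry_address(0, 0x30, entry_bytes=4)     # same index, 4-byte entries
print(hex(pte_phys), hex(non_pae_offset))
```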

The debugger shows that this PTE is at virtual address 0xC0000180. Notice that the byte
offset portion (0x180) is the same as that from the physical address, as is always the case in
address translation. Because the memory manager maps page tables starting at 0xC0000000,
adding 0x180 to 0xC0000000 yields the virtual address shown in the kernel debugger output:
0xC0000180. The debugger shows that the PFN field of the PTE is 0x5af4d.

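The !pte output shows that this PTE is mapped at virtual address 0xC0000180.
Notice that the byte offset portion of this virtual address (0x180) is the same as in the physical address (0x2ebf3180), as is always the case in address translation.
Because the memory manager maps page tables starting at virtual address 0xC0000000, adding the byte offset 0x180 gives the virtual address shown in the kernel debugger output, 0xC0000180.
7. According to the !pte output, the PFN field of this PTE is 0x5af4d. (The page therefore starts at physical address 0x5af4d000.)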
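
The virtual addresses printed by !pte follow from the two base addresses mentioned in the text. The general formulas below are my extrapolation from those bases and the 8-byte entry size, so treat them as an assumption rather than documented behavior; `pte_va` and `pde_va` are illustrative names.

```python
PTE_BASE = 0xC0000000   # page tables are mapped starting here (from the text)
PDE_BASE = 0xC0600000   # first page directory starts here (from the text)
ENTRY_BYTES = 8


def pte_va(va):
    """Virtual address at which the PTE describing `va` is mapped (assumed formula)."""
    return PTE_BASE + (va >> 12) * ENTRY_BYTES


def pde_va(va):
    """Virtual address at which the PDE describing `va` is mapped (assumed formula)."""
    return PDE_BASE + (va >> 21) * ENTRY_BYTES


print(hex(pte_va(0x30004)), hex(pde_va(0x30004)))
# Consistency check: the PTE that maps the start of the page-table region
# lands exactly at the page directory base.
print(hex(pte_va(PTE_BASE)))
```

Under this assumption the page directories are simply the page tables that map the page-table region itself, which is why the two bases are related.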

Finally, we can consider the byte offset from the original address. As described previously,
the MMU will concatenate the byte offset to the PFN from the PTE, giving a physical address
of 0x5af4d004. This is the physical address that corresponds to the original virtual address of
0x30004—at the moment.

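8. Finally, consider the byte offset from the original address. As described previously, the MMU concatenates the byte offset to the PFN from the PTE, giving the physical address 0x5af4d004. This is the physical address that corresponds, at the moment, to the original virtual address 0x30004.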
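
The whole walk can be replayed in software against the three entries the debugger printed. This is a simplified sketch, not how the MMU or the debugger is implemented: `PHYS_MEM` is a made-up stand-in for physical memory, large pages are ignored, and an invalid entry raises an exception where the hardware would raise a page fault.

```python
# Simulated physical memory: address -> 64-bit entry, taken from the debugger output.
PHYS_MEM = {
    0xCED25440: 0x000000002E8FF801,  # PDPTE 0 (from !dq)
    0x2E8FF000: 0x000000002EBF3867,  # PDE 0 (from !pte)
    0x2EBF3180: 0x800000005AF4D025,  # PTE 0x30 (from !pte)
}
VALID = 0x1
PFN_MASK = 0xFFFFFF   # 24-bit PFN, as the text describes
ENTRY_BYTES = 8


def next_base(entry):
    """Return the physical base address an entry points to, or fail if it is invalid."""
    if not entry & VALID:
        raise LookupError("invalid entry: the real MMU would raise a page fault")
    return ((entry >> 12) & PFN_MASK) << 12


def walk(cr3, va):
    """Translate va through PDPT -> page directory -> page table."""
    pd_base = next_base(PHYS_MEM[cr3 + ((va >> 30) & 0x3) * ENTRY_BYTES])
    pt_base = next_base(PHYS_MEM[pd_base + ((va >> 21) & 0x1FF) * ENTRY_BYTES])
    page_base = next_base(PHYS_MEM[pt_base + ((va >> 12) & 0x1FF) * ENTRY_BYTES])
    return page_base | (va & 0xFFF)   # byte offset is copied through unchanged


result = walk(0xCED25440, 0x30004)
print(hex(result))
```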

The flags bits from the PTE are interpreted to the right of the PFN number. For example,
the PTE that describes the page being referenced has flags of --A--UR-V. Here, A stands for
accessed (the page has been read), U for user-mode accessible (as opposed to kernel-mode
accessible only), R for read-only page (rather than writable), and V for valid (the PTE represents
a valid page in physical memory).

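The PTE's flag bits are interpreted to the right of the PFN. (Translator's note: here they are the low 12 bits, 0x025.) For example, the PTE describing the referenced page has the flags --A--UR-V.
Here A stands for accessed (the page has been read), U for user-mode accessible (as opposed to kernel-mode only), R for read-only (rather than writable), and V for valid (the PTE represents a valid page in physical memory).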
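
The flag letters can be recovered from the raw entries. The bit positions below are the standard x86 ones (valid bit 0, writable bit 1, user bit 2, accessed bit 5, dirty bit 6, no-execute bit 63 under PAE). Mapping them to the debugger's letters, and reading the E in the PDE flags as "executable", is my interpretation of the output above; `decode_flags` is a hypothetical helper.

```python
FLAG_BITS = {"V": 0, "W": 1, "U": 2, "A": 5, "D": 6}   # letter -> bit position
NX_BIT = 63                                            # no-execute, PAE entries only


def decode_flags(entry):
    """Return the list of flag names set in a PAE PTE or PDE."""
    flags = [letter for letter, bit in FLAG_BITS.items() if (entry >> bit) & 1]
    if not (entry >> FLAG_BITS["W"]) & 1:
        flags.append("R")                              # not writable: read-only
    flags.append("NX" if (entry >> NX_BIT) & 1 else "E")
    return flags


pte_flags = decode_flags(0x800000005AF4D025)   # PTE from !pte
pde_flags = decode_flags(0x000000002EBF3867)   # PDE from !pte
print(pte_flags, pde_flags)
```

The PTE decodes to valid, user, accessed, read-only, and no-execute, which matches the letters the debugger shows for it.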

To confirm our calculation of the physical address, we can look at the memory in question
via both its virtual and its physical addresses. First, using the debugger’s dd command (display
dwords) on the virtual address, we see the following:

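To confirm our manual calculation of the physical address, we can look at the memory in question through both its virtual and its physical address. If the contents match, the translation is correct. First, the debugger's dd command (display dwords) on the virtual address shows the following: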

lkd> dd 30004
00030004 00000020 00000001 00003020 000000dc
00030014 00000000 00000020 00000000 00000014
00030024 00000001 00000007 00000034 0000017c
00030034 00000001 00000000 00000000 00000000
00030044 00000000 00000000 00000002 1a26ef4e
00030054 00000298 00000044 000002e0 00000260
00030064 00000000 f33271ba 00000540 0000004a
00030074 0000058c 0000031e 00000000 2d59495b

And with the !dd command on the physical address just computed, we see the same
contents:

Then the !dd command on the physical address we just computed shows the same contents:

lkd> !dd 5af4d004
#5af4d004 00000020 00000001 00003020 000000dc
#5af4d014 00000000 00000020 00000000 00000014
#5af4d024 00000001 00000007 00000034 0000017c
#5af4d034 00000001 00000000 00000000 00000000
#5af4d044 00000000 00000000 00000002 1a26ef4e
#5af4d054 00000298 00000044 000002e0 00000260
#5af4d064 00000000 f33271ba 00000540 0000004a
#5af4d074 0000058c 0000031e 00000000 2d59495b