Paging
Paging Basics
- divide a process's virtual memory into fixed sized pages (typically 4KB)
- divide the physical memory into page sized chunks
- each chunk of physical memory is called a frame or physical page
- only load pages in use into the frames, dynamically load pages as needed
- how would we tell what pages are needed?
Address Translation With Paging
- base & bound: process's virtual memory is located contiguously in physical memory
- paging: process's virtual memory (pages) may not be contiguous in physical memory
- virtual memory still has a contiguous view
- but translation needs to be done at a page level
- proposal: a page translation table
- track page to frame mapping in a table
- use information within the virtual address to perform table look up
- virtual addresses within a 4KB chunk are in the same page
- 4096 (4KB) = 2^12 bytes, so the lower 12 bits of a virtual address are the offset into a page
- the higher bits can then be used as a page number
- page number as index, physical address of the frame as entry
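The split described above can be sketched in C. This is a minimal illustration only: the flat array `table`, the helper names, and the choice to store a frame number in each entry are assumptions made for the example, not a real page table format.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                      /* 4KB pages: 2^12 bytes */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1)

/* Higher bits of the virtual address: the page number (table index). */
static uint64_t page_number(uint64_t vaddr) { return vaddr >> PAGE_SHIFT; }

/* Lower 12 bits: byte offset within the page, unchanged by translation. */
static uint64_t page_offset(uint64_t vaddr) { return vaddr & PAGE_MASK; }

/* Hypothetical flat table: table[page number] = frame number. */
static uint64_t translate(const uint64_t *table, uint64_t npages, uint64_t vaddr)
{
    uint64_t vpn = page_number(vaddr);
    assert(vpn < npages);                  /* sketch only: no fault handling */
    return (table[vpn] << PAGE_SHIFT) | page_offset(vaddr);
}
```

With a table whose entry 2 holds frame number 9, virtual address 0x2ABC has page number 2 and offset 0xABC, so it translates to physical address 0x9ABC.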
- how many memory accesses are there? who should perform the memory translation?
- translation happens on every memory access, so hardware performs it; the lookup is called the page table walk
- architecture specifies the page table format, kernel sets it up and loads the table address into a page table base register (%cr3)
- do we load a physical or virtual address into %cr3?
- memory translation itself causes access to physical memory which has latency cost
- Translation Lookaside Buffer (TLB): cache the result of translation to reduce the cost
- upon a memory access, hardware checks the TLB and starts the page table walk in parallel
- since the TLB is a cache, its result comes back sooner; on a hit, translation completes
- on a miss, hardware waits for the page table walk and caches the result in the TLB
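The caching behaviour can be modelled in software. The sketch below assumes a direct-mapped TLB with 16 entries and uses a flat array lookup as a stand-in for the page table walk; all names and sizes are hypothetical. It also checks the TLB first and walks only on a miss, which is a sequential simplification of the parallel lookup described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define TLB_ENTRIES 16   /* assumed size; real TLBs vary */

struct tlb_entry { bool valid; uint64_t vpn; uint64_t pfn; };
struct tlb { struct tlb_entry entries[TLB_ENTRIES]; unsigned hits; unsigned misses; };

/* Stand-in for the page table walk: one lookup in a flat table. */
static uint64_t walk_flat_table(const uint64_t *table, uint64_t npages, uint64_t vpn)
{
    assert(vpn < npages);
    return table[vpn];
}

static uint64_t tlb_translate(struct tlb *tlb, const uint64_t *table,
                              uint64_t npages, uint64_t vaddr)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    uint64_t off = vaddr & ((1ULL << PAGE_SHIFT) - 1);
    struct tlb_entry *slot = &tlb->entries[vpn % TLB_ENTRIES];

    if (slot->valid && slot->vpn == vpn) {
        tlb->hits++;                       /* hit: no walk needed */
    } else {
        tlb->misses++;                     /* miss: walk, then cache the result */
        slot->valid = true;
        slot->vpn = vpn;
        slot->pfn = walk_flat_table(table, npages, vpn);
    }
    return (slot->pfn << PAGE_SHIFT) | off;
}
```

Two accesses to the same page cost one walk; a second page that maps to the same slot evicts the first, so a later access to the first page misses again.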
Costs of Page Tables
- how much space is taken up by the page table?
- # of entries = size of virtual address space / page size
- page table has an entry for each page (very large array)
- what if a process only uses a couple of pages of its virtual memory?
- still needs to pay the cost of the entire array
- how many page tables do we need?
- one per process? one per system?
- where are they stored? kernel memory?
- how to use less space for page table?
- super pages: make pages larger
- there's hardware support for 2MB (512 4K pages) and 1GB (262144 4K pages) pages
- same virtual address space, larger pages -> fewer pages -> smaller page tables
- also good for performance (fewer translations needed)
- any problem with larger page size?
- still need a page table for every process
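To put numbers on the entry-count formula, here is a small sketch that computes the size of a single flat page table. The 48-bit virtual address width and the 8-byte entry size are assumptions taken from x86-64; the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* # of entries = size of virtual address space / page size */
static uint64_t table_entries(unsigned va_bits, unsigned page_shift)
{
    assert(page_shift < va_bits && va_bits <= 48);
    return 1ULL << (va_bits - page_shift);
}

/* Total bytes for a flat table with one entry per page. */
static uint64_t table_bytes(unsigned va_bits, unsigned page_shift, uint64_t entry_bytes)
{
    return table_entries(va_bits, page_shift) * entry_bytes;
}
```

Under these assumptions, `table_bytes(48, 12, 8)` is 2^39 bytes (512GB) per process with 4KB pages, `table_bytes(48, 21, 8)` is 2^30 bytes (1GB) with 2MB pages, and `table_bytes(48, 30, 8)` is 2^21 bytes (2MB) with 1GB pages. Larger pages shrink the table, but the flat array is still paid for in full by every process.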
- inverted page table
- instead of maintaining a page table per process, maintain one global table that's indexed by frame number
- index = frame number, entry = pid + page number
- # of entries = size of physical memory / page size
- how would we perform a look up given a virtual address (page number)?
- search through each entry until we find a matching pid + page number
- if no matching entry, page fault
- slow look up! what to do?
- use a hash function to hash pid + page number to a particular frame
- how to handle hash collision?
- costs of performing look up?
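One possible answer to the collision question is linear probing: hash pid + page number to a starting frame, then scan forward until a match or an unused entry is found. The sketch below is an assumption-laden illustration (tiny table, toy hash function, no removal or eviction), not how any particular system implements an inverted page table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NFRAMES 8   /* one entry per physical frame (tiny, for illustration) */

struct ipt_entry { bool used; int pid; uint64_t vpn; };

/* Toy hash of pid + page number to a starting frame. */
static unsigned ipt_hash(int pid, uint64_t vpn)
{
    assert(pid >= 0);
    return (unsigned)((vpn * 31 + (uint64_t)pid) % NFRAMES);
}

/* Returns the frame holding (pid, vpn), or -1 to signal a page fault. */
static int ipt_lookup(const struct ipt_entry *ipt, int pid, uint64_t vpn)
{
    unsigned start = ipt_hash(pid, vpn);
    for (unsigned i = 0; i < NFRAMES; i++) {
        unsigned f = (start + i) % NFRAMES;
        if (!ipt[f].used)
            return -1;                     /* empty slot ends the probe chain */
        if (ipt[f].pid == pid && ipt[f].vpn == vpn)
            return (int)f;
    }
    return -1;
}

/* Claims the first free frame on the probe chain; -1 if memory is full. */
static int ipt_insert(struct ipt_entry *ipt, int pid, uint64_t vpn)
{
    unsigned start = ipt_hash(pid, vpn);
    for (unsigned i = 0; i < NFRAMES; i++) {
        unsigned f = (start + i) % NFRAMES;
        if (!ipt[f].used) {
            ipt[f].used = true;
            ipt[f].pid = pid;
            ipt[f].vpn = vpn;
            return (int)f;
        }
    }
    return -1;                             /* would need to evict a page */
}
```

The lookup cost is one hash plus however many entries the probe chain touches, and each probe is itself a memory access. Removing an entry would also need care (for example a tombstone marker) so that later entries on the same chain stay reachable.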
- multilevel page table
- maybe the original per-process page table idea is still good, but how to use less memory?
- with a single array, all entries must be allocated at once, but what if we use hierarchical arrays?
- indirection saves space when the pages in use are few and clustered together, since lower level tables for unused ranges never need to be allocated
- does this always result in less memory used? what if every page gets accessed?
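A two-level version of the idea can be sketched as follows. The 9-bit split per level, the struct layout, and the `tables_allocated` counter are assumptions chosen for the example; the point is only that a second-level table is allocated the first time a page in its range is mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define LEVEL_BITS 9
#define LEVEL_SIZE (1u << LEVEL_BITS)      /* 512 entries per table */

struct l2_table { uint64_t frame[LEVEL_SIZE]; bool present[LEVEL_SIZE]; };
struct l1_table { struct l2_table *next[LEVEL_SIZE]; unsigned tables_allocated; };

/* Maps page vpn to frame pfn, allocating the second-level table on demand. */
static int map_page(struct l1_table *top, uint64_t vpn, uint64_t pfn)
{
    assert(vpn < (uint64_t)LEVEL_SIZE * LEVEL_SIZE);
    unsigned hi = (unsigned)(vpn >> LEVEL_BITS);
    unsigned lo = (unsigned)(vpn & (LEVEL_SIZE - 1));

    if (top->next[hi] == NULL) {
        top->next[hi] = calloc(1, sizeof(struct l2_table));
        if (top->next[hi] == NULL)
            return -1;                     /* out of memory */
        top->tables_allocated++;
    }
    top->next[hi]->frame[lo] = pfn;
    top->next[hi]->present[lo] = true;
    return 0;
}

/* Returns 0 and fills *pfn if mapped, -1 if either level is missing. */
static int lookup_page(const struct l1_table *top, uint64_t vpn, uint64_t *pfn)
{
    assert(vpn < (uint64_t)LEVEL_SIZE * LEVEL_SIZE);
    const struct l2_table *l2 = top->next[vpn >> LEVEL_BITS];
    unsigned lo = (unsigned)(vpn & (LEVEL_SIZE - 1));

    if (l2 == NULL || !l2->present[lo])
        return -1;
    *pfn = l2->frame[lo];
    return 0;
}
```

Mapping pages 0 and 1 allocates one second-level table, and mapping page 512 allocates a second one; the other 510 second-level tables are never created. If every page were mapped, all 512 second-level tables plus the top-level table would exist, which is more memory than the single flat array.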
x86-64 Address Translation
- architecture specification defines format of the page table
- x86-64 page table: 4 level page table
- Page Map Level 4 (PML4): top level page table, each entry stores the physical address of a PDPT
- Page Directory Pointer Table (PDPT): 2nd level page table, each entry stores the physical address of a PDT
- Page Directory Table (PDT): 3rd level page table, each entry stores the physical address of a PT
- Page Table (PT): last level page table, each entry stores the physical address of a frame
- each table fits in a frame (4KB), each table entry is 8 bytes
- 4096 (table size) / 8 (entry size) = 512 (entries)
- each table is indexed with 9 bits of the virtual address
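Four 9-bit indices plus the 12-bit page offset account for 48 bits of the virtual address. A minimal sketch of the extraction, where the level numbering (4 = PML4 down to 1 = PT) and the function names are conventions picked for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Index into the table at the given level: 4 = PML4, 3 = PDPT, 2 = PDT, 1 = PT. */
static unsigned table_index(uint64_t va, int level)
{
    assert(level >= 1 && level <= 4);
    return (unsigned)((va >> (12 + 9 * (level - 1))) & 0x1FF);
}

/* Low 12 bits: byte offset within the 4KB frame. */
static unsigned frame_offset(uint64_t va)
{
    return (unsigned)(va & 0xFFF);
}
```

So bits 39-47 index the PML4, bits 30-38 the PDPT, bits 21-29 the PDT, bits 12-20 the PT, and bits 0-11 are the offset.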
- what does the 8 byte page table entry look like?
- page table entry
- bits 0-11 contain information about the page (bit 0: present, bit 1: writable, bit 2: user accessible)
- bits 12-47 contain the physical page number of the frame
- bits 48-63 contain either reserved fields or other permission info about the page
- kernel sets up each page table entry with proper permissions
- accessed & dirty bits are set by the hardware when a page is accessed or written
- may also be modified by the kernel
- the kernel can use these bits to make paging policy decisions
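The entry layout above can be expressed as bit masks. In this sketch the macro and function names are assumptions for illustration, and the accessed and dirty bit positions (5 and 6) come from the x86-64 architecture rather than from these notes.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P (1ULL << 0)   /* present */
#define PTE_W (1ULL << 1)   /* writable */
#define PTE_U (1ULL << 2)   /* user accessible */
#define PTE_A (1ULL << 5)   /* accessed: set by hardware on any access */
#define PTE_D (1ULL << 6)   /* dirty: set by hardware on a write */
#define PTE_ADDR_MASK 0x0000FFFFFFFFF000ULL   /* bits 12-47 */

/* Builds an entry from a page-aligned frame address and flag bits. */
static uint64_t make_pte(uint64_t frame_paddr, uint64_t flags)
{
    assert((frame_paddr & ~PTE_ADDR_MASK) == 0);   /* aligned, fits in bits 12-47 */
    assert((flags & PTE_ADDR_MASK) == 0);          /* flags stay out of the address field */
    return frame_paddr | flags;
}

/* Extracts the physical address of the frame (or next-level table). */
static uint64_t pte_frame_addr(uint64_t pte)
{
    return pte & PTE_ADDR_MASK;
}
```

For example, `make_pte(0x1234000, PTE_P | PTE_U)` yields 0x1234005: a present, user accessible, read-only mapping to the frame at 0x1234000. After the hardware records a read and a write, the entry would read 0x1234065.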
- what errors may occur during a hardware page table walk?
- a NULL entry! may happen in any level of the page table, what does this mean?
- mismatching permission! user code trying to read a kernel address, or a write to a read-only page
- what can the hardware do?
- kill a process? how does it know what resources to clean up?
- when in doubt, transfer control to the OS!
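The walk and its two failure modes can be simulated in software. This is a simplified sketch: physical memory is a small hypothetical array of frames, the permission check is applied identically at every level, and details such as large pages and kernel-mode write rules are ignored. On real hardware the fault result would become a page fault exception delivered to the OS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P 0x1ULL
#define PTE_W 0x2ULL
#define PTE_U 0x4ULL
#define PTE_ADDR_MASK 0x0000FFFFFFFFF000ULL
#define NFRAMES 8

enum walk_result { WALK_OK, FAULT_NOT_PRESENT, FAULT_PROTECTION };

/* Simulated physical memory: NFRAMES frames of 512 eight-byte entries. */
static uint64_t phys[NFRAMES][512];

static enum walk_result walk(uint64_t cr3, uint64_t va, bool write, bool user,
                             uint64_t *pa)
{
    uint64_t table = cr3 & PTE_ADDR_MASK;  /* physical address of the PML4 */

    /* shift = 39, 30, 21, 12: PML4, PDPT, PDT, PT */
    for (int shift = 39; shift >= 12; shift -= 9) {
        uint64_t frame = table >> 12;
        assert(frame < NFRAMES);
        uint64_t pte = phys[frame][(va >> shift) & 0x1FF];

        if (!(pte & PTE_P))
            return FAULT_NOT_PRESENT;      /* NULL entry at this level */
        if ((user && !(pte & PTE_U)) || (write && !(pte & PTE_W)))
            return FAULT_PROTECTION;       /* permission mismatch */
        table = pte & PTE_ADDR_MASK;       /* next table, or the final frame */
    }
    *pa = table | (va & 0xFFF);
    return WALK_OK;
}
```

A successful translation reads four entries, one per level, which is why caching the result matters so much.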
Page Fault
- an exception that is raised by the hardware when something unexpected happens in the page table walk
- how should the kernel handle a page fault?
- identify and handle valid page faults
- stack or heap growth
- memory mapped files
- known permission mismatch
- memory pressure (access to swapped pages)
- terminate threads with invalid page faults
- nullptr, random address in unallocated virtual memory
- actual permission mismatch
- needs bookkeeping structures to track metadata for each page
- machine independent VM metadata vs machine dependent page table
- machine independent structures in xk: vspace, vregion, vpage_info
- track the size of each region (stack, heap, code), whether a page is backed by a file, whether a page is copy-on-write (cow)
- updated and referenced by the kernel when performing virtual memory related system calls and when handling page faults
- can be more complex, not on the critical path of each memory access
- machine dependent structures in xk: the x86-64 page table (x86_64vm.c)
- architecture specified page table, must be efficient for access
- vspaceupdate: updates the machine dependent page table based on the machine independent metadata
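To illustrate how a handler might separate valid from invalid faults using its own bookkeeping, here is a rough sketch. The error code bits are the ones x86 hardware reports with a page fault (the faulting address itself is in %cr2). Everything else is hypothetical: `struct page_meta` is not xk's vpage_info, and the stack bounds, action names, and decision order are assumptions made for the example. Memory mapped files are left out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86 page fault error code bits. */
#define ERR_PRESENT 0x1u   /* set: protection violation; clear: page not present */
#define ERR_WRITE   0x2u   /* set: the access was a write */
#define ERR_USER    0x4u   /* set: the access came from user mode */

/* Hypothetical per-page metadata kept by the kernel. */
struct page_meta { bool cow; bool swapped; };

enum fault_action { ACT_GROW_STACK, ACT_SWAP_IN, ACT_COPY_ON_WRITE, ACT_KILL };

/* meta is NULL when the kernel tracks no page at addr.
 * stack_low is the lowest mapped stack address; stack_limit is the lowest
 * address the stack is allowed to grow down to. */
static enum fault_action classify_fault(uint64_t addr, unsigned err,
                                        const struct page_meta *meta,
                                        uint64_t stack_low, uint64_t stack_limit)
{
    assert(stack_limit <= stack_low);

    if (meta == NULL) {
        /* Untracked address: valid only as stack growth. */
        if (addr >= stack_limit && addr < stack_low)
            return ACT_GROW_STACK;
        return ACT_KILL;                   /* nullptr or random address */
    }
    if (!(err & ERR_PRESENT))
        return meta->swapped ? ACT_SWAP_IN : ACT_KILL;
    if ((err & ERR_WRITE) && meta->cow)
        return ACT_COPY_ON_WRITE;          /* known permission mismatch */
    return ACT_KILL;                       /* actual permission mismatch */
}
```

After a valid fault is resolved, the kernel would update its metadata and then refresh the hardware page table from it before returning to the faulting instruction.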