From: Praveen Rao (psrao_at_windows.microsoft.com)
Date: Fri Feb 13 2004 - 00:16:51 PST
This paper discusses the design of virtual memory management in the Mach
operating system. The salient feature of Mach virtual memory management
is its portability, which is achieved by separating the machine
independent and machine dependent parts of memory management. The
machine dependent part is kept to a minimum and has a narrow,
well-defined interface.
The authors argue that they can achieve such portability without
sacrificing performance. They publish numbers comparing Mach to Unix,
which show Mach performing better than Unix for the most part despite
the portability features. I would have liked to see these numbers
compared with numbers for the VAX/VMS memory management system, which is
more sophisticated than traditional Unix.
The authors start by making a case for the need for portability in the
face of a proliferation of processor architectures.
In Mach, a task is the execution environment (analogous to a Unix
process), a thread is the basic unit of CPU utilization, a port is a
communication channel that acts as a queue for messages and is protected
by the kernel, a message is a typed collection of data objects, and a
memory object is a collection of data that can be mapped into the
address space of a task.
Mach uses message passing for performing operations on objects other
than messages. Despite being a message-based system, Mach gains
efficiency by integrating virtual memory management with the
message-oriented communication facility. This integration allows large
amounts of data (e.g. a whole file or a whole address space) to be sent
in a message with the efficiency of simple memory remapping.
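
To make the remapping idea concrete, here is a minimal sketch of sending
pages by sharing them copy-on-write instead of copying them. The names
(Page, AddressSpace, send_region) are my own illustration and not Mach's
interface; reference counting stands in for whatever bookkeeping the
kernel really does.

```python
class Page:
    """A physical page; refs counts how many mappings share it."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class AddressSpace:
    """Hypothetical task address space: virtual page number -> Page."""
    def __init__(self):
        self.pages = {}
        self.cow = set()   # virtual page numbers mapped copy-on-write

    def read(self, vpn, offset):
        return self.pages[vpn].data[offset]

    def write(self, vpn, offset, value):
        page = self.pages[vpn]
        if vpn in self.cow:
            if page.refs > 1:
                # First write to a shared page: take a private copy.
                page.refs -= 1
                page = Page(page.data)
                self.pages[vpn] = page
            self.cow.discard(vpn)
        page.data[offset] = value


def send_region(sender, receiver, vpns):
    """'Send' pages by remapping them; returns the number of bytes copied."""
    for vpn in vpns:
        page = sender.pages[vpn]
        page.refs += 1
        receiver.pages[vpn] = page
        sender.cow.add(vpn)
        receiver.cow.add(vpn)
    return 0   # no data is copied at send time
```

The copy only happens later, and only for pages that one side actually
writes to, which is why the cost of the send does not depend on the size
of the region.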
Each Mach process (a task, to be more precise) has a large address space
that consists of mappings between process addresses and memory objects.
The address space can be as large as the hardware allows. The page size
too is not constrained to the hardware page size (it can be any
power-of-two multiple of the hardware page size) and is a boot-time
parameter. Mach allows both copy-on-write and read/write sharing between
processes. Copy-on-write sharing is typically the result of a large
message transfer: in such a case an arbitrarily large part of the
address space can be sent in a single message without any actual data
copy operations. Read/write sharing can be set up by allocating a memory
region and setting its inheritance attribute (e.g. shared/copy/none). In
a similar manner, protection is specified on a per-page basis.
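
A small sketch of how the inheritance attribute could decide what a
child task sees when it is created. Region, Inherit and fork_regions are
hypothetical names of mine, and the eager copy below is only a stand-in
for the copy-on-write mechanism the paper describes.

```python
from dataclasses import dataclass
from enum import Enum


class Inherit(Enum):
    SHARED = "shared"
    COPY = "copy"
    NONE = "none"


@dataclass
class Region:
    start: int
    inherit: Inherit
    data: bytearray


def fork_regions(parent_regions):
    """Build the child's region list from each region's inheritance value."""
    child = []
    for r in parent_regions:
        if r.inherit is Inherit.SHARED:
            # Same underlying bytes: read/write sharing.
            child.append(Region(r.start, r.inherit, r.data))
        elif r.inherit is Inherit.COPY:
            # Private copy (eager here; copy-on-write in a real system).
            child.append(Region(r.start, r.inherit, bytearray(r.data)))
        # Inherit.NONE: the region is absent from the child.
    return child
```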
Four basic memory management data structures are used in Mach:
* The resident page table (information about machine independent pages)
* The address map (entries describing the mapping from a range of
addresses to a region of a memory object)
* The memory object (backing store managed by the kernel or a user task)
* The pmap (the machine dependent data structure)
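
As an illustration of the address map, the sketch below keeps a sorted
list of entries and translates a virtual address into a (memory object,
offset) pair. The class and field names are assumptions of mine, and the
sketch assumes callers never insert overlapping ranges.

```python
import bisect
from dataclasses import dataclass


@dataclass
class MemoryObject:
    name: str


@dataclass
class MapEntry:
    start: int          # first address covered
    end: int            # one past the last address covered
    obj: MemoryObject
    offset: int         # offset into obj that corresponds to `start`


class AddressMap:
    def __init__(self):
        self.entries = []   # kept sorted by start address

    def insert(self, entry):
        starts = [e.start for e in self.entries]
        self.entries.insert(bisect.bisect_left(starts, entry.start), entry)

    def lookup(self, addr):
        """Return (memory object, offset) for addr, or None if unmapped."""
        starts = [e.start for e in self.entries]
        i = bisect.bisect_right(starts, addr) - 1
        if i >= 0 and addr < self.entries[i].end:
            e = self.entries[i]
            return e.obj, e.offset + (addr - e.start)
        return None
```

Nothing in this lookup depends on the hardware page table format; in
this picture only the pmap would need to know about that.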
To me, the memory object seems to be the indirection central to the
portability.
Physical memory in Mach is just a cache for the contents of virtual
memory objects. Physical page info (e.g. referenced/modified) is
maintained in a table indexed by physical page number. Each page entry
can be linked into various lists:
* a memory object list
* a memory allocation queue
* an object/offset hash bucket
Each physical page can belong to at most one memory object.
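
The lists above can be sketched roughly as follows. A Python dict stands
in for the object/offset hash buckets and a plain list for the
allocation queue; all names are hypothetical.

```python
class PageEntry:
    def __init__(self, ppn):
        self.ppn = ppn            # physical page number (index in the table)
        self.key = None           # (object id, offset), or None when free
        self.referenced = False
        self.modified = False


class ResidentPageTable:
    def __init__(self, num_pages):
        self.entries = [PageEntry(n) for n in range(num_pages)]
        self.free_queue = list(range(num_pages))   # allocation queue
        self.by_key = {}       # (object id, offset) -> PageEntry
        self.by_object = {}    # object id -> set of physical page numbers

    def lookup(self, obj_id, offset):
        """Find the resident page caching this object/offset, if any."""
        return self.by_key.get((obj_id, offset))

    def allocate(self, obj_id, offset):
        existing = self.lookup(obj_id, offset)
        if existing is not None:
            return existing
        if not self.free_queue:
            return None   # a real system would start pageout here
        entry = self.entries[self.free_queue.pop(0)]
        entry.key = (obj_id, offset)
        self.by_key[entry.key] = entry
        self.by_object.setdefault(obj_id, set()).add(entry.ppn)
        return entry

    def release(self, ppn):
        entry = self.entries[ppn]
        if entry.key is not None:
            del self.by_key[entry.key]
            self.by_object[entry.key[0]].discard(ppn)
            entry.key = None
            entry.referenced = entry.modified = False
            self.free_queue.append(ppn)
```

Because each entry holds a single key, a physical page belongs to at
most one memory object at a time, matching the rule above.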
Another prominent feature of Mach is that it allows a custom pager for
memory objects. Each memory object has a pager associated with it, which
can be a custom pager or the default kernel pager.
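
A rough sketch of what a pluggable pager could look like. The method
names page_in and page_out are invented for illustration; in Mach the
pager is driven through messages on ports, which this sketch does not
model.

```python
class DefaultPager:
    """Stand-in for the default pager: keeps paged-out data in a dict."""
    def __init__(self):
        self.store = {}

    def page_in(self, offset, size):
        return self.store.get(offset, bytes(size))   # zero-fill if unseen

    def page_out(self, offset, data):
        self.store[offset] = bytes(data)


class FilePager:
    """Hypothetical custom pager backed by an in-memory 'file'."""
    def __init__(self, contents):
        self.contents = bytearray(contents)

    def page_in(self, offset, size):
        chunk = bytes(self.contents[offset:offset + size])
        return chunk.ljust(size, b"\0")   # pad a short final page

    def page_out(self, offset, data):
        if len(self.contents) < offset:
            # Grow the file with zeros so the write lands at `offset`.
            self.contents.extend(bytes(offset - len(self.contents)))
        self.contents[offset:offset + len(data)] = data


class MemoryObject:
    """Memory object whose resident pages are filled by its pager."""
    def __init__(self, pager, page_size=4):
        self.pager = pager
        self.page_size = page_size
        self.resident = {}   # offset -> bytearray cached in "physical memory"

    def fault(self, offset):
        if offset not in self.resident:
            data = self.pager.page_in(offset, self.page_size)
            self.resident[offset] = bytearray(data)
        return self.resident[offset]

    def evict(self, offset):
        self.pager.page_out(offset, self.resident.pop(offset))
```

The memory object code is the same for both pagers; only the object that
supplies and accepts the data changes.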
The authors state that since the Mach VM system is portable, it can be
used to compare various processor architectures. The pros and cons cited
for various systems are: the VAX needs large page tables because of its
small page size; the IBM RT PC allows only one valid mapping of a
physical page, which creates problems for sharing (though thanks to its
reverse mapping the memory used by the page table is bounded, which is a
plus); the SUN3 causes sparse resident page tables; and so on.
The authors also talk about the issue of keeping TLBs coherent in
multiprocessors. The options discussed are: forcibly making the CPUs
flush their TLBs (used for time-critical changes), postponing use of the
new mapping until all the CPUs have taken a timer interrupt and flushed
their TLBs (used during pageout), and tolerating the inconsistency (can
be used in the case of a protection change).
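
The first two options can be contrasted with a toy model. Everything
here (CPU, flush_now, flush_deferred, safe_to_use) is a simplification I
made up to show the difference in timing; it is not how the kernel
implements it.

```python
class CPU:
    def __init__(self):
        self.tlb = {}              # virtual page -> physical page
        self.needs_flush = False   # set when a deferred flush is pending

    def timer_interrupt(self):
        if self.needs_flush:
            self.tlb.clear()
            self.needs_flush = False


def flush_now(cpus):
    """Option 1: force every CPU to flush immediately."""
    for cpu in cpus:
        cpu.tlb.clear()
        cpu.needs_flush = False


def flush_deferred(cpus):
    """Option 2: ask each CPU to flush at its next timer interrupt."""
    for cpu in cpus:
        cpu.needs_flush = True


def safe_to_use(cpus):
    """The new mapping may be used once no CPU can hold a stale entry."""
    return not any(cpu.needs_flush for cpu in cpus)
```

The third option corresponds to calling neither function and accepting
that some CPUs see the old entry for a while.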
This paper seems to be an important work in the area of VM portability.
As I mentioned earlier, I would like to see a perf comparison between
this system and a highly optimized VM system tuned to a particular
hardware. This is particularly important now that we are back to a world
with a limited number of predominant processor architectures.