From: isking_at_u.washington.edu
Date: Wed Jan 14 2004 - 16:18:15 PST
"Sharing and Protection" describes the theory and implementation of the Opal operating system. This research explores the feasibility of running an OS and user tasks in a single, large virtual memory space, instead of separate, conceptually parallel memory spaces as in most modern operating systems. The premise is supported by the argument that the large memory space required is feasible because of modern 64-bit-address processors and memory management hardware. The system is implemented as a task on the Mach operating system, with theoretical constructs implemented through conventional UNIX mechanisms.
This paper is premised on a very sound and effective model: challenge one assumption of existing technology and evaluate that new universe. The assumption which Chase et al. challenge is that protection between processes should be implemented through conceptually separate virtual memory spaces. This assumption precludes a process from accessing objects in another process through the simple expedient that the second process's addresses are meaningless in the first process. However, sharing between processes is a common need in real applications, and this requires the creation of mechanisms that effectively subvert the protection. These mechanisms have been demonstrated to introduce defects and security compromises.
On the other hand, Opal assumes that each process exists in the same conceptual virtual memory space. Protection is provided by the processor's memory management hardware, which disallows memory accesses that violate the system's security policy. Sharing of objects simply requires informing the memory management hardware that process X may now access a particular region of virtual memory in whatever mode is to be allowed. Processes may freely pass pointers, which this paper argues are a pervasive element of current programming practice; a pointer always represents a valid address, regardless of whether the current holder of that pointer may actually access information at that address. Both data and procedures may be accessed through these global pointers; the operating system also supports protected procedure calls, called portals, that closely model the protected procedure calls of earlier literature. Because all addresses are virtual, an object need not reside in immediately accessible storage at any given time, but may be 'faulted' into storage; this allows not only for a virtual space larger than the physical memory of the machine (a common practice in virtual memory systems), but also for a persistent means of addressing objects that exist on any storage device and even on other systems, so long as an appropriate marshalling mechanism exists.
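To make the sharing model concrete, here is a minimal sketch of one global address space with per-domain access rights. It is a toy model of the idea as described in this review, not Opal's interface: the names AddressSpace, grant, load and store, the Access flags and the 4 KB page size are all assumptions made for illustration.

```python
from enum import Flag

PAGE_SIZE = 4096  # assumed page size for this sketch


class Access(Flag):
    NONE = 0
    READ = 1
    WRITE = 2


class ProtectionFault(Exception):
    """Raised when a domain touches an address it has no right to use."""


class AddressSpace:
    """One global virtual memory shared by every domain (hypothetical model)."""

    def __init__(self):
        self.memory = {}  # global address -> value
        self.rights = {}  # (domain, page number) -> Access

    def grant(self, domain, base, length, mode):
        # Sharing amounts to updating one domain's rights over a range of pages.
        first = base // PAGE_SIZE
        last = (base + length - 1) // PAGE_SIZE
        for page in range(first, last + 1):
            self.rights[(domain, page)] = mode

    def _check(self, domain, addr, needed):
        held = self.rights.get((domain, addr // PAGE_SIZE), Access.NONE)
        if needed not in held:
            raise ProtectionFault(f"{domain} lacks {needed} at {addr:#x}")

    def load(self, domain, addr):
        self._check(domain, addr, Access.READ)
        return self.memory.get(addr, 0)

    def store(self, domain, addr, value):
        self._check(domain, addr, Access.WRITE)
        self.memory[addr] = value
```

In this model a pointer such as 0x10000 means the same thing to every domain; whether a load through it succeeds depends only on what grant has recorded for the domain doing the load.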
The system depends heavily (and apparently, quite consciously) on longstanding concepts such as segments as the fundamental object and capability passing for security. (It is interesting that capabilities are managed by a name server that employs ACLs - the two related concepts offer different advantages.) The analog of a process under Opal is the protection domain, which may be the execution environment for one or more threads. Protection domains are built on one or more segments. Opal interprets segments a bit more concretely than, e.g., Dennis and Van Horn do; they are collections of memory pages, and are intended to provide coarse-grained control over memory. The authors feel that the operating system should not be in the mundane business of dynamic object storage at the thread or program level; to illustrate what they mean by "coarse grained", the authors suggest, for instance, that a protection segment might well contain the heap for a thread or set of threads, as established by a given programming language.
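The interplay between capabilities and ACLs noted above can be sketched as follows. This is only an illustration of the general pattern (an ACL decides who may obtain a capability; the capability itself is then the proof of access), and every name in it, including NameServer, register and lookup, is hypothetical rather than taken from the paper.

```python
import secrets


class NameServer:
    """Hypothetical registry mapping names to capabilities, guarded by ACLs."""

    def __init__(self):
        self._entries = {}  # name -> (capability, set of allowed principals)

    def register(self, name, segment_id, acl):
        # An unguessable token stands in for a capability to the segment.
        capability = (segment_id, secrets.token_hex(16))
        self._entries[name] = (capability, set(acl))
        return capability

    def lookup(self, name, principal):
        capability, acl = self._entries[name]
        if principal not in acl:
            raise PermissionError(f"{principal} is not on the ACL for {name}")
        return capability
```

The ACL gives a central point at which to decide who may learn a capability, while the capability can afterwards be passed along without consulting the server again.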
The authors address issues that arise in a single memory space, such as multiple use of procedures that contain static data. While the procedure loads (is 'fixed up' to) a specific and persistent memory address in the virtual memory space, static data must be replicated per instance (i.e. if the module is shared between protection domains); they offer the details of a hardware-level mechanism (register-relative addressing of the data) to accomplish this. The description handles most of the challenges it raises; the unique and persisting challenges of the single-address-space model are addressed in a later section where, for instance, the authors discuss the problem of copying a data structure that may be internally self-referent (i.e. one that contains pointers to structures within itself, which must somehow be modified when the entire structure is moved to a new base address in virtual memory). There are apparently few such issues - no more than those raised by the separate-address-space model.
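The self-referent copy problem can be illustrated with a small sketch. It assumes a word-addressed memory modelled as a dictionary and assumes the copier is told which words hold pointers; the function name relocate_copy and its parameters are hypothetical, and the paper's own treatment may differ.

```python
def relocate_copy(memory, old_base, size, new_base, pointer_offsets):
    """Copy `size` words from old_base to new_base, fixing internal pointers.

    `pointer_offsets` names the word offsets that hold pointers. This sketch
    assumes the copier knows them; a real system must discover them somehow.
    """
    delta = new_base - old_base
    for offset in range(size):
        value = memory[old_base + offset]
        is_internal = (
            offset in pointer_offsets
            and old_base <= value < old_base + size
        )
        # Internal pointers move with the structure; all other words are copied as-is.
        memory[new_base + offset] = value + delta if is_internal else value


# A two-node list at address 100: [value, next] pairs, with 0 as the null pointer.
memory = {100: 7, 101: 102, 102: 9, 103: 0}
relocate_copy(memory, old_base=100, size=4, new_base=500, pointer_offsets={1, 3})
```

After the call, the copy's first node at 500 points to 502 rather than to the original second node at 102, while the null pointer is left alone because it does not fall inside the copied range.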
The paper offers details of the implementation of Opal on top of the Mach microkernel, which is quite illuminating as to the distinctions between the UNIX model and that of Opal. In my opinion, the authors made appropriate choices of 'short cuts' in the implementation, and did not compromise the principles they intended to demonstrate.
Many papers stop here, but the authors also provide data based on an experimental implementation of a real-world application, a large database/CAD application. References to a very large number of data objects, previously implemented through serialized accesses of a database server process, were reduced to memory accesses, which presumably address-faulted for objects not present in working memory, but at less cost than table accesses in a conventional database. This section was quite exhaustive, and addressed a higher-level synchronization issue through a low-level structure built on Opal primitives, the mediator. Performance of the Opal-based implementation was on par with the conventional implementation, with greater ease of access, sharing and synchronization (i.e. database integrity).
It is interesting to note that Windows CE implements some of the "best of both worlds" in this regard. The system is limited to 32 processes, which was considered acceptable in an embedded operating system for use in portable devices. All 32 processes are mapped into sequential 32MB segments (slots) of virtual memory; the kernel consumes one process. Slot 0 is reserved for the currently running process; process switching is accomplished through "address arithmetic" that remaps a given process slot into slot 0. Therefore, an active process is always in slot 0, and can make certain programming assumptions based on its location in virtual memory; however, all processes are in fact in memory simultaneously and, through provided system calls, can access each other's memory space. In particular, the VirtualCopy function creates a virtual memory pointer within the calling process that aliases the physical memory committed to its argument virtual pointer. With appropriate protections, this facilitates interprocess communication through shared memory.
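The slot "address arithmetic" can be sketched as below. The 32MB slot size comes from the description above; the assumption that slot n begins at n times the slot size, and the helper names slot_base, slot_of and to_home_address, are illustrative and not taken from the Windows CE sources.

```python
SLOT_SIZE = 32 * 1024 * 1024  # 32MB per process slot, as described above


def slot_base(slot):
    """Base virtual address of a slot, assuming slots are laid out back to back."""
    return slot * SLOT_SIZE


def slot_of(addr):
    """Slot number that a virtual address falls into."""
    return addr // SLOT_SIZE


def to_home_address(addr, home_slot):
    """Translate a slot-0 address into the owning process's own slot."""
    if slot_of(addr) == 0:
        return addr + slot_base(home_slot)
    return addr  # already names a specific slot
```

Under these assumptions, an address that the running process sees in slot 0 and the same location named through the process's home slot differ only by a fixed offset, which is what makes a pointer into another process's slot meaningful to any caller that is permitted to use it.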