From: Raz Mathias (razvanma_at_exchange.microsoft.com)
Date: Wed Feb 04 2004 - 16:18:36 PST
The Emerald paper presents a distributed operating system in a fairly novel sense. The system is based on a rudimentary form of objects (minus inheritance and subclass polymorphism) comprising code and data, both of which can move across machine boundaries, breaking with the traditional meaning of the word "process."
The paper begins by listing the goals of a distributed operating system and of moving processes across machine boundaries. Reading the list of goals a second time (after reading the rest of the paper), I found the goals and the potential distributed object-oriented solutions to be particularly pragmatic. The idea of distributed load balancing via roaming objects really does nail what I believe to be the important tradeoff in distributed systems: favoring throughput over latency. The idea of improved communication via the collocation of the communicating parties was noble, but not convincingly addressed by the paper's mechanisms and experiments (more on this later). The issue of increased availability can only be partially addressed by distributed objects. Redundancy and efficient restart are key to availability; the way I could see Emerald increasing availability was to send the computation to a machine that can restart the faulting "process" more quickly. Redundancy requires copying, which wasn't addressed by the mechanisms in the paper. Distributed reconfiguration seemed like a great idea that isn't being leveraged today in commercial operating systems; when a machine becomes "faulty," why can't I move all objects in a good state (including code) over to a second machine and simply reformat the first? Unfortunately, today's commercial systems only allow me to back up data easily (remember the registry in Windows?).
The Emerald system comes with its own programming language. Programmers define objects using semantics that are the same regardless of whether the object will be global, local, or remote. The underlying mechanism implements procedure calls efficiently, using shared memory on a local machine and message passing across machines, without burdening the creator of the distributed objects with the details of which is used. The creator simply determines which objects need to be kept together (via an attach primitive) when they move. In addition, there is no concept of a class in Emerald. Conceptually, every object has its own code, and the Emerald system handles the mechanics of code sharing as an implementation detail.
The central theme in the Emerald system is that objects can freely move across machines to achieve the aforementioned set of high-level goals. The system uses templates to understand and reconstruct an object's data and to rebind member references on the destination machine. The idea of a distributed process is introduced by allowing a process's stack of activations to be spread across multiple machines. Unlike in a pure RPC mechanism, all of these machines are peers, so passing functions as parameters is possible because every machine is essentially a server (i.e. the lifetime of each is indefinite). If an object moves, it leaves behind a forwarding record that other objects can use to find it.
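To make the forwarding-record idea concrete, here is a minimal sketch in Python (not Emerald). The names Node, move, and locate are hypothetical and chosen only for illustration; the sketch assumes a lookup simply follows the chain of records hop by hop, and it leaves out whatever the real system does to keep such chains short or to recover from lost records.

```python
class Node:
    """A hypothetical machine holding resident objects and forwarding records."""

    def __init__(self, name):
        self.name = name
        self.resident = {}  # object id -> object state
        self.forward = {}   # object id -> Node the object was last sent to


def move(obj_id, src, dst):
    """Move an object from src to dst, leaving a forwarding record at src."""
    dst.resident[obj_id] = src.resident.pop(obj_id)
    # The object now lives at dst, so any old record there is stale.
    dst.forward.pop(obj_id, None)
    src.forward[obj_id] = dst


def locate(obj_id, start):
    """Follow forwarding records from start; return (node, hops taken)."""
    node, hops = start, 0
    while obj_id not in node.resident:
        if obj_id not in node.forward:
            raise KeyError(f"{obj_id} is unknown at node {node.name}")
        node = node.forward[obj_id]
        hops += 1
    return node, hops


if __name__ == "__main__":
    a, b, c = Node("a"), Node("b"), Node("c")
    a.resident["x"] = {"value": 1}
    move("x", a, b)
    move("x", b, c)
    node, hops = locate("x", a)
    print(node.name, hops)
```

A caller holding a stale reference to node a reaches the object by following a to b to c. Each move adds one hop for stale references, which is part of why the cost of object moves relative to invocations matters.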
The paper introduces the problem of call-by-reference in a distributed setting, where a call to one machine may result in a number of calls to several others, each of which may call even more machines, and so on. To address it, the paper introduces two interesting parameter-passing mechanisms dubbed "call-by-move" and "call-by-visit." These mechanisms allow objects to coalesce at a particular machine, so that further inter-object communication there consists of local procedure calls.
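The coalescing effect can be illustrated with a small Python simulation (again, not Emerald). It compares leaving an argument object on the caller's machine against moving it to the callee's machine once, at call time. Everything here (Obj, Stats, run, the machine names) is a hypothetical stand-in, and the sketch only counts invocations; it does not model the actual cost of a move or of a remote call.

```python
class Obj:
    """A hypothetical object that lives on exactly one machine at a time."""

    def __init__(self, name, node):
        self.name = name
        self.node = node  # name of the machine currently holding the object


class Stats:
    """Counts of local invocations, remote invocations, and object moves."""

    def __init__(self):
        self.local = 0
        self.remote = 0
        self.moves = 0


def run(callee_node, arg, uses, move_arg):
    """Simulate a callee on callee_node that invokes arg `uses` times."""
    stats = Stats()
    if move_arg and arg.node != callee_node:
        arg.node = callee_node  # the argument travels with the call
        stats.moves += 1
    for _ in range(uses):
        if arg.node == callee_node:
            stats.local += 1
        else:
            stats.remote += 1
    return stats


if __name__ == "__main__":
    by_ref = run("B", Obj("arg", "A"), uses=5, move_arg=False)
    by_move = run("B", Obj("arg", "A"), uses=5, move_arg=True)
    print(by_ref.remote, by_ref.local, by_ref.moves)
    print(by_move.remote, by_move.local, by_move.moves)
```

With five uses of the argument, the by-reference run makes five remote invocations and no moves, while the moving run makes one move and five local invocations. Whether that is a net win depends on how expensive the move is and on who else is using the object, which the sketch deliberately ignores.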
I believe that the distributed object operating system introduced by the paper has some great mechanisms and heads in the direction of the distributed object goals listed above. It falls short in one very important respect: none of the policies in the paper seem to take into consideration the emergent behavior of the system as a whole. A few times, the paper makes the tradeoff of optimizing procedure calls at the expense of object moves, yet it was unclear (and not really discussed) how often object moves would actually occur. It seems to me that without some sort of general movement-damping policy, the system may continue to move objects around indefinitely. Ideally, what we would really like is for Emerald to eventually "figure out" that the system as a whole intercommunicates optimally in a particular configuration; at that point, perhaps, movement should be limited to goals other than communication performance (e.g. availability or reconfiguration). I believe that the results given in the experimental section were not compelling largely because of this failure to address the policies that emerge from the local, often greedy algorithms. This, along with copying objects (in addition to moving them) and relaxing the trusted or homogeneous requirements, could be a solid foundation for future papers.
This archive was generated by hypermail 2.1.6 : Wed Feb 04 2004 - 16:18:40 PST