From: David Coleman (dcoleman_at_cs.washington.edu)
Date: Tue Feb 17 2004 - 17:18:16 PST
"Memory Coherence in Shared Virtual Memory Systems" presents a fairly
comprehensive overview of approaches to keeping memory coherent across
machines. The first thing
that tripped me up while reading this was the assumption I made that the
point of shared virtual memory is to expand the address space. Instead,
I now believe the purpose of this research was to implement effective
memory sharing instead of expanding the native address space of the
hardware. The paper's comments on the viability of implementing this
with the MMU support that reading. I did appreciate the discussion of
the different approaches possible and the motivation for homing in on a
basic set of algorithms. The overall discussion of the problem of memory
coherence was interesting in light of our earlier discussions of cache
coherence in shared memory multiprocessor systems.
I would like to have seen a more rigorous discussion of granularity,
backed by more empirical data. Granularity is often the kind of issue
that degrades an otherwise solid implementation, and it is the first
decision that must be made when designing a shared memory system. It is
a simpler question than the coherence strategies, but it is still
important.
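To make the granularity point concrete, here is a toy sketch (my own
invention, not from the paper) of the classic hazard of picking pages
too large: two processors writing unrelated variables that happen to
share a page make the page ping-pong between them. It assumes a very
simplified write-invalidate model with a single writable copy per page:

```python
def count_transfers(accesses, page_size):
    """Count page transfers for a sequence of (processor, address) writes,
    assuming a single writable copy of each page (write-invalidate)."""
    holder = {}        # page number -> processor currently holding the page
    transfers = 0
    for proc, addr in accesses:
        page = addr // page_size
        if holder.get(page) not in (None, proc):
            transfers += 1     # page must move to the writing processor
        holder[page] = proc
    return transfers

# Processor 0 repeatedly writes address 0; processor 1 writes address 512.
pattern = [(i % 2, (i % 2) * 512) for i in range(100)]
# With 1 KB pages both addresses share page 0, so the page ping-pongs;
# with 512-byte pages they land on different pages and never conflict.
```

Running this, the same access pattern costs 99 transfers at a 1 KB page
size and zero at 512 bytes, which is exactly the kind of sensitivity I
would have liked to see measured.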
The different approaches to page ownership were interesting. The fixed
centralized manager approach is simple to understand and implement (and
appeals to me because of its simplicity), but would quickly become a
bottleneck. Even the modified central manager, with its reduced
messaging, would still become a bottleneck. The distributed manager
algorithms presented would seem to perform better. The broadcast
distributed manager is once again clear and easy to understand, but
would not scale. The dynamic distributed manager using hints of probable
owners is a very cool idea. It reminded me of Emerald’s approach to
finding an object. When deciding after how many faults (M) the owner's
identity should be broadcast, I think a run-time analysis to tune M
would be both useful and possible. The variables involved
would be the average number of messages needed to find an owner, the
frequency of faults, and the overhead of the broadcast. If your system
is faulting a lot, then even two messages to find the owner might be too
many. If your system faults infrequently, then the overhead of
broadcasting might be larger than the overall performance impact of
searching for the owner. I would think the system could be monitored
and M adjusted at run time instead of being fixed (i.e. N/4).
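To make the hint-chasing concrete, here is a toy model (mine, not the
paper's code) of a simplified variant: each node keeps a probable-owner
hint per page, a request is forwarded along the hints, and once the
chain exceeds M hops the requester falls back to a broadcast. Note this
differs from the paper, where M counts faults between the true owner's
periodic broadcasts; all names here are hypothetical.

```python
class Node:
    def __init__(self, node_id, num_pages, initial_owner):
        self.node_id = node_id
        # probable_owner[page] is only a hint and may be stale.
        self.probable_owner = {p: initial_owner for p in range(num_pages)}

def find_owner(nodes, start, page, m):
    """Follow probable-owner hints from `start`; after m hops, give up
    and locate the owner by broadcast (simulated here as a full scan).

    Returns (owner_id, hops_taken, broadcast_used)."""
    current = start
    hops = 0
    while hops < m:
        hint = nodes[current].probable_owner[page]
        if hint == current:     # a node pointing at itself owns the page
            return current, hops, False
        current = hint          # forward the request along the hint chain
        hops += 1
    # Chain too long: "broadcast" and find the self-pointing node directly.
    owner = next(n.node_id for n in nodes
                 if n.probable_owner[page] == n.node_id)
    return owner, hops, True
```

Tuning M then amounts to weighing the expected hint-chain length
against the cost of a broadcast, which is the fault-frequency trade-off
described above.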
It really seems to me that all of the algorithms would run into issues
of scaling. Even with the purpose being to effect memory sharing and not
expand the address space, I’m not sure how significant it is to consider
large-scale sharing. At some point, it must be more efficient to
use a message passing or RPC approach and partition the problem by hand.
So I’m not sure how valid criticisms of scaling are. Does it really make
sense to spread a 4 GB address space over 2000 machines, leaving each
with only about 2 MB?
Of the standard distributed system issues, this paper/approach deals
successfully with binding (locating page owner/contents) and to some
degree performance. Marshalling, object-oriented language support,
callbacks and threading models are all irrelevant because they are
private address space issues that become moot points in a shared address
space. Failures and heterogeneity are not really discussed and it isn’t
clear how they would be addressed at all. The dynamic distributed
manager approaches would seem to be the hardest to resolve failures in.
The centralized manager would be the easiest. Heterogeneity is simply an
unfair goal for this concept; I believe a heterogeneous environment
might be possible, but it would probably need another layer of
abstraction (e.g., Mach VM) and would likely hurt performance.
This archive was generated by hypermail 2.1.6 : Tue Feb 17 2004 - 17:14:01 PST