From: Honghai Liu (liu789_at_hotmail.com)
Date: Wed Feb 18 2004 - 15:37:59 PST
Memory Coherence in Shared Virtual Memory Systems.
Reviewer: Honghai Liu
The paper describes shared virtual memory for distributed systems and presents several algorithms
for solving the memory coherence problem.
The idea of shared virtual memory is not only interesting but also important, because it provides an
easy and natural way to think about process migration in distributed and multiprocessor environments.
There are several different approaches to achieving process migration. RPC offers procedure-style
support to the programmer, hiding the details of server-side processes. Emerald, on the other hand,
takes advantage of a type-safe language to offer performance benefits and object-oriented
features in a distributed system. Shared virtual memory, unlike RPC or Emerald, doesn't rely on any
language or compiler support, and provides the lowest level of process migration facilities, i.e. instead
of moving processes, it moves the substrate the processes reside on - virtual memory. Opal is probably
closest to this approach, but it still works at the process or thread level.
In the end, all of these approaches are physically implemented by message passing; however, I think the
virtual memory approach is the most primitive and interesting.
The problem the paper tries to solve is how to ensure coherence in shared virtual memory
among processors. Many of the issues are similar to those in tightly coupled multiprocessors. For example,
invalidation and write-back in page synchronization resemble those between local caches and shared memory in
multiprocessors.
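To make the invalidation idea concrete, here is a minimal sketch of a write-invalidate step for a single shared page. The `Page` class, the `copy_set` field, and the function names are hypothetical and chosen only for illustration; this is not the paper's code, and real message passing is replaced by plain function calls.

```python
class Page:
    """Coherence state for one shared page (hypothetical structure)."""

    def __init__(self, owner):
        self.owner = owner        # processor holding the writable copy
        self.copy_set = {owner}   # processors currently holding any copy


def read_fault(page, reader):
    # A reader obtains a read-only copy; ownership does not change.
    page.copy_set.add(reader)


def write_fault(page, writer):
    # Invalidate every other copy, then make the writer the sole owner.
    invalidated = page.copy_set - {writer}
    page.copy_set = {writer}
    page.owner = writer
    return invalidated
```

In this sketch a write fault returns the set of processors whose copies must be invalidated, which plays the same role as cache-line invalidation in a multiprocessor.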
Two feasible solutions to the coherence problem exist: a centralized manager and a distributed manager.
The difference between them is that the centralized manager has a single place that processes requests for
finding pages, while the distributed one lets each processor manage a subset of the pages. Moreover, with
the dynamic distributed manager, there are fewer broadcasts on the network.
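The difference between the manager schemes can be sketched as three ways of locating a page. Everything here is an illustrative assumption rather than the paper's implementation: the cluster size, the choice of processor 0 as the central manager, the modulo mapping, and the `prob_owner` hint table are all hypothetical names and values.

```python
NUM_PROCESSORS = 4  # assumed cluster size, for illustration only


def centralized_manager(page_number):
    # Every lookup goes to one designated processor (assumed to be 0).
    return 0


def fixed_distributed_manager(page_number, n=NUM_PROCESSORS):
    # Pages are partitioned among processors by a static mapping.
    return page_number % n


def find_owner(prob_owner, start, page_number):
    """Follow per-processor ownership hints until the owner is reached.

    prob_owner[p][page] is processor p's best guess at the page's owner;
    the true owner's hint points to itself.
    """
    current = start
    hops = 0
    while prob_owner[current][page_number] != current:
        current = prob_owner[current][page_number]
        hops += 1
    return current, hops
```

With hint forwarding of this kind, a request is passed along a chain of guesses instead of being broadcast to every processor, which is one way to read the claim about fewer broadcasts.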
I found the centralized manager algorithm especially weak, because it has a single point of failure.
If the manager dies or responds very slowly due to network failure, the system will not work properly. In a distributed
system, unlike in a uniprocessor environment, we have to consider this risk.