From: David V. Winkler (dwinkler_at_windows.microsoft.com)
Date: Wed Feb 18 2004 - 16:44:18 PST
Review: Memory Coherence in Shared Virtual Memory Systems
This paper presents a solution to the loosely coupled parallel
processor problem. The defining feature of a loosely coupled
multiprocessor is that its physical memory is distributed, so this is a
distributed memory solution for a network of machines.
The author states that 'a processor is allowed to update a piece of data
only while no other processor is updating or reading it.' While this
will satisfy the goal of memory coherence, all of the normal problems of
distributed system timing will still apply (e.g. the interleaving 1read
2read 1write 2write, where both processors read before either writes, so
a read-modify-write on either side loses an update).
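
To make the interleaving concrete, here is a minimal sketch. It assumes
(my reading, not something the paper spells out) that each processor is
doing a read followed by a write of value + 1 to the same location. The
names run, interleaved and serial are made up for illustration. Every
individual access sees the latest write, so the memory is coherent, but
the interleaved schedule still loses one increment.

```python
def run(schedule):
    """Replay a list of (processor, op) steps against one shared cell.

    A "read" copies the shared value into the processor's private copy;
    a "write" stores private copy + 1 back into the shared cell.
    """
    shared = 0
    private = {}
    for proc, op in schedule:
        if op == "read":
            private[proc] = shared
        elif op == "write":
            shared = private[proc] + 1
        else:
            raise ValueError(f"unknown op: {op}")
    return shared


# The interleaving from the review: 1read 2read 1write 2write.
interleaved = [(1, "read"), (2, "read"), (1, "write"), (2, "write")]

# The same four steps with processor 1 finishing before processor 2 starts.
serial = [(1, "read"), (1, "write"), (2, "read"), (2, "write")]

print(run(interleaved))  # one increment is lost
print(run(serial))       # both increments survive
```

Coherence alone does not fix this; the application still needs its own
synchronization around the read and the write.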
The strategies described use the page fault handler of the standard
virtual memory mechanisms to modularize their code. This seems to deal
with many of the problems of making the system testable and
implementable.
The paper delves into a number of non-optimal strategies. This seems
odd, but they then include these in their benchmarking, and the
bottlenecks really do show as explained. Along the way they have some
convincing proofs of correctness and a scattering of lemmas and
theorems.
The strategy that they propose in the end is a dynamic distributed
manager algorithm. Each node maintains a probable owner of each page,
not necessarily correct all the time. If a page fault occurs for a page
on another node, the probable owner is queried. If it actually isn't
the owner, it forwards the request on to the node that it thinks is the
owner and so on. The proof of acyclicity is important here, since it
is what guarantees that the forwarding chain ends at the real owner. A
further enhancement distributes the copy_set data (needed to invalidate
copies as part of a write fault) into a tree. This doesn't increase the
number of messages.
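
The probable-owner forwarding can be sketched roughly as follows. This is
a toy single-process model, not the paper's code: Node, make_chain,
find_owner and write_fault are hypothetical names, there are no real
messages or page contents, and the copy_set is left out entirely. The
step where every node on the path repoints its hint at the requester is
my simplification of how hints get refreshed on a write fault.

```python
class Node:
    """One machine: the pages it owns plus a probable-owner hint per page."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.owned = set()
        self.prob_owner = {}


def make_chain(count, page):
    """Build nodes 0..count-1 where node i hints at i+1 and the last node owns page."""
    nodes = [Node(i) for i in range(count)]
    for i, node in enumerate(nodes):
        node.prob_owner[page] = min(i + 1, count - 1)
    nodes[-1].owned.add(page)
    return nodes


def find_owner(nodes, requester, page):
    """Follow hints from requester; return (owner, nodes visited before the owner).

    Terminates only if the hints contain no cycle among non-owners,
    which is what the acyclicity proof is for.
    """
    path = []
    current = requester
    while page not in nodes[current].owned:
        path.append(current)
        current = nodes[current].prob_owner[page]
    return current, path


def write_fault(nodes, requester, page):
    """Move ownership of page to requester; return the number of forwarding hops."""
    owner, path = find_owner(nodes, requester, page)
    if owner == requester:
        return 0  # already the owner, so no fault would occur
    nodes[owner].owned.discard(page)
    nodes[requester].owned.add(page)
    # Assumed hint update: everyone on the path now points at the new owner.
    for node_id in path + [owner]:
        nodes[node_id].prob_owner[page] = requester
    return len(path)


nodes = make_chain(4, page=0)
first = write_fault(nodes, requester=0, page=0)   # walks 0 -> 1 -> 2 -> 3
second = write_fault(nodes, requester=2, page=0)  # hint already points at node 0
print(first, second)
```

The second fault is cheaper than the first because the stale hints were
repaired along the way, which is the appeal of not keeping a fixed
manager per page.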
The benchmark numbers are cool.