From: Joanna Muench (joannam_at_spro.net)
Date: Tue Feb 24 2004 - 21:54:51 PST
In Fox et al. (1997) the authors present an architecture for building
cluster-based scalable network services along with examples of two
implementations. Their design is centered on three core requirements:
scalability, availability, and cost-effectiveness. Most of the discussion
is focused on the first two.
This paper is far too detailed to make a summary worthwhile (without
merely paraphrasing the abstract). However, it contains some interesting
concepts worth noting. The first is splitting the network services
into two groups: one that must always be fully consistent, such as the
user profile database, and one where stale data is better than late
data, such as search query results. To handle the first, the authors
propose using ACID semantics (atomicity, consistency, isolation,
durability), while the second uses BASE (basically available, soft
state, eventual consistency), which should really be called BASSEC. The
complexity of high-demand services has carried us far from Dijkstra's
desire to prove correctness.
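
To make the "stale data is better than late data" idea concrete, here is a minimal sketch of a BASE-style lookup. It is not code from the paper: the class name SoftStateCache, the fetch callback, and the use of TimeoutError to signal a slow backend are all assumptions made for illustration.

```python
class SoftStateCache:
    """Hypothetical BASE-style cache: prefer a stale answer over a late one."""

    def __init__(self, fetch):
        self.fetch = fetch   # assumed callable; raises TimeoutError when too slow
        self.store = {}      # soft state: can be lost and rebuilt at any time

    def get(self, key):
        """Return (value, is_stale) for the given key."""
        try:
            value = self.fetch(key)
        except TimeoutError:
            if key in self.store:
                # The backend is late, so serve the last known answer.
                return self.store[key], True
            raise  # nothing cached, so there is no stale answer to give
        self.store[key] = value
        return value, False
```

An ACID component such as the user profile database would instead have to fail or block in the timeout case, since a stale answer there is not acceptable.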
Another interesting concept is the implementation of the load balancing
manager in TranSend. Not only do we get to see a use of lottery
scheduling, but also a further example where stale data is better than
none. In this case it is the manager stubs that cache the aggregated and
averaged load information from the centralized manager. While the
caching enables the system to continue after failure of the centralized
manager, it also introduced some peculiar oscillations in queue lengths
that required tuning of the centralized manager.
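
The combination of cached load hints and lottery scheduling can be sketched roughly as follows. This is an illustration under assumptions, not TranSend's implementation: the ManagerStub class, its method names, and the ticket formula 1 / (1 + queue length) are hypothetical.

```python
import random


class ManagerStub:
    """Hypothetical front-end stub that caches load reports from the manager."""

    def __init__(self, rng=None):
        self.cached_load = {}              # worker name -> last reported queue length
        self.rng = rng or random.Random()

    def update(self, report):
        """Replace the cache with a fresh report from the centralized manager."""
        self.cached_load = dict(report)

    def pick_worker(self):
        """Hold a lottery in which shorter cached queues get more tickets."""
        if not self.cached_load:
            raise RuntimeError("no load information cached yet")
        workers = list(self.cached_load)
        # Assumed ticket formula; the paper's exact weighting is not given here.
        tickets = [1.0 / (1 + self.cached_load[w]) for w in workers]
        return self.rng.choices(workers, weights=tickets, k=1)[0]
```

If the manager dies, update() simply stops being called and pick_worker() keeps running on the last report. That also hints at where oscillations could come from: every stub that holds the same stale report favors the same workers until the next update arrives.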
The final point that I find particularly interesting is the ability of
TranSend to tune itself by spawning new services (distillers in this
case). This design enables the system to balance intermittent bursts in
demand. In addition, the architecture includes an overflow pool for
absorbing more prolonged bursts. The machines in this pool don't need to
have any worker services running on them, but the manager can spawn
workers on these machines as needed. Very nice, unless it is your
desktop machine that is co-opted!
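
The spawning behaviour might look something like the sketch below. It assumes a simple rule (spawn when the average reported queue length exceeds a threshold) and prefers dedicated nodes over overflow nodes. The Manager class, spawn_threshold, and the node names are hypothetical and not taken from the paper.

```python
class Manager:
    """Hypothetical manager that spawns workers when queues grow too long."""

    def __init__(self, dedicated, overflow, spawn_threshold=5.0):
        self.free_dedicated = list(dedicated)   # idle nodes in the main pool
        self.free_overflow = list(overflow)     # idle nodes in the overflow pool
        self.workers = {}                       # node -> last reported queue length
        self.spawn_threshold = spawn_threshold  # assumed tuning knob

    def report(self, node, queue_len):
        """Record a load report from a running worker."""
        self.workers[node] = queue_len

    def spawn(self):
        """Start a worker on a free node, using the overflow pool only as a last resort."""
        if self.free_dedicated:
            node = self.free_dedicated.pop(0)
        elif self.free_overflow:
            node = self.free_overflow.pop(0)
        else:
            return None                         # no capacity left anywhere
        self.workers[node] = 0
        return node

    def rebalance(self):
        """Spawn one worker if none exist or the average queue is over the threshold."""
        if not self.workers:
            return self.spawn()
        average = sum(self.workers.values()) / len(self.workers)
        return self.spawn() if average > self.spawn_threshold else None
```

A real manager would also need to reap workers once a burst has passed, which is presumably how the co-opted desktop gets handed back.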
Overall an excellent paper covering a wide range of issues raised by the
need for large-scale network services.