From: Brian Milnes (brianmilnes_at_qwest.net)
Date: Wed Feb 25 2004 - 16:02:03 PST
Cluster-Based Scalable Network Services - Fox, Gribble, et al.
The authors propose using clusters of workstations to provide incrementally
scalable, highly available, low-cost Internet services. They provide a layer
of software that allows the management, transformation, aggregation, caching
and clustering of Internet services.
Clusters are highly available and can be built inexpensively from commodity
building blocks, but they suffer from administration and shared-state
problems, and many services must be broken down to a smaller scale and
reconstructed. These services can be built with less than ACID semantics,
called BASE: basically available, soft state and eventual consistency. They
split their clusters into management, front ends, cache and workers. They
provide a centralized load balancer which passes out hints to the front ends.
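To make the hint idea concrete, here is a minimal sketch of a front end that keeps the manager's load hints as soft state: a hint is overwritten whenever a fresh one arrives, and it expires if it is not refreshed. The class and method names, the 5 second TTL and the "shortest reported queue" rule are all assumptions for illustration, not details taken from the paper.

```python
import time
from dataclasses import dataclass, field


@dataclass
class LoadHint:
    queue_length: int      # load the manager last reported for this worker
    received_at: float     # when the front end received the hint


@dataclass
class FrontEndHintCache:
    """Soft-state cache of load hints; every name here is hypothetical."""
    ttl_seconds: float = 5.0
    hints: dict = field(default_factory=dict)

    def on_hint(self, worker, queue_length, now=None):
        # Overwrite whatever was known before; hints are never persisted.
        now = time.monotonic() if now is None else now
        self.hints[worker] = LoadHint(queue_length, now)

    def live_workers(self, now=None):
        # Drop hints that were not refreshed in time, so a crashed worker
        # simply ages out of the cache.
        now = time.monotonic() if now is None else now
        self.hints = {w: h for w, h in self.hints.items()
                      if now - h.received_at <= self.ttl_seconds}
        return dict(self.hints)

    def pick_worker(self, now=None):
        # Choose the worker with the shortest reported queue, or None.
        live = self.live_workers(now)
        if not live:
            return None
        return min(live, key=lambda w: live[w].queue_length)
```

Because the cache can always be rebuilt from the next round of hints, losing it costs only a short period of poorer balancing, which is the kind of trade BASE allows.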
They use process-peer fault tolerance, in which peers restart a failed
process. A layer is provided that hides load balancing and fault tolerance
from worker applications. The load balancing manager is quite complicated,
using lottery scheduling and overflow management. They cached web data with
Harvest and fixed its thundering herd startup problems.
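Lottery scheduling for worker selection can be sketched roughly as follows. Each worker holds a number of tickets, one ticket is drawn at random, and its holder gets the request. The ticket formula (fewer tickets for longer queues) and the function names are assumptions made for this sketch; the paper's actual manager is more involved.

```python
import random


def tickets_for(queue_length, max_tickets=100):
    # Fewer tickets for longer queues; every worker keeps at least one
    # ticket so it can still be chosen once its queue drains.
    return max(1, max_tickets // (1 + queue_length))


def lottery_pick(queue_lengths, rng=random):
    """Pick a worker with probability proportional to its ticket count."""
    if not queue_lengths:
        raise ValueError("no workers to choose from")
    workers = list(queue_lengths)
    tickets = [tickets_for(queue_lengths[w]) for w in workers]
    winning = rng.randrange(sum(tickets))   # draw one ticket
    for worker, count in zip(workers, tickets):
        if winning < count:
            return worker
        winning -= count
    raise AssertionError("unreachable: the winning ticket is always held")
```

The point of drawing at random, rather than always taking the least loaded worker, is that several front ends acting on the same stale load report do not all send their requests to the same machine.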
They then review the HotBot architecture. It's easy enough to move a
datacenter without this level of reliability; I've done it three times
live. Their self-tuning algorithm is nice but a bit complicated. I simply
stopped service above a queue length at the search engine and load balanced
by random weights multiplied by a running average service time. Their
dynamic addition of workstations on load spikes is very nice. When the
Internet went down infrequently at a POP in NJ, I would see huge spikes and
had to hustle quite a few times to manually configure our spider boxes into
service. This is certainly a much better way to build these services than I
could provide at Lycos due to the monolithic nature of our search engine,
but frankly it seems a bit overly complicated.
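The simpler scheme described above (stop service above a queue length, and balance by random weights multiplied by a running average service time) might look something like the sketch below. This is only one reading of that description: the class, the threshold of 50, the exponentially weighted average and the lowest-score-wins rule are all assumptions, not a record of the Lycos code.

```python
import random


class SearchBackend:
    """One search server as seen by the balancer (hypothetical names)."""

    def __init__(self, name, max_queue=50, alpha=0.2):
        self.name = name
        self.max_queue = max_queue      # refuse work above this queue length
        self.alpha = alpha              # smoothing factor for the average
        self.queue_length = 0
        self.avg_service_time = 1.0     # arbitrary starting estimate

    def record(self, service_time):
        # Exponentially weighted running average of service time.
        self.avg_service_time += self.alpha * (service_time - self.avg_service_time)

    def accepting(self):
        return self.queue_length <= self.max_queue


def choose_backend(backends, rng=random):
    # Score = random weight * running average service time; lowest wins.
    # Slow servers draw high scores more often, so they get less traffic,
    # while the randomness keeps requests from piling onto one server.
    candidates = [b for b in backends if b.accepting()]
    if not candidates:
        return None     # shed load: stop service rather than queue more
    return min(candidates, key=lambda b: rng.random() * b.avg_service_time)
```

With one backend averaging 0.1 s and another 1.0 s, the faster one wins about 95% of the draws under this rule, and a backend over its queue limit receives nothing until it drains.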