From: Greg Green (ggreen_at_cs.washington.edu)
Date: Tue Feb 24 2004 - 21:52:04 PST
This paper argues that some internet services are best implemented by
a commodity computer cluster to provide scaling, low-cost administration,
and high-availability. They start with the premise that a lot of
internet services don't require ACID semantics, but instead can use
BASE semantics. BASE stands for basically available, soft state,
eventual consistency. A 3 layer architecture is proposed. The lowest
latery is called SNS, for scalable network support. This provides
scaling, load-balancing, and overflow management, front-end
availablity, system logging and monitoring. Front-ends are contacted
by clients on the internet. The frontends distribute the load based on
hints given by a manager process that receives information from the
servers and distributes the load using lottery scheduling. The layer
also provides a graphical monitor showing the clusters state.
The next layer is called TACC, or transformation, aggregation,
caching, and customization. This takes the stream of data coming to
and from the workers and transforms, caches, or changes it based on
user preferences. The user preferences are stored in a database that
uses ACID semantics. These services can be turned on or off, replaced,
while the system is running without a significantly detrimental
effect. The final layer is the servic, this is the actual service
requested by the clients, ie web-servers, or other services that can
use BASE semantics. These services do not need to worry about the
underlying layers and can concentrate on providing the service.
Two examples are covered, TranSend which is a frontend to
the modem pool at Berkeley, and HotBot, which is a search engine. The
TACC layer on TranSend can compress images as specified by the user to
speed up data transfer over their phone lines. The manager of the SNS
layer watches the various elements and starts new elements if the load
requires. HotBot benefited from the automatic management of nodes in
the cluster and the BASE semantics to provide a high level of service.
There was a large section on analysis of network traffic and
load-balancing in the TACC layer for TranSend. Internet traffic was
shown to be "bursty" on all time scales, and is best buffered with
overflow services to give high throughput without excessive capacity.
I found the discussion of BASE vs ACID semantics was helpful. I had
always thought that ACID was required for "real" internet
services. The discussion of how each layer operated was also
illuminating. The trio of services watching each other, restarting it
if a failure was detected was novel. I would like to know how Google
handles these things. I read once that there were ~6000 nodes in
operation. Obviously they have all of the problems described in the
paper in spades, but seem to handle loads quite gracefully.
The layering concept was interesting. We learned earlier in the course
that layering is problematic and difficult to achieve in practice, and
maybe not even desirable. In this casee it looks quite good to
me.
-- Greg Green
This archive was generated by hypermail 2.1.6 : Tue Feb 24 2004 - 21:52:09 PST