From: Gail Rahn (gail_at_screaminggeek.com)
Date: Wed Feb 25 2004 - 07:58:37 PST
Review of "Cluster-Based Scalable Network Services" by Fox et al
This paper describes requirements for a scalable network service
application. The authors argue that the best hardware architecture is a
load-balanced cluster of commodity PCs (rather than SMPs) acting as
component workers of the service. The authors also introduce a software
architecture that provides the infrastructure for clustered, scalable
Internet services and market it as an off-the-shelf solution, leaving only
the service content programmable by the solution's consumer.
Reliable network services must meet the following three challenges:
* Scalability: the service must be able to be incrementally and
linearly support increased load by adding new hardware.
* Availability: the service must be available 24x7 despite hardware
or software failures.
* Cost-effectiveness: THe service must be economical to administer
and expand.
Commondity-PC based clusters seem to meet this challenge because the work of
Internet services is mostly parallel and fast (a few CPU-seconds). These
clusters are highly independent, each machine has its own hardware and
software, and easy to incrementally upgrade and etend. Further, commondity
PCs are quick to obtain and configure. Commodity-PC clusters can be
challenges for administration. Each component PC can be too weak to support
and entire service, so often the service must be divided into component
worker processes and spread across the cluster. Also, shared state is
diffcult to effectively pass between nodes of a cluster.
The authors refer to a strong test of transactional network services (ACID -
atomocity, consistency, isolation, durability) and suggest that while
appropriate for some services like e-xcommerce and user profiles, it is too
strong for most cluster-based network services. Instead, they instroduce
BASE (basically available, soft state, eventual consistency) as a metric for
high availability of data. BASE allows clusters to handle partial failures
with less complexity and cost that ACID services.
A systems architecture and programming model is proposed that separates the
content of network services from their systems implementation. Te
architecture has six components:
* Front Ends: the entry-point for service consumers, front-ends
shepherd requests through the rest of the system.
* Worker Pool: caches and service-specific modules that implement
the actual service.
* Customization Database: stores user profiles.
* Manager: balances load across the worker pool and adjusts the pool
based on the actual service load.
* Graphical Monitor: tracks the system's behavior for systems
management purposes.
* System-Area Network: High-bandwidth internal network used to
remove internal latency as system bottleneck.
Keys to the succesful implementation of a network service are:
scalability, centralized load balancing in the Manager, incremental growth,
the ability to sustain load bursts, soft state (cached state that can be
recalculated if necessary) and a narrow, specific interface to the worker
pool. A programming model is offered for Internet services, TACC
(Transformation, aggregation, caching and customization). A large number of
interesting services can be implemented at the TACC and service layers. The
network service layer need not be modified unless specific low-level
performance metrics must be met.
The rest of the paper looks at two services (TransSend and Hotbot)
that are built using this systems and programming model. The authors present
detailes measurements that support the effectiveness of this type of
architecture.
-------------
Gail Rahn
grahn_at_cs.washington.edu
This archive was generated by hypermail 2.1.6 : Wed Feb 25 2004 - 07:58:45 PST