From: Sellakumaran Kanagarathnam (sellak_at_windows.microsoft.com)
Date: Wed Feb 25 2004 - 13:45:45 PST
This paper identifies three basic challenges for a scalable network
service and presents a layered architecture for cluster-based scalable
network services, along with a service-programming model. The authors
discuss two real network services, HotBot and TranSend, and provide
detailed measurements for TranSend.
The three fundamental challenges in network services are:
a) Scalability: as load increases, the service should scale up (by
adding hardware incrementally) to keep providing the same level of
service
b) Availability: the service should be available 24x7
c) Cost effectiveness: the service should be economical to administer
and expand
Clusters of workstations can meet these requirements. The authors
point out that for most Internet services (with exceptions such as
transactions and billing), high availability is more important than
consistency. They introduce the term BASE: basically available, soft
state, eventual consistency. The paper focuses on services that
primarily manipulate BASE data. One of the key ideas of the
cluster-based scalable service architecture is to divide the system
into components. The authors propose the following components as part
of their SNS (scalable network service):
a) Front ends - the interface to the SNS for the outside world (e.g., an
HTTP server)
b) Worker pool - caches and service-specific modules that implement the
actual service
c) Customization database - stores user profiles that allow mass
customization of request processing
d) Manager - balances load across workers and spawns additional
workers as needed
e) Graphical monitor - for system management and health monitoring
f) System area network (SAN)
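
To make the manager's role in the list above concrete, here is a minimal
sketch in Python. It is not the paper's mechanism: the least-loaded
dispatch policy, the queue-length load metric, and the names Worker,
Manager, and spawn_threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    queue_length: int = 0  # outstanding requests; assumed load metric


@dataclass
class Manager:
    # Hypothetical knob: average queue length at which a worker is added.
    spawn_threshold: int
    workers: list = field(default_factory=list)

    def spawn(self) -> Worker:
        """Add one more worker to the pool."""
        worker = Worker(name=f"worker-{len(self.workers)}")
        self.workers.append(worker)
        return worker

    def dispatch(self) -> Worker:
        """Send a request to the least-loaded worker, growing the pool
        first if the average load has reached the threshold."""
        if not self.workers:
            self.spawn()
        average = sum(w.queue_length for w in self.workers) / len(self.workers)
        if average >= self.spawn_threshold:
            self.spawn()
        target = min(self.workers, key=lambda w: w.queue_length)
        target.queue_length += 1
        return target


manager = Manager(spawn_threshold=2)
assigned = [manager.dispatch().name for _ in range(5)]
print(assigned)
```

In this toy run no request ever completes, so the pool only grows; a
fuller version would also decrement queue_length when a worker finishes
and retire idle workers.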
The above components are grouped into three layers: SNS, TACC
(transformation, aggregation, caching, and customization), and Service.
The SNS layer provides scalability, load balancing, fault tolerance, and
high availability; it comprises the front ends, manager, SAN, and
monitor. TACC is a programming model for Internet services.
The authors then discuss a service implementation by describing
TranSend, a scalable Web distillation proxy, and compare it with HotBot.
It was a hard paper to read, but the layered, reusable architecture
addresses many of the top issues, such as self-tuning, load balancing,
and health monitoring. It also introduces a new data semantics, BASE.
The material on the monitor and self-tuning was interesting. It was not
clear how an actual client is load balanced to a front end; that
appears to happen outside the system. On the fault tolerance front, I
could not understand why requests sent to failed workers do not hurt
performance or availability.