Review: Cluster-based Scalable Network Services

From: Raz Mathias (razvanma_at_exchange.microsoft.com)
Date: Wed Feb 25 2004 - 17:38:33 PST

  • Next message: Justin Voskuhl: "Review for "Cluster-based Scalable Network Services""

    Today's paper introduced us to the concept of clustering. Unlike
    transaction processing systems, clusters do not guarantee the ACID
    properties. Some of these strong properties are compromised for the
    goal of increasing scalability, increasing uptime, and reducing costs.

     

    The paper takes a layering approach to developing cluster architectures.
    The lowest layer is called the Scalable Network Service (SNS) layer.
    This layer supports load balancing, overflow management, front-end
    availability, fault tolerance, and monitoring and logging. The middle
    layer implements Transformation, Aggregation, Caching, and Customization
    (TACC). Finally, at the highest layer we have the applications. Note
    that when implementing at this layer, we do not have to think about how
    scalability and fault tolerance are handled as they are abstracted from
    below.

     

    The paper defines a cute-sounding alternative to the ACID properties
    dubbed BASE, which stands for Basically Available, Soft state,
    Eventually consistent. The lack of guarantees in this system makes it
    viable under specific circumstances. These include times when the data
    is read-only, and no session information (i.e. state) needs to be
    maintained across calls to the worker processes. Search engines and
    proxy servers are ideally suited to this type of model. Introducing
    state to this system would be detrimental, as nodes would have prolonged
    dependencies on each other and the state could not easily be transformed
    into "soft state" (e.g. the state needs to be stored somewhere). Soft
    state is a new concept introduced by this paper; it can be considered to
    mean calculated state, or state that need not be stored. The fact that
    storage is not necessary for this state allows us to avoid the costs
    involved with replicating it and backing up data. Soft state trades off
    some latency in the case of failure for simplified management and better
    scalability (because there is no shared resource such as a database that
    stores state upon which processes idly block).

     

    The bottom layer in the system provides the fabric for the system's
    scalability. It replicates components for both fault tolerance and
    scalability. It manages load across machines and assigns tasks to an
    emergency worker pool in instances of high demand. An interesting facet
    of this layer is the fact that it depends on peer fault-tolerance. When
    a node is corrupted, it depends on the other nodes in the system to
    detect the failure (using a timeout or ping mechanism) and restart the
    faulting process. Worker stubs (on the worker nodes) and manager stubs
    (on the front-ends) allow the application programmer to essentially
    ignore the requirements of the high scalability; they merely worry about
    their own single-instance performance (in terms of latency). This
    abstraction allows us to build Internet-scalable applications from
    off-the-shelf components.

     

    It is interesting that this paper defines a scalable network service as
    requiring incremental scalability, overflow growth, and 24x7
    availability. I prefer the definition of a scalable system presented in
    the Porcupine paper, as it includes the critical property of management
    in the definition. The trend toward less state (and soft state) in each
    component is also interesting (and new?). This limits us in the kind of
    applications we can create and we may trade off a bit of performance in
    the case of failures (in smaller networks where there is no contention
    on the shared or hard state) but it does seem to scale better which is
    the goal for supporting applications that scale to Internet-sized
    populations of clients.

     


  • Next message: Justin Voskuhl: "Review for "Cluster-based Scalable Network Services""

    This archive was generated by hypermail 2.1.6 : Wed Feb 25 2004 - 17:38:37 PST