Review of "Cluster-Based Scalable Network Services"

From: Jeff Duzak (jduzak_at_exchange.microsoft.com)
Date: Wed Feb 25 2004 - 11:10:25 PST

  • Next message: Honghai Liu: "Review on "Cluster-Based Scalable Network Services""

    This paper describes a reusable architecture for cluster-based services.
    In particular, the architecture is useful for services which are
    extremely parallel and do not require strict coherence. The
    architecture trades strict semantics for availability, performance, and
    scalability. The primary example that is given, TranSend, is a web
    proxy which reduces the size of returned web pages primarily by reducing
    picture quality.
     
    Clusters in general have a number of useful characteristics. First, a
    cluster is much cheaper than a single machine that could be capable of
    doing the same work. A cluster can be composed of commodity machines
    purchased from any vendor. Second, a cluster naturally lends itself to
    reliability. If a single machine in the cluster goes down, there is no
    reason why the entire cluster must go down. Third, a cluster lends
    itself to easy upgrade. When demand on the cluster grows, new machines
    can be added to the cluster.
     
    By abandoning strict ACID semantics, the authors increase the
    parallelism of the system. For instance, load balancing decisions are
    made based on information that is not always up to date. Load balancing
    information is gathered periodically by a manager component from all the
    worker components, and then broadcast to the manager stubs on the front
    end components. Each manager stub then uses this information to decide
    where to forward a request. Hence, the load balancing information is
    not coherent. Load may change, and load balancing decisions may be made
    with old data. However, the information is good enough to achieve the
    desired load balancing. Further, the system allows manager stubs to
    work in parallel, not having to contact a central manager for each
    request. However, ACID semantics are still maintained for parts of the
    system that need it, such as user profiles.
     
    The authors then describe the TranSend system which was built using this
    architecture. The TranSend system demonstrates that the architecture
    does achieve the goals of scalability and availability, and also shows
    that the architecture is easily reusable. Beyond the implementation of
    the base SNS (scalable network service) components, the work to build
    the TranSend system included building the worker nodes and creating the
    front end. The worker nodes were picture and HTML distillers built
    mostly from existing code. The front end consisted of the logic for
    passing data to workers and returning it to the client. The TranSend
    system was then tested, and found to scale linearly all the way up to
    the maximum load supported by the equipment available to the authors.
     
    A second system, HotBot, was described. The discussion of this system
    seemed a bit out of place, as this system did not use the architecture
    described by the paper. Further, the HotBot system did not use dynamic
    load balancing, a key feature of the SNS architecture. It seems that
    the HotBot system was described primarily to further the argument that
    BASE semantics, and clustering in general, allow for good scalability
    and availability.
     
    The main idea of the paper, that is, trading strict semantics for better
    scalability and availability, seems a very good idea for some systems.
    Another interesting theme of the paper was that approximate answers are
    sometimes good enough, and the related idea that a system need not try
    to be correct in all cases. An example of that idea in this paper was
    the Bay Area Culture page, in which incorrect data may be reported, but
    it is assumed that the user can just ignore that data.


  • Next message: Honghai Liu: "Review on "Cluster-Based Scalable Network Services""

    This archive was generated by hypermail 2.1.6 : Wed Feb 25 2004 - 11:10:31 PST