Jim Shearer's review of Cluster-Based Scalable Network Services

From: shearerje_at_comcast.net
Date: Wed Feb 25 2004 - 13:00:17 PST

  • Next message: Sellakumaran Kanagarathnam: "Review: Cluster-based Scalable Network Services"

    Cluster-Based Scalable Network Services describes an extensible processing architecture that exploits the characteristics of the Internet service environment: large numbers of independent clients performing relatively small tasks, where the tasks themselves fall into a limited number of service-specific “types.” This is quite different from the kinds of environments that I am used to programming for, and I found some of the philosophical tradeoffs enlightening.

    Parallelism: Since the tasks are small and independent, the question is not how to break a task across multiple processors, but rather how to allocate a task to a processor. The heart of the proposed system is a System Area Network (SAN) and a manager singleton that together map “front ends” (task receivers) to “workers” like a giant switch. SAN-facing stubs on both the front ends and the workers provide the interfaces the SAN needs to perform the switching. Because tasks fall into a limited number of types, only a limited number of stub flavors need to be created.
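
    The "giant switch" idea can be illustrated with a minimal sketch. This is not the paper's implementation: the `Manager` class, its method names, and the least-outstanding-tasks policy are all assumptions chosen only to show how a typed task might be mapped to one of several registered workers.

```python
from collections import defaultdict


class Manager:
    """Hypothetical manager that maps a task type to a registered worker."""

    def __init__(self):
        # task type -> {worker name: number of outstanding tasks}
        self._workers = defaultdict(dict)

    def register(self, task_type, worker):
        """Record that `worker` can handle tasks of `task_type`."""
        self._workers[task_type][worker] = 0

    def assign(self, task_type):
        """Pick the worker of this type with the fewest outstanding tasks."""
        pool = self._workers.get(task_type)
        if not pool:
            raise LookupError(f"no worker registered for {task_type!r}")
        worker = min(pool, key=pool.get)  # ties go to the earliest registered
        pool[worker] += 1
        return worker

    def complete(self, task_type, worker):
        """Called when a worker finishes a task, freeing one slot."""
        self._workers[task_type][worker] -= 1


manager = Manager()
manager.register("image", "w1")
manager.register("image", "w2")
first = manager.assign("image")
second = manager.assign("image")
manager.complete("image", "w1")
third = manager.assign("image")
```

    Since every worker of a given type is interchangeable in this sketch, the front end only needs to know the task type, not which machine will run it.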

    Overflow Pool: I tend to think in terms of dedicated machines. The paper introduced the idea that machines not primarily intended for use in the service cluster can be commandeered to handle burst loads. It cited the example of using office automation machines from other parts of the building for this purpose.
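
    A rough sketch of the overflow idea follows, under the assumption (mine, not the paper's) that each node can absorb a fixed number of pending tasks. The function name `nodes_for_load` and the `per_node_capacity` parameter are hypothetical; the point is only that overflow machines are recruited after the dedicated ones are exhausted.

```python
def nodes_for_load(pending, dedicated, overflow, per_node_capacity):
    """Return the nodes to use for `pending` tasks.

    Dedicated nodes are used first; overflow nodes are recruited only
    when the dedicated ones cannot absorb the load.
    """
    needed = -(-pending // per_node_capacity)  # ceiling division
    active = list(dedicated[:needed])
    shortfall = needed - len(active)
    if shortfall > 0:
        # Burst load: borrow machines that normally do other work.
        active.extend(overflow[:shortfall])
    return active


dedicated = ["d1", "d2"]
overflow = ["office1", "office2"]
quiet = nodes_for_load(3, dedicated, overflow, per_node_capacity=4)
burst = nodes_for_load(10, dedicated, overflow, per_node_capacity=4)
```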

    Data Integrity: As discussed in class, one normally expects a system to display sequential consistency. Related to that, one normally designs to avoid operating on stale data or generating approximate answers. The paper identifies this expectation as Atomicity, Consistency, Isolation and Durability (ACID) and proposes that an alternative expectation, Basically Available, Soft state, Eventual consistency (BASE), is sufficient for satisfying user requests for the target services.

    Fault Tolerance: I have worked on numerous fault-tolerant systems, including one that was Byzantine fault tolerant at the instruction level (right down to the voted clock line), but I have never worked on a system that was as temporally relaxed as this one. I’m not disparaging it; I just don’t understand how it works. I don’t understand what constitutes “soft state” and would like to know more about how it is refreshed by the peers. I would also like to know how, when several peers detect that a module has failed, the system prevents each of them from restarting it and leaving multiple instances running.
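
    For reference, "soft state" in the general networking sense usually means state that is never explicitly deleted but simply expires unless periodically refreshed. The sketch below illustrates that general notion only; it is an assumption about how such a table could work, not a description of the paper's mechanism, and the `SoftStateTable` class, its `ttl` parameter, and the explicit `now` timestamps are all hypothetical.

```python
class SoftStateTable:
    """Hypothetical table of components kept alive only by periodic beacons."""

    def __init__(self, ttl):
        self.ttl = ttl        # how long an entry survives without a refresh
        self._last_seen = {}  # component name -> time of last beacon

    def beacon(self, component, now):
        """Record (or refresh) a component's entry at time `now`."""
        self._last_seen[component] = now

    def live(self, now):
        """Drop entries whose last beacon is older than ttl; return the rest."""
        self._last_seen = {
            name: seen
            for name, seen in self._last_seen.items()
            if now - seen <= self.ttl
        }
        return sorted(self._last_seen)


table = SoftStateTable(ttl=5)
table.beacon("w1", now=0)
table.beacon("w2", now=3)
at_4 = table.live(now=4)   # both entries are still fresh
at_7 = table.live(now=7)   # w1 has not beaconed for 7 time units
table.beacon("w1", now=8)  # a restarted w1 simply announces itself again
at_8 = table.live(now=8)
```

    In this sketch a crashed component needs no cleanup: its entry lapses on its own, and a restarted component rebuilds the entry just by beaconing. The sketch does not address how peers would avoid duplicate restarts.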



    This archive was generated by hypermail 2.1.6 : Wed Feb 25 2004 - 13:00:23 PST