Review: Manageability, Availability and Performance in Porcupine: A Highly Scalable, Cluster-based Mail Service

From: David V. Winkler (dwinkler_at_windows.microsoft.com)
Date: Wed Feb 25 2004 - 09:48:45 PST

    Porcupine is a mail system. Its exciting features are replication and dynamic load balancing. A user's mailbox is split into fragments. These fragments are scattered across the cluster, and each fragment is replicated onto several physical nodes. This means that even with one or more nodes removed from the system, the mailbox continues to be available. Even when some fragments are unavailable, the user is not completely locked out of the mail system.

    This seems robust, and it seems like it could be made very efficient.
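
    The fragment-and-replica layout described above can be sketched as follows. This is only an illustration of the idea, not Porcupine's implementation: the class names, node names, and replica placement are all made up for the example.

```python
# Hypothetical sketch: names and replica placement are illustrative,
# not taken from Porcupine itself.
from dataclasses import dataclass, field


@dataclass
class Fragment:
    messages: list   # mail messages held in this fragment
    replicas: set    # names of the nodes holding a copy of the fragment


@dataclass
class Mailbox:
    user: str
    fragments: list = field(default_factory=list)

    def readable_messages(self, live_nodes):
        """Return every message with at least one replica on a live node."""
        out = []
        for frag in self.fragments:
            if frag.replicas & live_nodes:  # any surviving replica will do
                out.extend(frag.messages)
        return out


box = Mailbox("alice", [
    Fragment(["m1", "m2"], {"n1", "n2"}),
    Fragment(["m3"], {"n2", "n3"}),
    Fragment(["m4"], {"n4"}),
])
```

    With node n2 removed, every fragment that had a second replica is still readable. With only n3 alive, the user still sees part of the mailbox rather than nothing.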

    The article discusses soft state and hard state (mentioned but not really discussed in the Fox paper). Soft state is information that can be recalculated from hard state. Hard state is information that cannot be recalculated and must therefore be replicated. Hard state includes the actual mail messages for a user. Soft state includes "the list of nodes containing mail for a particular user". Soft state is not replicated in Porcupine; because it can always be recalculated from hard state, this leads to greater consistency.
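
    To make the distinction concrete, here is a minimal sketch of recalculating one piece of soft state (the list of nodes containing mail for each user) from hard state. The data layout and the function name are assumptions made for the example, not the paper's actual structures.

```python
# Hypothetical sketch: the data layout below is an assumption for illustration.
def rebuild_user_map(hard_state):
    """Derive soft state (user -> nodes holding that user's mail) from hard state.

    hard_state maps node name -> {user: [messages stored on that node]}.
    """
    user_map = {}
    for node, mailboxes in hard_state.items():
        for user, messages in mailboxes.items():
            if messages:  # only count nodes that actually hold mail
                user_map.setdefault(user, set()).add(node)
    return user_map


hard_state = {
    "n1": {"alice": ["m1"], "bob": ["m7"]},
    "n2": {"alice": ["m2"]},
    "n3": {"bob": []},
}
soft_state = rebuild_user_map(hard_state)
```

    If the soft state is lost, calling rebuild_user_map again on the surviving hard state reproduces it, which is why only the hard state needs replicas.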

    The article makes a convincing argument that this type of structure is necessary for a service that writes hard state frequently (unlike the Fox paper, which was mostly about stateless reading).

    I find the discussion of the cost of replication believable. The implementation seems exciting, but it doesn't seem to have been proven with a huge population (Grapevine looked good until distribution lists).

