Porcupine review

From: Praveen Rao (psrao_at_windows.microsoft.com)
Date: Sun Mar 07 2004 - 13:21:03 PST

  • Next message: Richard Jackson: "Review: Saito, et al. Manageability, Availability and Performance in Porcupine: A Highly Scalable Internet Mail Service"

    Porcupine - scalable mail service

    System requirements:
    1) Manageability: system must self-configure and self-heal with respect to
    failure and recovery
    2) Availability: Despite failure, system should provide good service to
    all users at all times
    3) Performance: single node perf should be competitive with other
    single-node systems

    Key design principle: functional homogeneity

    1) Every transaction is dynamically scheduled to ensure that work is
    uniformly distributed.
    2) System auto-configures as nodes are added/removed
    3) System and user data are automatically replicated across a number of
    nodes to ensure availability

    Hard-state (Can't be lost)
    Soft-state (can be recreated)

    Hard-state is replicated but soft-state is usually on a single node

    Key data-structures:
    *Mailbox fragment - unit of replication, mailbox fragments are
    replicated, they are hard state
    *Mail map - list of nodes containing mailbox fragments for a given user
    *User profile database - username, password etc.
    *User profile soft state - subset of user-profile database, replicated
    on each node
    *User map - maps the hash value of a username to the node that is currently
    responsible for that user's profile soft state and mail map. This is soft
    state and is replicated across all nodes
    *Cluster membership list - set of nodes that are part of Porcupine,
    maintained on all nodes
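
    A minimal sketch of how a user map lookup might work, assuming a
    fixed-size bucket table and a simple round-robin assignment of buckets to
    live nodes. The names (user_map_bucket, build_user_map, manager_for), the
    bucket count, and the use of MD5 are illustrative assumptions, not taken
    from the paper; the real system assigns buckets through its membership
    protocol.

```python
import hashlib


def user_map_bucket(username, num_buckets=256):
    """Hash a username to a bucket index (hash function is an assumption)."""
    digest = hashlib.md5(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def build_user_map(members, num_buckets=256):
    """Assign every bucket to a live node, round-robin (a simplification).

    Because the table is derived only from the membership list, it is soft
    state: any node holding the same list can rebuild the same table.
    """
    nodes = sorted(members)
    return [nodes[i % len(nodes)] for i in range(num_buckets)]


def manager_for(username, user_map):
    """Return the node currently responsible for this user's soft state."""
    return user_map[user_map_bucket(username, len(user_map))]


members = {"node-a", "node-b", "node-c"}
user_map = build_user_map(members)
print(manager_for("alice", user_map))

# After a failure, the table is rebuilt from the new membership list.
user_map_after_failure = build_user_map(members - {"node-b"})
print(manager_for("alice", user_map_after_failure))
```

    Since every node holds the same table, any node that receives a request
    can find the user's manager locally, without asking a central directory.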

    Authors argue the following advantages and tradeoffs:
    *Any node can store mail for any user, and a user's mail is replicated on
    an arbitrary set of nodes; even if a user manager goes down, another node
    will take over as the user manager for that user
    *Extremely fault-tolerant
    *Auto-configurable, due to homogeneity
    The trade-off is that a user's mail is distributed across multiple nodes
    (spread), which complicates storage and retrieval
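
    The spread trade-off can be illustrated with a small sketch of choosing
    where to store an incoming message. It assumes a hypothetical per-node
    load table and a spread limit: nodes already in the user's mail map are
    preferred, and other nodes are considered only while the user's mail sits
    on fewer than `spread` nodes. The function name, the load metric, and the
    tie-breaking are assumptions for illustration, not the authors' exact
    policy.

```python
def choose_delivery_node(mail_map, load, spread=2):
    """Pick a node to store a new message for one user.

    mail_map: nodes that already hold a mailbox fragment for the user
    load:     hypothetical {node: pending work} table for live nodes
    spread:   soft limit on how many nodes may hold this user's mail
    """
    # Start from live nodes that already hold the user's mail.
    candidates = {n for n in mail_map if n in load}

    # Top up with the least-loaded other nodes until the limit is reached.
    others = sorted((n for n in load if n not in candidates),
                    key=lambda n: (load[n], n))
    for node in others:
        if len(candidates) >= spread:
            break
        candidates.add(node)

    # Among the candidates, deliver to the least-loaded node.
    return min(candidates, key=lambda n: (load[n], n))


load = {"a": 5, "b": 1, "c": 3}
print(choose_delivery_node(["a"], load, spread=1))
print(choose_delivery_node(["a"], load, spread=2))
```

    A small spread keeps a user's mail on few nodes, so retrieval touches
    fewer machines; a larger spread gives the scheduler more freedom to
    balance load.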

    This approach seems to me to be a fine-grained distribution, which
    increases both performance and manageability at the cost of system
    complexity.


    This archive was generated by hypermail 2.1.6 : Sun Mar 07 2004 - 13:21:11 PST