
From: Gail Rahn (grahn_at_cs.washington.edu)
Date: Mon Feb 23 2004 - 18:45:06 PST

    Review of "Scale and Performance in a Distributed File System" by Howard et
    al

    This paper describes the performance considerations for Andrew, a
    distributed file system used on the campus of CMU. Andrew will eventually
    span up to 10,000 nodes. The paper describes the prototype implementation
    of Andrew and the design improvements motivated by observations of its
    scaling and performance.

    Andrew is composed of Venus and Vice components. All system components run
    the Berkeley 4.2BSD version of UNIX. Vice is (collectively) a set of
    trusted servers that provide a homogeneous, location-transparent file
    namespace to client workstations. Venus is a user-level client process
    running on workstations that access Andrew. When a remote file is opened,
    Venus caches the entire file received from Vice. When the file is closed,
    Venus stores the modified version of the file back on the Vice server. All
    file modifications are performed on the cached copy on the local machine
    and do not involve Vice.
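
    To make the open/close flow concrete, here is a minimal sketch in Python
    (entirely my own rendering; the class and method names are hypothetical,
    not Andrew's actual interfaces):

        # Hypothetical sketch of Venus's whole-file caching.
        class Venus:
            def __init__(self, vice):
                self.vice = vice          # handle to a Vice file server
                self.cache = {}           # pathname -> whole-file contents

            def open(self, path):
                # On open, fetch the entire file from Vice if not cached.
                if path not in self.cache:
                    self.cache[path] = self.vice.fetch(path)
                # All subsequent reads and writes hit this local copy,
                # without involving Vice.
                return self.cache[path]

            def close(self, path, contents, modified):
                # On close, ship the whole modified file back to Vice.
                self.cache[path] = contents
                if modified:
                    self.vice.store(path, contents)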

    Each Vice server had a directory structure that mirrored the structure of
    the Vice files stored on it. File status information was kept in shadow
    ".admin" directories. For files kept on other servers, the directory
    entries were present but ended in "stub directories" that identified the
    server containing the file. Venus named files to Vice by their full
    pathname (the directories plus the terminating filename); the prototype
    implementation had no lower-level file identifiers. Files locally cached
    by Venus were always suspect: on each open, Venus checked with Vice that
    its cached copy, identified by timestamp, was indeed the latest version
    of the file.
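
    A rough sketch of that verify-on-open behavior, in the same hypothetical
    Python as above (of these names, only TestAuth comes from the paper, and
    it is discussed further below):

        from collections import namedtuple

        CacheEntry = namedtuple("CacheEntry", ["timestamp", "contents"])

        # The prototype treated every cached copy as suspect: each open
        # cost a round trip to Vice even when the cache was current.
        def open_with_validation(venus, path):
            entry = venus.cache.get(path)
            # TestAuth-style check: ask Vice whether this timestamp
            # still names the latest version of the file.
            if entry is not None and venus.vice.test_auth(path, entry.timestamp):
                return entry.contents          # cached copy confirmed current
            entry = venus.vice.fetch(path)     # missing or stale: re-fetch
            venus.cache[path] = entry
            return entry.contents

    Note that this check costs a server round trip on every open even when
    the cache is perfectly warm; that is exactly the traffic the callback
    scheme described later eliminates.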

    The creators of Andrew attempted to design for large scale. Evaluation of
    the prototype design resulted in the following observations:
            * The Vice architecture of one dedicated process per client used
    too many system resources.
            * Embedding the locations of remote servers in the Vice directory
    tree made home-directory migration painful.
            * Storage quotas could not be enforced for users.
            * Servers performed adequately only when limited to fewer than 20
    concurrent client connections.

    The prototype implementation also revealed that the vast majority of
    client/server interaction consisted of validating cached files or getting
    file status. Actual fetch and store operations accounted for only 6% of
    client/server traffic; of that, 2% was storing a new version of a file.
    The most frequently used operation, TestAuth, which checks whether a
    cached file is still current, was shown to degrade significantly in
    performance under a load of more than 5 concurrent clients. Further,
    context switching between the server processes dedicated to clients was
    costly in terms of CPU usage.

    The next version of Andrew changed in several ways. Venus, the client
    process, additionally caches directory contents and symbolic links. File
    status information is cached in virtual memory, to ensure that file-status
    requests are serviced rapidly. In place of Venus's constant
    cache-verification checks on each open, Venus and Vice communicate via
    callbacks: Venus registers a callback with the Vice server for a file or
    directory, and Vice notifies the registered Venus clients before modifying
    the file or directory. This dramatically reduces client/server traffic.
    File name resolution was improved to use a fixed-length unique identifier
    (a fid) rather than the full pathname; Vice servers are unaware of
    pathnames and work only with fids. Vice was rewritten to use a pool of
    lightweight server processes rather than one dedicated process per client.
    And inodes were reintroduced as the storage access mechanism, rather than
    pathnames. These changes significantly improved the performance of the
    file system under substantial load.
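
    Here is a minimal sketch of the callback idea and of a fid, again in
    hypothetical Python (the callback bookkeeping is my own invention; the
    three fid fields are the ones named in the paper, 32 bits each for 96
    bits total):

        from collections import namedtuple

        # A fid is a fixed-length identifier; Vice never sees pathnames.
        Fid = namedtuple("Fid", ["volume_number", "vnode_number", "uniquifier"])

        class ViceServer:
            def __init__(self):
                self.callbacks = {}   # fid -> set of Venus clients holding one
                self.files = {}       # fid -> whole-file contents

            def fetch(self, client, fid):
                # Handing out a callback lets the client trust its cached
                # copy without re-validating it on every open.
                self.callbacks.setdefault(fid, set()).add(client)
                return self.files[fid]

            def store(self, writer, fid, contents):
                # Break callbacks: notify every other registered client so
                # it discards its now-stale cached copy.
                for client in self.callbacks.get(fid, set()) - {writer}:
                    client.invalidate(fid)
                self.callbacks[fid] = {writer}
                self.files[fid] = contents

    The design trade-off is that the server now keeps per-client state (the
    callback table), in exchange for eliminating the validate-on-open round
    trips that dominated the prototype's traffic.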

    -- Gail.

    -------------
    Gail Rahn
    grahn_at_cs.washington.edu

