Review of "The Design and Implementation of a Log-Structured File System"

From: Song Xue (songxue_at_microsoft.com)
Date: Mon Feb 23 2004 - 17:52:51 PST

  • Next message: Cem Paya: "Review: Log-structured file system"

    The paper titled "The Design and Implementation of a Log-Structured File
    System" presents an implementation of a log-structured file system whose
    write performance is an order of magnitude better than that of a
    traditional file system. Log-structured file systems are based on the
    assumption that disk traffic is dominated by writes, because files are
    cached in main memory and increasing memory sizes will make those caches
    more and more effective at satisfying read requests. A log-structured
    file system writes all new information to disk in a sequential structure
    called the log. This approach increases write performance dramatically by
    eliminating almost all seeks. In addition, the sequential nature of the
    log permits much faster crash recovery, as only the most recent portion
    of the log needs to be examined.

    Traditional file systems have two general problems: disks tend to be
    accessed in a way that causes too many small accesses, and file writes
    tend to be synchronous. A log-structured file system improves write
    performance by buffering a sequence of file system changes in the file
    cache and then writing all the changes to disk sequentially, in a single
    asynchronous disk write operation.
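    The buffering idea can be sketched in a few lines of Python (a
    hypothetical illustration, not the paper's code; the `LogFS` class,
    its block size, and the in-memory "disk" are all invented here):

    ```python
    import io

    class LogFS:
        """Toy model: buffer dirty blocks, flush them with one sequential append."""

        def __init__(self, disk: io.BytesIO, block_size: int = 16):
            self.disk = disk          # stands in for the raw device
            self.block_size = block_size
            self.buffer = []          # dirty blocks awaiting the next flush

        def write_block(self, data: bytes) -> None:
            # Writes only land in the buffer; no disk I/O, no seek yet.
            assert len(data) <= self.block_size
            self.buffer.append(data.ljust(self.block_size, b"\x00"))

        def flush(self) -> int:
            # One sequential write appends every buffered block to the log.
            start = self.disk.seek(0, io.SEEK_END)
            self.disk.write(b"".join(self.buffer))
            self.buffer.clear()
            return start              # log offset where this batch begins

    fs = LogFS(io.BytesIO())
    fs.write_block(b"inode update")
    fs.write_block(b"data block")
    offset = fs.flush()               # a single append, regardless of file count
    ```

    The point of the sketch is that metadata and data for many files end up
    in one contiguous write, which is where the seek elimination comes from.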

    From the implementation standpoint, the most difficult issue is managing
    the free space on disk so that large extents of free space are always
    available for writing new data. A combination of threading and copying
    is used: the disk is divided into large fixed-size extents called
    segments. Any given segment is always written sequentially from its
    beginning to its end, and all live data must be copied out of a segment
    before the segment can be rewritten. The log is threaded on a
    segment-by-segment basis. A data structure called the segment summary
    block identifies the file and block that each piece of data in the
    segment belongs to, which makes segment cleaning possible.
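    The liveness check at the heart of cleaning can be sketched as follows
    (a simplified illustration with invented names; the real summary block
    and inode map are on-disk structures, modeled here as Python lists and
    dicts):

    ```python
    def live_blocks(seg_id, blocks, summary, block_map):
        """Return the blocks that must be copied out before the segment is reused.

        seg_id:    identifier of the segment being cleaned
        blocks:    the data blocks stored in this segment, by slot
        summary:   per-slot (file_id, block_no) pairs -- the segment summary block
        block_map: current (file_id, block_no) -> (segment_id, slot) mapping
        """
        live = []
        for slot, (file_id, block_no) in enumerate(summary):
            # A block is live iff the file system still maps this logical
            # block to this exact segment slot; otherwise it was overwritten
            # elsewhere in the log and is garbage.
            if block_map.get((file_id, block_no)) == (seg_id, slot):
                live.append(blocks[slot])
        return live

    summary = [(1, 0), (1, 1), (2, 0)]
    blocks = [b"a", b"b", b"c"]
    # File 1's block 1 was later rewritten into segment 9, so slot 1 is dead.
    block_map = {(1, 0): (7, 0), (1, 1): (9, 4), (2, 0): (7, 2)}
    survivors = live_blocks(7, blocks, summary, block_map)
    ```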

    As to crash recovery, both checkpoints and roll-forward are used to
    speed up the recovery process and minimize the loss of information. A
    checkpoint is a position in the log at which all of the file system
    structures are consistent and complete. The simple approach after a
    crash is to discard all changes made after the newest checkpoint;
    however, this may result in too many changes being lost. A roll-forward
    algorithm is therefore used to recover as many of the changes that have
    definitely reached disk as possible, while maintaining the consistency
    of the file system.
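    A minimal model of this recovery scheme, assuming log records carry a
    checksum so torn writes can be detected (the record layout and the
    key=value payloads are invented for illustration):

    ```python
    import zlib

    def recover(checkpoint_state, checkpoint_pos, log):
        """Restore from the last checkpoint, then roll forward through the log.

        log: list of (pos, payload, crc) records in write order.
        """
        state = dict(checkpoint_state)      # consistent as of the checkpoint
        for pos, payload, crc in log:
            if pos <= checkpoint_pos:
                continue                    # already reflected in the checkpoint
            if zlib.crc32(payload) != crc:
                break                       # record never fully reached disk: stop
            key, _, value = payload.partition(b"=")
            state[key] = value              # replay the completed change
        return state

    log = [
        (1, b"a=1", zlib.crc32(b"a=1")),
        (2, b"b=2", zlib.crc32(b"b=2")),
        (3, b"c=3", 0xBAD),                 # torn write from the crash itself
    ]
    state = recover({b"a": b"0"}, 1, log)   # checkpoint already covers record 1
    ```

    Record 2 is replayed because it is intact and newer than the checkpoint,
    while record 3 is discarded, which mirrors the trade-off described above:
    recover everything that demonstrably hit the disk, and nothing more.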


    This archive was generated by hypermail 2.1.6 : Mon Feb 23 2004 - 17:52:58 PST