The Design and Implementation of a Log-Structured File System

From: Chuck Reeves (creeves_at_windows.microsoft.com)
Date: Mon Feb 23 2004 - 15:08:11 PST

  • Next message: Nathan Dire: "Review of Log-Structure File System"

    The paper, "The Design and Implementation of a Log-Structured File
    System" was written in 1992 by Roseblum and Ousterhout at the University
    of California at Berkeley. It describes the design and implementation of
    a permanent log-structured file system named "Sprite". The approach
    anticipates future increases in the size of main memory caches and
    predicts that disk traffic will be dominated by writes. As a result, the
    design focuses on optimizing the write performance. Also, the systems
    focuses on the efficiency of small file accesses.

    Buffering of information yet to be written to disk is used to improve
    the efficiency of writing. It surprised me how willing the authors were
    to tolerate lost data on system crashes. The log structure divides the
    disk into fixed-size segments. These segments are the region of reuse.
    The segments are threaded together to identify consecutive blocks of a
    file.
    Segments are recovered through a process called cleaning. Segment
    specific utilization is tracked through the segment summary block
    residing on a part of each segment. The decisions made on how to
    reorganize the data on cleaned segments is called the cleaning policy
    and is (along with the writing of new data) the primary indicator of
    performance.

    The cleaning algorithm design aspires to reach a point where the
    segments are all nearly full or empty. Much of the document is dedicated
    to optimizing this behavior. Segment dimenstions include: % utilization
    and the last time modified. These statistics are eventually tracked in
    the segment usage table which allows the cleaner to identify the best
    candidates for cleaning quickly.

    I probably didn't understand the discussion on checkpoint regions and
    the roll-forward mechanism. But it seemed like it was describing a crash
    recovery mechanism that allowed unfinished write activity to be commited
    subsequent to the each crash.

    I thought the contrast between temporal and logical locality was
    paticularly insightful and would appreciate more discussion on this
    topic.

    Additionally, I'd like to understand how closely this behavior emulates
    that of databases.

    Chuck Reeves, creeves_at_microsoft.com
    Microsoft | Windows | Directory Services


  • Next message: Nathan Dire: "Review of Log-Structure File System"

    This archive was generated by hypermail 2.1.6 : Mon Feb 23 2004 - 15:08:25 PST