Log-structured file system review

From: David Coleman (dcoleman_at_cs.washington.edu)
Date: Sun Feb 22 2004 - 20:32:20 PST

  • Next message: Steve Arnold: "Review: Rosenblum & Ousterhout, LFS"

    The log-structured file system paper was an interesting look at a
    non-traditional file system. I liked how it drew on the experience of
    databases and other systems that use logs, but applied that experience in
    a unique fashion. My own file system experience is with optical media.
    Rewritable optical media write speeds are significantly slower than read
    speeds (generally at least a factor of two) and seek times are horrible,
    so any approach that speeds up writing would be significant.
    Interestingly, write-once media is essentially a log-based system in
    which segments are never reused.

    Segment cleaning seems like it would incur significant overhead from
    rewriting i-nodes whenever data blocks are relocated. One solution is to
    add a level of indirection for all data block locations (or any
    referenced address). The UDF (Universal Disk Format) file system does
    this via a Virtual Allocation Table (VAT). When a data block is moved,
    only the VAT entry is updated. This approach does incur a random-access
    penalty, since an additional table lookup is needed to find the physical
    block address, which would hurt read performance. The ratio of cleaning
    to reading would dictate the usefulness of this strategy.
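
    A minimal sketch of the kind of indirection I mean (the table layout and
    names here are mine, not UDF's actual on-disc format): metadata
    references virtual block numbers, only the VAT maps them to physical
    blocks, so the cleaner patches a single table entry when it relocates a
    block instead of rewriting the i-node.

        #include <stdio.h>
        #include <stdint.h>

        #define VAT_SIZE 1024   /* number of virtual blocks tracked (illustrative) */

        /* Virtual Allocation Table: virtual block number -> physical block number. */
        static uint32_t vat[VAT_SIZE];

        /* File metadata stores only virtual block numbers; the VAT alone knows
         * where the data physically lives. */
        static uint32_t vat_lookup(uint32_t vblock)
        {
            return vat[vblock];
        }

        /* Segment cleaner relocates a block: one table update, no i-node rewrite. */
        static void vat_relocate(uint32_t vblock, uint32_t new_pblock)
        {
            vat[vblock] = new_pblock;
        }

        int main(void)
        {
            vat[42] = 1000;                   /* i-node refers to virtual block 42 */
            printf("before: %u\n", (unsigned)vat_lookup(42));
            vat_relocate(42, 2048);           /* cleaner moves the data */
            printf("after:  %u\n", (unsigned)vat_lookup(42));
            return 0;
        }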

    I think a modification that might speed up segment cleaning would be to
    have the following states recorded in the segment summary/i-node map:
    - deleted / truncated to zero length – data dead, no need to check i-node
    - modified – unsure of data state, need to check i-node
    - unchanged since i-node/data written – data live, no need to check i-node
    Currently the system only maintains the deleted/truncated-to-zero-length
    state.
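
    A rough sketch of how a cleaner might consult such states (the state
    names and functions are my own invention, not LFS data structures):

        #include <stdbool.h>
        #include <stdio.h>

        /* Proposed per-block states in the segment summary / i-node map. */
        enum block_state {
            BLOCK_DEAD,       /* deleted or truncated to zero: skip, no i-node check */
            BLOCK_MODIFIED,   /* file changed since block written: must check i-node */
            BLOCK_UNCHANGED   /* i-node and data unchanged: live, copy without check */
        };

        /* Cleaner must read the i-node only in the uncertain case. */
        static bool cleaner_needs_inode_check(enum block_state s)
        {
            return s == BLOCK_MODIFIED;
        }

        /* Block is certainly live and can be copied forward directly. */
        static bool cleaner_live_without_check(enum block_state s)
        {
            return s == BLOCK_UNCHANGED;
        }

        int main(void)
        {
            enum block_state states[] = { BLOCK_DEAD, BLOCK_MODIFIED, BLOCK_UNCHANGED };
            for (int i = 0; i < 3; i++)
                printf("state %d: needs i-node check = %d, live without check = %d\n",
                       i, cleaner_needs_inode_check(states[i]),
                       cleaner_live_without_check(states[i]));
            return 0;
        }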

    Recovering from system crashes is an important design consideration. I’m
    not sure log-based file systems need this any more than traditional file
    systems do, given that a common optimization in both is to use large
    write caches and to order writes for optimal performance. Checkpoints
    only address a single type of failure: a system crash that leaves the
    file system in an inconsistent state. The system described would be
    significantly less forgiving of disk hardware failures – even
    single-sector failures – and those are usually much less pleasant to deal
    with. One thing the paper doesn’t specifically address (or I missed it!)
    is how the system determines that the file system is in an inconsistent
    state. Is there some sort of integrity sector/block/descriptor indicating
    whether the file system is open or closed?
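
    What I have in mind is something like a mount-state descriptor; the
    sketch below is purely hypothetical and not taken from the paper:

        #include <stdbool.h>
        #include <stdio.h>

        /* Hypothetical on-disk state descriptor: written as OPEN at mount time
         * and rewritten as CLOSED after a clean unmount / final checkpoint. */
        enum fs_state { FS_CLOSED = 0, FS_OPEN = 1 };

        struct state_descriptor {
            enum fs_state state;
        };

        /* At mount: if the descriptor still says OPEN, the previous session did
         * not shut down cleanly, so roll forward from the last checkpoint. */
        static bool needs_recovery(const struct state_descriptor *sd)
        {
            return sd->state == FS_OPEN;
        }

        int main(void)
        {
            struct state_descriptor sd = { FS_OPEN };   /* as left by a crash */
            if (needs_recovery(&sd))
                printf("file system was not closed cleanly: run roll-forward\n");
            else
                printf("file system is consistent\n");
            return 0;
        }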

    Ironically, a general push today in optical media file systems, and an
    optimization some implementations have used for years, is to cluster
    meta-data together. This allows both significantly better read
    performance and better strategies for error recovery. Generally, when
    reading a directory, the first thing the operating system does is fetch
    the attributes of, and information about, every file in the directory,
    which means reading every i-node. Clustering meta-data such as i-nodes
    together therefore gives very good read performance when opening
    directories. As any user will tell you, waiting 30 seconds to see the
    contents of a directory in Windows Explorer is maddening.
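
    The access pattern I mean is roughly what a file browser does on every
    directory open; the POSIX calls below are just for illustration:

        #include <dirent.h>
        #include <stdio.h>
        #include <sys/stat.h>

        /* Listing a directory the way a file browser does: one readdir pass plus
         * a stat of every entry. Each stat touches an i-node, so scattered
         * i-nodes mean roughly one seek per file, while clustered i-nodes mean a
         * handful of sequential reads. */
        int list_directory(const char *path)
        {
            DIR *dir = opendir(path);
            if (dir == NULL)
                return -1;

            struct dirent *entry;
            while ((entry = readdir(dir)) != NULL) {
                char full[4096];
                struct stat st;
                snprintf(full, sizeof full, "%s/%s", path, entry->d_name);
                if (stat(full, &st) == 0)           /* reads the entry's i-node */
                    printf("%10lld  %s\n", (long long)st.st_size, entry->d_name);
            }
            closedir(dir);
            return 0;
        }

        int main(void)
        {
            return list_directory(".");
        }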


    This archive was generated by hypermail 2.1.6 : Sun Feb 22 2004 - 20:32:26 PST