The Design and Implementation of a Log-Structured File System

From: Greg Green (ggreen_at_cs.washington.edu)
Date: Mon Feb 23 2004 - 14:46:46 PST

    This paper describes a file system designed to optimize writes to
    disk, rather than general read/write performance. The rationale is
    that main memories have become large enough that most reads can be
    satisfied from data cached in memory, so the traffic that actually
    reaches the disk is dominated by writes. If writes are made as fast
    as possible, the bandwidth between the disk and the CPU is used as
    effectively as it can be.

    The term log-structured means that new blocks are written to the
    disk sequentially into a large free area called a segment. The data
    blocks are written first, then the inodes, then the directories.
    This differs from conventional file systems, which keep inodes at
    fixed positions on the disk; as a consequence, a separate data
    structure (the inode map) is needed to map an inode number to the
    block where that inode currently lives. A critical factor for
    performance is that there must be sufficient contiguous free space
    on the disk, or no performance gain will result. Files that are
    modified frequently cause fragmentation, because their superseded
    blocks are invalidated at scattered spots across the disk.
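    To make the inode-map idea concrete, here is a minimal C sketch (the
    names and layout are my own, not the Sprite LFS code): a flat table
    indexed by inode number that records where the current copy of each
    inode sits in the log, and that is updated whenever an inode is
    rewritten at the log's tail.

        /* Hypothetical sketch of an inode map, not the paper's code:
         * because inodes are written into the log rather than at fixed
         * disk locations, the file system needs a table mapping inode
         * numbers to the current disk address of each inode. */
        #include <stdint.h>
        #include <stddef.h>

        typedef uint64_t disk_addr_t;      /* block address in the log */

        struct inode_map {
            disk_addr_t *entries;          /* entries[i] = where inode i lives now */
            size_t       n_inodes;
        };

        /* Look up the on-disk location of an inode; returns 0 on success. */
        int imap_lookup(const struct inode_map *map,
                        uint32_t ino, disk_addr_t *out)
        {
            if (ino >= map->n_inodes)
                return -1;                 /* no such inode */
            *out = map->entries[ino];
            return 0;
        }

        /* When an inode is rewritten at the tail of the log, the map is
         * simply pointed at its new address; the old copy becomes dead
         * space to be reclaimed later by the segment cleaner. */
        void imap_update(struct inode_map *map,
                         uint32_t ino, disk_addr_t new_addr)
        {
            if (ino < map->n_inodes)
                map->entries[ino] = new_addr;
        }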

    The design criteria and experiments that went into finding the best
    garbage collector ("cleaner") of segments were described next. This
    turned out to be a critical factor in the performance of the system.
    The best policy was found to weigh the benefit of cleaning a segment
    against its cost: segments are chosen by the ratio of the free space
    that cleaning would generate, multiplied by the age of the data, to
    the cost of cleaning the segment, which grows with how much of the
    segment is still live.
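    As I read it, the policy boils down to a one-line calculation. Here
    is a small C sketch (the formula is the paper's cost-benefit ratio;
    the function name and types are mine):

        /* Cost-benefit priority for picking a segment to clean:
         *   benefit/cost = ((1 - u) * age) / (1 + u)
         * where u is the fraction of the segment that is still live.
         * Reading the whole segment costs 1 and writing its live data
         * back out costs u, hence the 1 + u in the denominator, while
         * cleaning frees 1 - u of the segment's space. Segments with
         * the highest value are cleaned first. */
        double clean_priority(double utilization, double age_of_data)
        {
            return ((1.0 - utilization) * age_of_data) / (1.0 + utilization);
        }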

    The paper ended with a discussion of checkpointing and crash
    recovery. Because of the log structure of the file system, after a
    crash the system can roll forward from the last checkpoint, scanning
    the log segments written since then and reconstructing the disk
    structures as far as possible.
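    Here is a small, self-contained C sketch of the roll-forward idea as
    I understand it (the segment-summary and inode-map structures are my
    own simplifications for illustration, not the paper's on-disk
    format):

        /* Roll-forward recovery sketch: starting from the last
         * checkpoint, walk the log segments written after it and
         * reapply the inode-location updates recorded in each segment's
         * summary. Segments at or before the checkpoint are skipped,
         * since the checkpoint already reflects them. */
        #include <stdint.h>
        #include <stddef.h>
        #include <stdio.h>

        struct imap_update { uint32_t ino; uint64_t new_addr; };

        struct segment_summary {
            uint64_t timestamp;            /* when the segment was written */
            size_t   n_updates;
            struct imap_update updates[4]; /* tiny for the example */
        };

        static void roll_forward(uint64_t imap[], size_t n_inodes,
                                 const struct segment_summary *log,
                                 size_t n_segs, uint64_t checkpoint_time)
        {
            for (size_t s = 0; s < n_segs; s++) {
                if (log[s].timestamp <= checkpoint_time)
                    continue;              /* covered by the checkpoint */
                for (size_t i = 0; i < log[s].n_updates; i++) {
                    const struct imap_update *u = &log[s].updates[i];
                    if (u->ino < n_inodes)
                        imap[u->ino] = u->new_addr;
                }
            }
        }

        int main(void)
        {
            uint64_t imap[8] = {0};        /* inode number -> disk address */
            struct segment_summary log[2] = {
                { 100, 1, { { 3, 5000 } } },   /* before the checkpoint */
                { 200, 1, { { 3, 9000 } } },   /* after it: gets replayed */
            };
            roll_forward(imap, 8, log, 2, 150);
            printf("inode 3 now at block %llu\n",
                   (unsigned long long)imap[3]);
            return 0;
        }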

    I had heard the term log-structured before, but didn't really know
    what it meant. Most new file systems on Linux seem to have a log,
    but I understand that to mean just a metadata journal, so this
    appears to be quite different. One of the fundamental decisions was
    that losing data written after the last checkpoint is acceptable. I
    wonder whether that is really true; I can think of plenty of
    situations where it would be disastrous. On the other hand, I don't
    know exactly what a traditional file system does across a crash
    either, so maybe that risk is already taken into account.

    This area of file systems seems to me to need a lot of work. The
    current ones work well enough, but have a lot of issues. The fact
    that a traditional file system can use only 5-10% of the disk's raw
    write bandwidth is pretty bad. What is going to happen when machines
    have 1 TB disks? That day doesn't seem too far away.

    -- 
    Greg Green
    
