Review of "The Design and Implementation of a Log-Structured File System" (by Rosenblum & Ousterhout)

From: Prasanna Kumar Jayapal (prasak_at_winse.microsoft.com)
Date: Mon Feb 23 2004 - 17:26:31 PST

  • Next message: Gail Rahn: "Review of "...Log-Structured File System" by Rosenblum and Ousterhout"

    This paper ("The Design and Implementation of a Log-Structured File
    System") presents a file system (LFS) that writes all new data and
    metadata sequentially to an append-only log. The main advantages of
    LFS are better utilization of disk bandwidth and faster crash
    recovery.

     

    LFS buffers a batch of write requests in main memory and then writes
    all of the modified blocks to a contiguous area of the disk in a
    single large write, thus eliminating most of the seek delay incurred
    by update-in-place file systems. This sequential log contains both
    the file data and the metadata (i-nodes). An i-node map holds the
    current disk address of each i-node. The i-node map is compact enough
    for its active portions to be cached in main memory, so the location
    of a given i-node can usually be found without a costly disk access.
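
    The write path described above can be sketched in a few lines of
    Python. This is a toy model, not Sprite LFS code: the class name
    TinyLog, the record layout, and the use of a Python list as the
    "disk" are all assumptions made for illustration.

```python
class TinyLog:
    """Toy append-only log with an in-memory i-node map (illustrative only)."""

    def __init__(self):
        self.disk = []        # the log: a list of (kind, payload) records
        self.inode_map = {}   # i-node number -> log address of its latest i-node
        self.pending = []     # buffered writes: (inode_no, list of data blocks)

    def write(self, inode_no, blocks):
        """Buffer a whole-file write in memory; the log is not touched yet."""
        self.pending.append((inode_no, list(blocks)))

    def flush(self):
        """Append all buffered data blocks and i-nodes to the log in one pass."""
        for inode_no, blocks in self.pending:
            addrs = []
            for block in blocks:
                addrs.append(len(self.disk))      # address = position in the log
                self.disk.append(("data", block))
            inode_addr = len(self.disk)
            self.disk.append(("inode", addrs))    # the i-node follows its data
            self.inode_map[inode_no] = inode_addr # old i-node copy becomes dead
        self.pending.clear()

    def read(self, inode_no):
        """Find the i-node through the in-memory map, then follow its pointers."""
        _, addrs = self.disk[self.inode_map[inode_no]]
        return [self.disk[a][1] for a in addrs]


log = TinyLog()
log.write(1, ["a", "b"])
log.write(2, ["c"])
log.flush()               # one sequential pass writes both files
log.write(1, ["z"])       # overwriting file 1 appends; nothing is updated in place
log.flush()
```

    Note that overwriting file 1 leaves its old blocks and old i-node in
    the log as dead data, which is why a cleaner is needed.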

     

    The disk is partitioned into large, fixed-size segments. The log is
    always written into empty segments. To create free segments, the live
    data in fragmented segments is copied and compacted into other
    segments. Each segment carries a bookkeeping structure called the
    'segment summary block', which records the owning file and the block
    position for every block in the segment. With this information the
    cleaner can check whether a given block is still live, and it can
    update the i-node of a live block when that block is relocated to
    another segment.
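
    A minimal sketch of that cleaning step follows. The data layout is
    an assumption for illustration: segments and summaries are
    dictionaries keyed by segment number, each summary entry is an
    (i-node number, block index) pair, and inodes maps an i-node number
    to a list of (segment, offset) pointers. The function names is_live
    and clean_segment are hypothetical.

```python
def is_live(inodes, owner, addr):
    """A block is live if its owner's i-node still points at this address."""
    inode_no, block_idx = owner
    pointers = inodes.get(inode_no, [])   # a deleted file has no i-node
    return block_idx < len(pointers) and pointers[block_idx] == addr


def clean_segment(segments, summaries, inodes, src, dst):
    """Copy the live blocks of segment src to the end of dst, then free src."""
    moved = 0
    for off, (data, owner) in enumerate(zip(segments[src], summaries[src])):
        if is_live(inodes, owner, (src, off)):
            new_addr = (dst, len(segments[dst]))
            segments[dst].append(data)
            summaries[dst].append(owner)           # summary travels with the block
            inodes[owner[0]][owner[1]] = new_addr  # repoint the i-node
            moved += 1
    segments[src] = []      # the whole segment is now free for new log writes
    summaries[src] = []
    return moved


# Segment 0 holds one dead block ("old"): file 1, block 1 was rewritten
# into segment 1, so the i-node no longer points at (0, 1).
segments = {0: ["a0", "old", "b0"], 1: ["new"], 2: []}
summaries = {0: [(1, 0), (1, 1), (2, 0)], 1: [(1, 1)], 2: []}
inodes = {1: [(0, 0), (1, 0)], 2: [(0, 2)]}
moved = clean_segment(segments, summaries, inodes, src=0, dst=2)
```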

     

    Based on simulation results, the designers of LFS arrived at an
    efficient policy for selecting the segments to clean. It turned out
    that the segments with the highest benefit-to-cost ratio should be
    picked for cleaning, where the benefit is the free space reclaimed
    weighted by the age of the data and the cost is the work of reading
    the segment and rewriting its live blocks. This policy reduced the
    write cost by as much as 50% compared to the greedy policy (always
    picking the least-utilized segment), and also let LFS outperform
    UNIX FFS even at relatively high disk capacity utilization.
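
    The two policies can be compared with a short sketch. The formulas
    are the ones given in the paper: write cost 2/(1-u) for a cleaned
    segment with utilization u, and the ratio (1-u)*age/(1+u) for the
    cost-benefit policy. The function names, the dictionary layout, and
    the sample numbers below are assumptions for illustration.

```python
def write_cost(u):
    """Write cost for cleaning segments of utilization u, with 0 <= u < 1."""
    if not 0.0 <= u < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if u == 0.0:
        return 1.0             # an empty segment need not be read at all
    return 2.0 / (1.0 - u)     # read 1, write back u, gain 1 - u of new data


def benefit_to_cost(u, age):
    """Free space gained times data age, over the cost to read and rewrite."""
    return (1.0 - u) * age / (1.0 + u)


def pick_greedy(segs):
    """Greedy policy: clean the least-utilized segment."""
    return min(segs, key=lambda s: s["u"])["name"]


def pick_cost_benefit(segs):
    """Cost-benefit policy: clean the segment with the highest ratio."""
    return max(segs, key=lambda s: benefit_to_cost(s["u"], s["age"]))["name"]


# Hypothetical segments: "hot" data was modified recently, "cold" long ago.
segs = [
    {"name": "hot", "u": 0.60, "age": 1},
    {"name": "cold", "u": 0.75, "age": 100},
]
greedy_choice = pick_greedy(segs)
cost_benefit_choice = pick_cost_benefit(segs)
```

    With these numbers the greedy policy picks the hot segment because
    it is emptier, while the cost-benefit policy picks the cold one: its
    free space, once reclaimed, is expected to stay reclaimed for longer.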

     

    Another important aspect of the system is crash recovery, which is
    made possible by periodically writing checkpoints, each of which
    corresponds to a consistent state of the LFS. Two checkpoints are
    held on disk at any time. After a crash, LFS chooses the more recent
    of the two and incorporates the changes made since then using a
    roll-forward mechanism, which scans the last writes (located at the
    end of the log) and updates all relevant data structures (e.g. the
    utilization field in the segment usage table). Moreover, to ensure
    consistency between i-nodes and directory entries, every directory
    manipulation is recorded in a directory operation log that is
    written before the corresponding directory entry or i-node.
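
    The recovery procedure can be sketched as follows. This is a
    simplified model under stated assumptions: a checkpoint is a
    dictionary with a timestamp, the log position it covers, and a copy
    of the i-node map; log records are dictionaries; and only the i-node
    map is rebuilt. The name recover and all field names are
    hypothetical, and the directory operation log is not modeled.

```python
def recover(checkpoints, log):
    """Pick the newest checkpoint, then roll forward the log written after it."""
    valid = [cp for cp in checkpoints if cp is not None]
    newest = max(valid, key=lambda cp: cp["time"])
    inode_map = dict(newest["inode_map"])    # start from the consistent state
    for record in log[newest["log_end"]:]:   # only the tail of the log is scanned
        if record["kind"] == "inode":
            # A newer i-node in the log supersedes the checkpointed location.
            inode_map[record["inode_no"]] = record["addr"]
        # Data records are reachable only through an i-node, so they need
        # no separate handling in this sketch.
    return inode_map


checkpoints = [
    {"time": 10, "log_end": 2, "inode_map": {1: 0}},
    {"time": 20, "log_end": 3, "inode_map": {1: 2}},
]
log = [
    {"kind": "inode", "inode_no": 1, "addr": 0},
    {"kind": "data"},
    {"kind": "inode", "inode_no": 1, "addr": 2},
    {"kind": "data"},                              # written after checkpoint 2
    {"kind": "inode", "inode_no": 2, "addr": 4},   # new file, recovered by roll-forward
    {"kind": "data"},
]
recovered = recover(checkpoints, log)
```

    Keeping two checkpoints means that a crash during a checkpoint write
    still leaves the older one intact to fall back on.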

     

    The authors present a number of arguments supporting the two major
    decisions they made. The first is the log-based structure which, as
    the performance results show, gives much better write performance
    for small files (because many of them are written sequentially in a
    single operation) while matching or exceeding FFS for large writes
    and for most reads. The exception is a file that is read
    sequentially after being written randomly: the reads then require
    seeks, and LFS is slower. If files are read in the order in which
    they were written, LFS performs well.
    Their second major decision, concerning which segments should be
    cleaned, resulted from a simulation. It showed that the system
    should take into account not only the cost of cleaning but also how
    long the reclaimed space is expected to stay free. As a result,
    rarely modified (cold) segments are cleaned at a higher utilization
    than frequently modified (hot) segments, which rapidly re-accumulate
    free space on their own.

     

    Overall, I felt that the idea was new and interesting at the file
    system level, although I could see a number of resemblances to the
    logging used in database systems. The authors have successfully
    demonstrated the benefits of log-structured file systems.

     

     

