Review: Log-structured file system

From: Cem Paya (cemp_at_microsoft.com)
Date: Mon Feb 23 2004 - 17:52:45 PST

  • Next message: Cliff Schmidt: "Rosenblum and Ousterhout "Design and Implementation of a Log-Structured File System""

     

    Review: Log-structured file system

    CSE 551P, Cem Paya

     

    In this paper Rosenblum and Ousterhout describe a new approach to
    implementing a file system. Called LFS, for log-structured file
    system, its main distinction from a traditional file system is how it
    allocates disk blocks to files. The requirements for LFS originate
    with three sound observations: CPU speeds are subject to Moore's law,
    which the authors forecast will continue to hold in the short term (a
    safe bet when the paper appeared in 1992, and their prediction proved
    correct); memory sizes continue to expand, though not at the same rate
    as Moore's law (also true in the dozen years since publication); but
    disk speeds remain the bottleneck because of the physical movement
    involved. Disk I/O is constrained by how fast the disk spins and how
    quickly the disk head can be positioned over a particular track. Of
    these, the first is by far the easier to improve, and current drives
    push the 10K RPM envelope, but precisely positioning the read head
    over a track still incurs latency on the order of milliseconds.

    LFS is optimized for the scenario in which an application's access
    pattern is dominated by the cost of disk writes. The basic idea is to
    rely on a large amount of memory to buffer consecutive write
    operations into a single group and write them all at once into a
    linear log structure. When a file is deleted or partially overwritten,
    the changes are not written to the place on disk where the file's
    current blocks exist. Instead they are written at the next available
    position in the log. This is very different from traditional UNIX FFS,
    where inodes keep track of block allocations for files and subsequent
    write operations have to locate those blocks on disk. LFS still
    requires a map of inodes, since the inodes themselves are written into
    the log contiguously with the data blocks rather than at fixed
    positions. Reading a file involves chasing pointers to locate its
    individual pieces; this is similar to standard file systems, and
    indeed read performance is comparable, or slightly worse because of
    increased fragmentation.
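    The write path described above can be sketched in a few lines. This is
    a toy model of my own, not code from the paper: a hypothetical
    `ToyLFS` class with one block per file, an in-memory buffer flushed to
    the log tail in a single sequential append, and an inode map that
    always points at a file's newest block.

```python
# Toy sketch of the LFS write path (illustrative only, not the paper's code).
# Writes are buffered in memory and flushed to the log tail in one sequential
# append; an inode map records where each file's latest block lives.

class ToyLFS:
    def __init__(self):
        self.log = []        # the "disk": an append-only list of blocks
        self.buffer = []     # in-memory write buffer of (file_id, data)
        self.inode_map = {}  # file id -> position of its latest block

    def write(self, file_id, data):
        self.buffer.append((file_id, data))  # nothing hits the "disk" yet

    def flush(self):
        # One large sequential write: append the whole buffer to the log.
        # An overwrite just repoints the inode map; the old block stays
        # behind in the log as garbage for the cleaner to reclaim.
        for file_id, data in self.buffer:
            self.inode_map[file_id] = len(self.log)
            self.log.append(data)
        self.buffer = []

    def read(self, file_id):
        # Reads chase the inode map into the log, much as FFS chases inodes.
        return self.log[self.inode_map[file_id]]
```

    Overwriting a file and flushing again leaves the stale block in place,
    which is exactly the garbage that the cleaning process must reclaim.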

    Because overwrites can render sections of the log irrelevant, there is
    a process in place for freeing up disk space when the number of free
    segments drops too low. It employs two strategies. The first is very
    similar to the way a copying garbage collector operates: conceptually
    there are two heaps, only one of them in use; the collector identifies
    live objects, moves them to the second heap, and swaps the heaps when
    done. In LFS, files are "compacted" by copying their live blocks
    together. The second strategy is threading, which avoids the cost of
    copying files but leaves them fragmented, increasing the seek time
    needed to gather the individual pieces of one large file. The paper
    also defines a metric called "write cost" for measuring the efficiency
    of disk I/O relative to the raw bandwidth of the device. Under
    simulated loads, the prototype implementation, Sprite LFS,
    significantly outperforms the standard SunOS file system in
    micro-benchmarks measuring small-file operations. There was one
    questionable prediction about linear scaling of performance in the
    small-file benchmark (creating 1K files) with the number of
    processors. It is true that the standard file system keeps the disk
    busy while using only a small fraction of its available bandwidth,
    whereas Sprite LFS saturates the CPU at low disk utilization. On the
    other hand, coordinating I/O between processors would likely reduce
    bandwidth.
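    The write-cost arithmetic can be made concrete. As I read the paper,
    write cost is total disk traffic divided by bytes of new data written:
    cleaning a segment whose live fraction is u means reading the whole
    segment (1), rewriting the live data (u), and filling the reclaimed
    fraction (1 - u) with new data, so the cost is
    (1 + u + (1 - u)) / (1 - u) = 2 / (1 - u). A small sketch:

```python
# Write cost for a copying cleaner at segment utilization u, following the
# paper's definition: total disk traffic / bytes of new data written.

def write_cost(u):
    """Return the write cost at live fraction u, for 0 <= u < 1."""
    if u == 0.0:
        return 1.0  # empty segments can be reused without any cleaning I/O
    # read whole segment (1) + rewrite live data (u) + write new data (1 - u),
    # all per (1 - u) bytes of new data
    return 2.0 / (1.0 - u)

for u in (0.0, 0.5, 0.8, 0.9):
    print(u, write_cost(u))
```

    The metric makes the cleaner's sensitivity to utilization obvious: at
    u = 0.8 every byte of new data costs ten bytes of disk traffic, so
    cleaning mostly empty segments is far cheaper than cleaning full ones.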



    This archive was generated by hypermail 2.1.6 : Mon Feb 23 2004 - 17:53:01 PST