Review: The Design and Implementation of a Log-Structured File System

From: Sellakumaran (ksella_at_hotmail.com)
Date: Sun Feb 22 2004 - 19:52:40 PST

  • Next message: David V. Winkler: "Review: The Design and Implementation of a Log-Structured File System"

    This paper presents a new technique for disk storage management: the
    log-structured file system. It starts with the problems faced by current
    systems like UNIX, proposes a log-based solution, discusses the various
    components of that solution, and compares a prototype against Unix FFS. The
    paper focuses only on improving write performance and concludes that a
    log-structured file system can use disks an order of magnitude more
    efficiently than existing file systems.
    One of the big issues limiting the scalability of current computer systems
    is the disproportionate growth of processor capabilities and hard disk
    capabilities with respect to speed and performance. While processing power
    has grown dramatically, disk speeds have improved only slowly. So even
    though CPU speed is very high, disk access (read and write) has been the
    limiting factor preventing applications and computer systems from taking
    advantage of this power. Both disk reads and writes are issues. At the same
    time, main memory has also been growing dramatically. Given this, the
    log-structured file system presented in this paper focuses primarily on
    disk writes, leaving disk reads to be handled by the increased main memory
    (file caches).
    There were some interesting details provided while discussing the problems
    with current file systems. For example, in Unix FFS it takes at least five
    I/Os (each preceded by a seek) to create a new file. When writing small
    files in such a system, less than 5% of the disk's potential bandwidth is
    used for new data; the rest of the time is spent seeking. This is a big
    problem area, and tackling it could yield an order-of-magnitude performance
    improvement, which is what the log-structured file system described here
    (Sprite LFS) tries to do. The paper first presents the considerations and
    defines the problem area it tries to tackle: disk writes for small files.
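    The 5%-of-bandwidth figure is easy to sanity-check with back-of-envelope
    arithmetic. A minimal sketch in Python, assuming round numbers for seek
    time, transfer rate, and file size (none of these specific values come
    from the paper):

```python
# Back-of-envelope check of the small-file claim: with ~5 seek-preceded
# I/Os per file create, how much of the disk's bandwidth carries new
# data? The timing numbers below are illustrative assumptions, not
# figures from the paper.

SEEK_MS = 10.0            # assumed average seek + rotational delay per I/O
XFER_KB_PER_MS = 1.0      # assumed sequential transfer rate (~1 MB/s)
FILE_KB = 2.0             # assumed small-file size

ios_per_create = 5                        # inode, data, directory entries, etc.
seek_time_ms = ios_per_create * SEEK_MS   # time spent positioning the head
transfer_ms = FILE_KB / XFER_KB_PER_MS    # time spent moving actual data

utilization = transfer_ms / (seek_time_ms + transfer_ms)
print(f"bandwidth used for new data: {utilization:.1%}")
```

    With these assumed numbers the disk spends 50 ms positioning for every
    2 ms of data transfer, i.e. under 4% of the bandwidth carries data,
    consistent with the paper's "less than 5%" claim.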
    The fundamental idea of a log-structured file system is to improve write
    performance by buffering a sequence of file system changes in the file
    cache and then writing all the changes to disk sequentially in a single
    write operation. The data is stored permanently in the log; there are no
    other structures on disk. There are two key issues in such systems: a)
    locating and reading a file, and b) managing free space on disk (so that
    large extents of free space remain available). The other main
    consideration is the ability to restart the system after a crash.
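    The buffer-then-flush idea can be sketched in a few lines of Python; the
    class and method names here are hypothetical illustrations, not Sprite
    LFS's actual interfaces:

```python
# Minimal sketch of the core LFS idea: buffer dirty blocks in memory and
# flush them to disk as one large sequential write, instead of issuing
# one seek-preceded write per block. Names here are illustrative only.

class LogStructuredDisk:
    def __init__(self):
        self.log = []          # the on-disk log, append-only
        self.buffer = []       # in-memory file cache of pending changes
        self.writes = 0        # count of sequential write operations issued

    def update(self, block):
        """Record a file system change (data block, inode, etc.)."""
        self.buffer.append(block)

    def flush(self):
        """One sequential write pushes every buffered change at once."""
        if self.buffer:
            self.log.extend(self.buffer)
            self.buffer.clear()
            self.writes += 1

disk = LogStructuredDisk()
for b in ["data0", "inode0", "data1", "inode1", "dir"]:
    disk.update(b)
disk.flush()
print(disk.writes, len(disk.log))   # one write carried five changes
```

    Five separate changes reach the disk in a single write, which is where
    the seek savings for small files come from.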
    Sprite LFS has the following data structures: inode, inode map, indirect
    block, segment summary, segment usage table, superblock, checkpoint
    region, and directory change log. Sprite LFS uses a combination of
    threading and copying to manage free space. The disk is divided into large
    fixed-size extents called segments (each segment is made up of blocks),
    and a segment cannot be rewritten until its live data has been copied out.
    This process is called segment cleaning: read segments into memory,
    identify the live data, and write it to a smaller number of clean
    segments, leaving the original segments available for new writes. The
    segment summary block is used in this process. The next question is
    choosing the cleaning policies. This is where the authors introduce some
    interesting terms and factors such as age sort, write cost, and hot and
    cold access patterns. They run a simulation to identify these policies and
    conclude that hot and cold segments must be treated differently by the
    cleaner: free space in a cold segment is more valuable. This leads to
    their cost-benefit policy and to the data structure called the segment
    usage table.
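    The cost-benefit policy the paper arrives at ranks each segment by
    (1 - u) * age / (1 + u), where u is the fraction of the segment still
    live and age is the age of its youngest block. A small sketch with
    made-up segments shows why cold segments win:

```python
# Sketch of the paper's cost-benefit cleaning policy: clean the segment
# with the highest (free space generated * age) / cost, which the paper
# gives as (1 - u) * age / (1 + u). The segments below are hypothetical
# examples, not data from the paper.

def cost_benefit(utilization, age):
    """Higher score = better candidate for cleaning."""
    return (1.0 - utilization) * age / (1.0 + utilization)

# (utilization u, age of youngest block) for some hypothetical segments
segments = {"hot": (0.75, 1), "warm": (0.75, 20), "cold": (0.90, 100)}

best = max(segments, key=lambda s: cost_benefit(*segments[s]))
print(best)  # the cold segment wins despite having less free space
```

    The cold segment scores highest even though it has the least free
    space, because its free space is unlikely to be reclaimed by
    overwrites and is therefore worth collecting now.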
    Crash recovery is handled in two ways: checkpoints and roll-forward. The
    checkpoint frequency is another important decision, and a proper trade-off
    has to be made between the cost during normal operation and the speed of
    recovery.
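    The two-part recovery can be sketched as a replay loop: restore state
    from the last checkpoint, then roll forward through log records written
    after it. This is a toy model, not Sprite LFS's actual on-disk format:

```python
# Toy sketch of checkpoint + roll-forward recovery: a checkpoint saves
# the state and a position in the log; recovery replays only the log
# records appended after that position.

def recover(checkpoint, log):
    """checkpoint = (state dict, log position at checkpoint time)."""
    state, pos = checkpoint
    state = dict(state)            # start from the checkpointed state
    for key, value in log[pos:]:   # roll-forward: replay newer records
        state[key] = value
    return state

log = [("a", 1), ("b", 2), ("a", 3)]    # records appended over time
checkpoint = ({"a": 1, "b": 2}, 2)      # taken after the first two records
print(recover(checkpoint, log))         # {'a': 3, 'b': 2}
```

    Checkpointing more often shortens the replay at recovery time but adds
    overhead during normal operation, which is exactly the trade-off the
    paper discusses.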
    The benchmarks were run using small programs on SunOS and Sprite LFS.
    Small-file performance in Sprite LFS was very good compared to SunOS. The
    interesting point was that Sprite LFS kept the disk only 17% busy while
    saturating the CPU, so Sprite LFS will be able to take advantage of
    increased CPU power, whereas SunOS kept the disk 85% busy.
    The system borrows many ideas from previous work in storage management
    systems and database systems. Overall, I think the authors clearly stated
    their goals (writes, small files, better performance than Unix), and they
    described the system and explained the various results well enough to
    show that they indeed achieved their goal. It would be interesting to
    know how this system measures up on today's hardware.


    This archive was generated by hypermail 2.1.6 : Sun Feb 22 2004 - 19:52:55 PST