From: Song Xue (songxue_at_microsoft.com)
Date: Mon Feb 23 2004 - 17:52:51 PST
The paper titled "The Design and Implementation of a Log-Structured File
System" presents an implementation of a log-structured file system whose
write performance is an order of magnitude better than that of a
traditional file system. Log-structured file systems are based on the
assumption that disk traffic is dominated by writes: files are cached in
main memory, and increasing memory sizes will make these caches more and
more effective at satisfying read requests. A log-structured file system
writes all new information to disk in a sequential structure called the
log. This approach increases write performance dramatically by
eliminating almost all seeks. In addition, the sequential nature of the
log permits much faster crash recovery, since only the most recent
portion of the log needs to be examined.
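The core idea can be illustrated with a minimal sketch (the names `log`, `inode_map`, `write`, and `read` are my own, not the paper's): every write appends a record to a sequential log, and an index records where the latest version of each file lives, so reads still find current data while writes never seek.

```python
# Minimal sketch of the log-structured idea (hypothetical names):
# all writes append to a sequential log; an inode-map-like index
# records the log position of the newest version of each file.

log = []            # the sequential log: list of (file_id, data) records
inode_map = {}      # file_id -> log position of its most recent write

def write(file_id, data):
    # Appending is always sequential: no seek back to the file's old blocks.
    log.append((file_id, data))
    inode_map[file_id] = len(log) - 1   # newest version wins

def read(file_id):
    # Reads go through the index to the latest log entry.
    pos = inode_map[file_id]
    return log[pos][1]

write("a", b"v1")
write("b", b"x")
write("a", b"v2")   # overwrite: the old "v1" record becomes dead space
print(read("a"))    # -> b'v2'
```

Note that overwriting "a" leaves its old record behind as dead space in the log, which is exactly what creates the free-space management problem discussed below.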
Traditional file systems have two general problems: disks tend to be
accessed in a way that causes too many small transfers, and file writes
tend to be synchronous. A log-structured file system improves write
performance by buffering a sequence of file system changes in the file
cache and then writing all the changes to disk sequentially, in a single
asynchronous disk write operation.
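The buffering step above can be sketched as follows; the `WriteBuffer` class and its threshold are illustrative assumptions, not structures from the paper.

```python
# Sketch (assumed names) of asynchronous write buffering: many small
# file changes accumulate in memory and reach "disk" in one large
# sequential write, instead of one synchronous write each.

import io

class WriteBuffer:
    def __init__(self, disk, flush_threshold=4):
        self.disk = disk                  # an append-only file-like object
        self.pending = []                 # buffered changes, memory only
        self.flush_threshold = flush_threshold
        self.flushes = 0                  # count of actual disk writes

    def write(self, data):
        self.pending.append(data)         # cheap: no disk I/O yet
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One sequential disk write covers all buffered changes.
        self.disk.write(b"".join(self.pending))
        self.pending.clear()
        self.flushes += 1

disk = io.BytesIO()
buf = WriteBuffer(disk)
for chunk in (b"aa", b"bb", b"cc", b"dd"):
    buf.write(chunk)                      # 4 small writes...
print(buf.flushes, disk.getvalue())       # ...become 1 disk write
```

Four logical writes cost a single disk operation, which is the source of the write-performance gain the paper reports.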
From the implementation standpoint, the most difficult issue is managing
the free space on disk so that large extents of free space are always
available for writing new data. Both threading and copying are used.
The disk is divided into large fixed-size extents called segments. Any
given segment is always written sequentially from its beginning to its
end, and all live data must be copied out of a segment before the
segment can be rewritten. The log is threaded on a segment-by-segment
basis. A data structure called the "segment summary block" records what
each segment contains, making segment cleaning possible.
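A sketch of how the summary block supports cleaning (the data layout here is an assumption for illustration): the summary records which file block each slot in the segment holds, and a slot is live only if the index still points at that exact slot; live slots must be copied out before the segment is reused.

```python
# Sketch (assumed layout) of segment cleaning: the per-segment summary
# lists which (file_id, block) each slot holds; a slot is live only if
# the inode map still points at it.

def live_blocks(seg_id, segment, summary, inode_map):
    """Return the live blocks that must be copied out before reuse."""
    live = []
    for slot, ((file_id, blk), data) in enumerate(zip(summary, segment)):
        if inode_map.get((file_id, blk)) == (seg_id, slot):
            live.append((file_id, blk, data))   # still current: must copy
        # else: a newer version exists elsewhere; this slot is dead space
    return live

segment = [b"old-a0", b"b0", b"a1"]              # data stored in segment 3
summary = [("a", 0), ("b", 0), ("a", 1)]         # its segment summary block
inode_map = {("a", 0): (7, 9),                   # rewritten later elsewhere
             ("b", 0): (3, 1),                   # still lives here
             ("a", 1): (3, 2)}                   # still lives here
print(live_blocks(3, segment, summary, inode_map))
```

Here the cleaner discovers that slot 0 is dead (file "a" block 0 was rewritten into a later segment), so only two of the three slots need copying before the segment is freed.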
As to system recovery, both checkpoints and roll-forward are used to
speed the recovery process and minimize the loss of information. A
checkpoint is a position in the log at which all of the file system
structures are consistent and complete. The simple approach after a
crash is to discard all changes after the newest checkpoint. However,
this may result in too many changes being lost. A roll-forward
algorithm is therefore used to recover as many changes as possible that
are known to have made it to disk, while maintaining the consistency of
the file system.
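The recovery sequence can be sketched as below; the record format with a `complete` flag is a simplifying assumption standing in for the log's self-describing structure, not the paper's actual on-disk format.

```python
# Sketch (hypothetical record format) of crash recovery: restore the
# consistent state saved at the last checkpoint, then roll forward
# through later log records that were completely written to disk.

def recover(log, checkpoint_pos, checkpoint_state):
    state = dict(checkpoint_state)       # consistent checkpoint snapshot
    for rec in log[checkpoint_pos:]:
        if rec.get("complete"):          # only records fully on disk count
            state[rec["file"]] = rec["data"]
        else:
            break                        # torn write: stop rolling forward
    return state

log = [{"file": "a", "data": 1, "complete": True},   # before checkpoint
       {"file": "b", "data": 2, "complete": True},   # after checkpoint
       {"file": "c", "data": 3, "complete": False}]  # torn by the crash
print(recover(log, 1, {"a": 1}))
```

The change to "b" survives the crash even though it happened after the checkpoint, which is the information loss that roll-forward avoids compared with simply truncating at the checkpoint.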