From: Chuck Reeves (creeves_at_windows.microsoft.com)
Date: Mon Feb 23 2004 - 15:08:11 PST
The paper, "The Design and Implementation of a Log-Structured File
System" was written in 1992 by Roseblum and Ousterhout at the University
of California at Berkeley. It describes the design and implementation of
a permanent log-structured file system named "Sprite". The approach
anticipates future increases in the size of main memory caches and
predicts that disk traffic will be dominated by writes. As a result, the
design focuses on optimizing the write performance. Also, the systems
focuses on the efficiency of small file accesses.
Buffering of information yet to be written to disk is used to improve
the efficiency of writing. It surprised me how willing the authors were
to tolerate lost data on system crashes. The log structure divides the
disk into fixed-size segments. These segments are the region of reuse.
The segments are threaded together to identify consecutive blocks of a
file.
Segments are recovered through a process called cleaning. Segment
specific utilization is tracked through the segment summary block
residing on a part of each segment. The decisions made on how to
reorganize the data on cleaned segments is called the cleaning policy
and is (along with the writing of new data) the primary indicator of
performance.
The cleaning algorithm design aspires to reach a point where the
segments are all nearly full or empty. Much of the document is dedicated
to optimizing this behavior. Segment dimenstions include: % utilization
and the last time modified. These statistics are eventually tracked in
the segment usage table which allows the cleaner to identify the best
candidates for cleaning quickly.
I probably didn't understand the discussion on checkpoint regions and
the roll-forward mechanism. But it seemed like it was describing a crash
recovery mechanism that allowed unfinished write activity to be commited
subsequent to the each crash.
I thought the contrast between temporal and logical locality was
paticularly insightful and would appreciate more discussion on this
topic.
Additionally, I'd like to understand how closely this behavior emulates
that of databases.
Chuck Reeves, creeves_at_microsoft.com
Microsoft | Windows | Directory Services
This archive was generated by hypermail 2.1.6 : Mon Feb 23 2004 - 15:08:25 PST