Project 4 - File Systems

Administrivia: For this project, you will work with the same partners you had in project 3.
Out: Wednesday, March 1
Due: Friday, March 10 at 6:00 pm

Assignment Goals

Background

Last year, this was a programming assignment, in which you modified the skeleton of a file system, compiled it with the Linux 2.4 kernel, and tested it in VMware. Because of the department-wide change from Fedora Core 2 to Fedora Core 4, and the accompanying change from gcc 3 to gcc 4, the 2.4 kernel no longer compiles on department machines. Unfortunately, the file system skeleton was created from 2.4 code and designed to integrate with the 2.4 kernel, and it would need to be substantially restructured to work with the 2.6 kernel (which compiles fine on our machines, as you saw in project 1).

Since actual implementation is not an option this quarter, this project is now a pencil-and-paper design exercise, delving into the details of what needs to be changed. On one hand, this is easier, since you don't need to test it; on the other hand, it's harder, since you can't test it.

Overview

The starting point for this assignment is a simplified file system, cse451fs (in /cse451/projects/cse451fs.tar.gz), the design of which imposes strict limits on both the number of files that can be stored and the maximum size of any one file.  In particular, no matter how big a disk you might have, this file system can hold only about 8,000 distinct files, no file can be larger than 13KB, and file names cannot be longer than 30 characters.  These restrictions result from the choice of on-disk data structures used to find files and the data blocks of a given file, that is, the superblock and inode representations. 
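To see how limits like these fall out of the on-disk layout, here is a minimal sketch of the arithmetic. The block size, the number of direct pointers per inode, and the one-block inode bitmap are assumptions chosen to be consistent with the figures above; they are not copied from the cse451fs headers, so check the real constants in the source.

```c
#include <assert.h>

/* Illustrative values only: chosen to match the limits quoted above,
 * not copied from the cse451fs headers. */
#define SKETCH_BLOCK_SIZE   1024UL  /* bytes per disk block (assumption) */
#define SKETCH_DIRECT_PTRS  13UL    /* block pointers stored in each inode (assumption) */

/* Largest file addressable when an inode holds only direct block pointers. */
unsigned long max_file_bytes(unsigned long direct_ptrs, unsigned long block_size)
{
    assert(block_size > 0);
    return direct_ptrs * block_size;
}

/* Number of inodes that a one-block allocation bitmap can track (one bit each). */
unsigned long max_inodes(unsigned long block_size)
{
    assert(block_size > 0);
    return block_size * 8UL;
}
```

Under these assumptions, max_file_bytes(13, 1024) is 13312 bytes (13KB) and max_inodes(1024) is 8192, which would explain "about 8,000" files. Neither number depends on the size of the disk, which is why a bigger disk does not help.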

Your assignment is to design modifications to the file system that would achieve the following:
Design how you would implement these file system modifications: how you will represent your directories on disk, how file data is indexed, etc.  There can still be a limit on any of these properties, but your improvement needs to be more than simply altering a program constant. You must support a maximum file size of at least 256KB, which can be achieved with a single-indirect strategy as discussed in lecture.
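As a rough illustration of why a single-indirect scheme clears the 256KB bar, the sketch below maps a logical block index within a file to a disk block number. Everything in it is hypothetical: the struct, the function names, the 1KB block size, the 4-byte block numbers, and the choice of 12 direct slots are assumptions for the example, not the cse451fs layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, for illustration only (not taken from the cse451fs sources). */
#define BLOCK_SIZE      1024u
#define PTR_SIZE        4u                        /* bytes per on-disk block number */
#define NUM_DIRECT      12u                       /* direct slots kept in the inode */
#define PTRS_PER_BLOCK  (BLOCK_SIZE / PTR_SIZE)   /* 256 pointers fit in one block */

/* Hypothetical view of the block pointers stored in an inode. */
struct sketch_inode {
    uint32_t direct[NUM_DIRECT];  /* disk block numbers of the first data blocks */
    uint32_t indirect;            /* disk block holding more pointers; 0 = none */
};

/* Maximum file size with NUM_DIRECT direct slots plus one indirect block. */
uint32_t max_file_bytes_single_indirect(void)
{
    return (NUM_DIRECT + PTRS_PER_BLOCK) * BLOCK_SIZE;
}

/* Returns the disk block holding logical block n, or 0 if n is unmapped or
 * out of range. indirect_block is the already-read contents of the block
 * named by ino->indirect (may be NULL if there is none). */
uint32_t lookup_block(const struct sketch_inode *ino,
                      const uint32_t *indirect_block, uint32_t n)
{
    assert(ino != NULL);
    if (n < NUM_DIRECT)
        return ino->direct[n];
    n -= NUM_DIRECT;
    if (n < PTRS_PER_BLOCK && ino->indirect != 0 && indirect_block != NULL)
        return indirect_block[n];
    return 0;
}
```

With these numbers the indirect block alone addresses 256 x 1KB = 256KB, and the total is (12 + 256) x 1KB = 268KB. A real design also has to say who allocates the indirect block, when it is freed, and how a read of an unmapped block behaves.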

Once you have accomplished these two tasks, you should tackle a third task:

In particular, design a triple-indirect approach, as discussed in lecture, and describe its implementation.
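One way to reason about a triple-indirect layout is to work out, for a given logical block index, how many levels of indirection are needed and which slot to follow at each level. The sketch below does only that index arithmetic. The names, the 10 direct slots, and the 256 pointers per block (1KB blocks, 4-byte block numbers) are assumptions for illustration, not part of cse451fs.

```c
#include <assert.h>
#include <stdint.h>

#define PTRS_PER_BLOCK 256u  /* assumes 1KB blocks and 4-byte block numbers */
#define NUM_DIRECT     10u   /* hypothetical number of direct slots in the inode */

/* Where a logical block lives.
 * depth 0: idx[0] is the slot in the inode's direct array.
 * depth d (1..3): idx[0..d-1] are the slots to follow in the d chained
 *                 indirect blocks, outermost first.
 * depth -1: the index is beyond what the layout can address. */
struct block_path {
    int      depth;
    uint32_t idx[3];
};

struct block_path resolve(uint64_t n)
{
    struct block_path p = { -1, {0, 0, 0} };
    const uint64_t P = PTRS_PER_BLOCK;

    assert(P > 0);
    if (n < NUM_DIRECT) {               /* direct pointer in the inode */
        p.depth = 0;
        p.idx[0] = (uint32_t)n;
        return p;
    }
    n -= NUM_DIRECT;
    if (n < P) {                        /* single indirect */
        p.depth = 1;
        p.idx[0] = (uint32_t)n;
        return p;
    }
    n -= P;
    if (n < P * P) {                    /* double indirect */
        p.depth = 2;
        p.idx[0] = (uint32_t)(n / P);
        p.idx[1] = (uint32_t)(n % P);
        return p;
    }
    n -= P * P;
    if (n < P * P * P) {                /* triple indirect */
        p.depth = 3;
        p.idx[0] = (uint32_t)(n / (P * P));
        p.idx[1] = (uint32_t)((n / P) % P);
        p.idx[2] = (uint32_t)(n % P);
        return p;
    }
    return p;                           /* out of range: depth stays -1 */
}
```

Under these assumptions the layout addresses 10 + 256 + 256^2 + 256^3 blocks. The writeup still needs to cover what this sketch leaves out: reading each intermediate block from disk, allocating missing levels on write, and freeing them on truncate or delete.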

You're not actually implementing anything, but you still need to be intimately familiar with the code. You ought to identify all of the places that the code would need to actually change if you were to implement your modifications.

What we want

At the highest level, what we want is clear evidence that you understand file systems in general and the cse451fs file system in particular. Concretely, we want a discussion of how you'd accomplish the extensions we're asking for. For each extension, you should say at a high level what needs to change and why, then go into the specifics of what needs to change: where, and in which source file. You are producing a detailed design document. It's prose, not code. (We will evaluate the clarity of your reasoning and of your presentation, as well as the quality of your solution.) It is fine to include code segments in your document: you can include an original function for context if needed and add clear pseudocode for what needs to happen in what order, or you can add lines of code like those you would write if you were actually implementing the change. Either way, include clear comments and a design description. We need to be able to read your document and assess correctness without doing hand-simulation! It is not acceptable to simply write pseudocode in the form of "loose-syntax C".

Note that unlike "design documents" that you may have done in other projects and other courses, we don't just want the top few issues -- we want all of the issues. Every time you'd change something of any significance in the code in a real implementation, we want to hear about it in the writeup.

Details

Hints/Starting Points

Writeup file

Please turn in a file called writeup that includes the design document described above, and answers to the following:

  1. What other approaches did you consider and reject in your design? (This should be no more than a paragraph or maybe two.)
  2. What concurrency-related issues does a file system have to deal with? You probably didn't deal with any of them directly when designing your extensions, but what did you notice when looking at the rest of the code?
  3. What would you have to do to make your design efficient, or where are the main efficiency hurdles?

Aim to fit this report in something like 5 pages, and not more than 10. Use whatever format you want, so long as it's searchable. So text, HTML, and Word are all fine, but PDF or photos of whiteboard scribbles, not so much. Whatever it is, hand it in using turnin to project4.