University of Washington Computer Science & Engineering
 CSE 451: Project 4

Project 4 - File Systems

Administrivia: This project should be done in groups of 2. We'll assume the same groups as the last project. If your group has changed, please let Ilya know.
Out: Saturday, May 22nd
Due: Friday, June 4th

Assignment Goals

  • To understand the problems that file system implementations must solve, and the range of approaches that might be taken
  • To practice design (in this case of file systems)
  • To experience working in a "more sophisticated" environment: complex software, a variety of tools, and teams of programmers
  • To gain further experience with concurrent software

Overview

The starting point for this assignment is a simplified file system, cse451fs, the design of which imposes strict limits on both the number of files that can be stored and the maximum size of any one file. In particular, no matter how big a disk you might have, this file system can hold only about 8,000 distinct files, no file can be larger than 13KB, and file names cannot be longer than 30 characters. These restrictions result from the choice of on-disk data structures used to find files and the data blocks of a given file, that is, the superblock and inode representations.
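To see how limits like these can fall out of fixed-size on-disk structures, here is a back-of-the-envelope sketch in C. Every constant in it is an assumption, picked only because it reproduces the numbers quoted above (1 KB blocks, 13 direct block pointers per inode, 32-byte directory entries holding a 2-byte inode number, and a single block used as the inode bitmap). The names are hypothetical; the real definitions live in cse451fs.h and may differ.

```c
#include <assert.h>   /* only needed if you check the results as in the note below */
#include <stdint.h>

/* All values below are ASSUMPTIONS for illustration, not taken from cse451fs.h. */
enum {
    SK_BLOCK_SIZE      = 1024, /* assumed bytes per disk block */
    SK_NUM_DIRECT_PTRS = 13,   /* assumed direct block pointers per inode */
    SK_DIR_ENTRY_SIZE  = 32,   /* assumed size of a fixed-length directory entry */
    SK_DIR_INODE_BYTES = 2     /* assumed bytes of the entry used for the inode number */
};

/* With only direct pointers, a file can span at most one block per pointer. */
uint32_t sk_max_file_bytes(void)
{
    return (uint32_t)SK_NUM_DIRECT_PTRS * SK_BLOCK_SIZE;
}

/* A fixed-size entry leaves a fixed number of bytes for the name. */
uint32_t sk_max_name_chars(void)
{
    return SK_DIR_ENTRY_SIZE - SK_DIR_INODE_BYTES;
}

/* One bitmap block tracks one inode per bit. */
uint32_t sk_max_inodes(void)
{
    return (uint32_t)SK_BLOCK_SIZE * 8;
}
```

Under these assumptions, sk_max_file_bytes() is 13312 (13 KB), sk_max_name_chars() is 30, and sk_max_inodes() is 8192 ("about 8,000"). Note that none of the three results depends on the size of the disk.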

Here are the major steps involved in this assignment:

  1. Make sure that you, individually, understand the mechanical aspects of the "development environment" you'll be working in. These include how to build the file system code, how to configure the raw disk device provided by VMware to host your file system, and how to run and test your file system. A description of these mechanical aspects is here.

  2. Of the three limitations cited above, you must relax the following two:
    • Increase the maximum size of files.
    • Allow for longer file names.

    Design how you want to implement these file system modifications on disk: how you will represent your directories on disk, how file data is indexed, etc. There can still be a limit on any of these properties, but your improvement needs to be more than simply altering a program constant.

    If you still have time left over, you may make additional improvements for up to 10 points of extra credit. Some ideas:

    • Make the maximum length of a file name arbitrarily long.
    • Make the max file size arbitrarily large.
    • Increase the number of distinct files that the file system can handle.

  3. Alter the skeleton code (/cse451/projects/cse451fs.tar.gz) to implement your file system. There are two major components to this. One is that the user level program mkfs.cse451fs must be changed to initialize the raw disk device with a valid, empty file system using your new on-disk data structures. The other is to change the file system source (fsSource/) itself.
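As one possible starting point for the file-size design in step 2, here is a minimal user-space sketch of a single-indirect indexing scheme. It is not the skeleton's layout and not a required design: it assumes 1 KB blocks, 4-byte block numbers, and an inode that gives up one of its slots to point at a block full of block numbers. All names (sk_locate and so on) are hypothetical.

```c
#include <assert.h>   /* only needed if you check the results as in the note below */
#include <stdint.h>

/* ASSUMED layout: 12 direct slots plus 1 slot pointing at an indirect block. */
enum {
    SK_BLOCK_SIZE     = 1024,
    SK_NUM_DIRECT     = 12,
    SK_PTRS_PER_BLOCK = SK_BLOCK_SIZE / sizeof(uint32_t)   /* 256 block numbers */
};

typedef enum { SK_DIRECT, SK_INDIRECT, SK_TOO_BIG } sk_level;

typedef struct {
    sk_level level;  /* where the block number for this logical block is stored */
    uint32_t slot;   /* index into the inode's direct array or the indirect block */
} sk_block_path;

/* Translate a logical block index within a file into the place to look it up. */
sk_block_path sk_locate(uint32_t logical_block)
{
    sk_block_path p = { SK_TOO_BIG, 0 };

    if (logical_block < (uint32_t)SK_NUM_DIRECT) {
        p.level = SK_DIRECT;
        p.slot  = logical_block;
    } else if (logical_block - SK_NUM_DIRECT < (uint32_t)SK_PTRS_PER_BLOCK) {
        p.level = SK_INDIRECT;
        p.slot  = logical_block - SK_NUM_DIRECT;
    }
    return p;
}

/* Largest file this scheme can index. */
uint32_t sk_max_file_bytes(void)
{
    return (uint32_t)(SK_NUM_DIRECT + SK_PTRS_PER_BLOCK) * SK_BLOCK_SIZE;
}
```

With these assumed numbers, logical block 12 is the first one reached through the indirect block, and the maximum file size grows to (12 + 256) * 1024 = 274432 bytes. A real implementation would also have to allocate the indirect block on first use and free it when the file is truncated.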

Details

  • While real file systems are very concerned with performance, in your implementation you can largely ignore it. That is, do not spend a great deal of effort to produce a faster implementation. (For one thing, because we're running Linux inside a virtual machine, it's unlikely you'd be able to measure much difference.)
  • A description of the skeleton version of the cse451fs file system is here. (necessary reading)
  • A description of the ext2 file system and vfs (Virtual File System) is here. (strongly recommended)
  • A description of how dynamically loaded modules are handled in Linux is here. (not required reading but may answer odd questions that arise)

Hints/Starting Points

  • Large Files:
    • An important function for creating and accessing the blocks of a file (that you will almost certainly need to modify) is get_block() in super.c.
    • Look at cse451_truncate() in file.c in order to handle file truncations (e.g., deleting a file uses this). You don't need to worry about this until you are sure that you can create and access your files.
  • Long File Names:
    • All the functions that you will need to modify for your kernel module are in dir.c.
    • Start with cse451_add_entry() and cse451_readdir() to create and read directory entries with long file names. If listing with ls doesn't appear correct, check how the directory structure looks on disk (using hexdump or xxd).
  • General:
    • If you modify any of the data structures in cse451fs.h, you will probably need to modify mkfs.cse451fs in addition to the kernel module sources. Otherwise you probably can leave mkfs alone.
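For the long file name hints above, one common approach (ext2 does something similar) is to store variable-length directory records instead of fixed-size ones. The user-space sketch below illustrates the idea on a single in-memory block. The record layout, the 1 KB block size, the use of inode 0 as an end marker, and every sk_ name are assumptions for illustration only; they are not the skeleton's structures.

```c
#include <assert.h>   /* only needed if you check the results as in the note below */
#include <stdint.h>
#include <string.h>

enum { SK_DIR_BLOCK = 1024 };   /* ASSUMED directory block size */

/* Hypothetical record header; the name bytes follow it directly (no NUL). */
typedef struct {
    uint16_t inode;     /* 0 marks the end of the used records in this sketch */
    uint16_t rec_len;   /* bytes in this record: header + name, rounded up to 4 */
    uint16_t name_len;  /* bytes in the name */
} sk_dirent_hdr;

/* Return the offset of the first unused record, or SK_DIR_BLOCK if none. */
static size_t sk_dir_end(const uint8_t *block)
{
    size_t off = 0;
    sk_dirent_hdr h;

    while (off + sizeof h <= SK_DIR_BLOCK) {
        memcpy(&h, block + off, sizeof h);
        if (h.inode == 0)
            return off;
        if (h.rec_len < sizeof h)      /* corrupt record: stop walking */
            break;
        off += h.rec_len;
    }
    return SK_DIR_BLOCK;
}

/* Append (inode, name) to a zero-filled block. Returns 0, or -1 if it cannot fit. */
int sk_dir_add(uint8_t *block, uint16_t inode, const char *name)
{
    size_t name_len = strlen(name);
    size_t off = sk_dir_end(block);
    size_t need;
    sk_dirent_hdr h;

    if (inode == 0 || name_len > SK_DIR_BLOCK)
        return -1;
    need = (sizeof h + name_len + 3u) & ~(size_t)3u;
    if (off + need > SK_DIR_BLOCK)
        return -1;

    h.inode = inode;
    h.rec_len = (uint16_t)need;
    h.name_len = (uint16_t)name_len;
    memcpy(block + off, &h, sizeof h);
    memcpy(block + off + sizeof h, name, name_len);
    return 0;
}

/* Return the inode stored for name, or 0 if the name is not in the block. */
uint16_t sk_dir_lookup(const uint8_t *block, const char *name)
{
    size_t name_len = strlen(name);
    size_t off = 0;
    sk_dirent_hdr h;

    while (off + sizeof h <= SK_DIR_BLOCK) {
        memcpy(&h, block + off, sizeof h);
        if (h.inode == 0 || h.rec_len < sizeof h)
            break;
        if (h.name_len == name_len &&
            memcmp(block + off + sizeof h, name, name_len) == 0)
            return h.inode;
        off += h.rec_len;
    }
    return 0;
}
```

To try it, zero a 1024-byte buffer, call sk_dir_add() with a name well over 30 characters and again with a short name, and confirm that sk_dir_lookup() returns the matching inode for each. The sketch ignores deletion, entries that span blocks, and everything the kernel needs (buffer heads, locking), so treat it as a way to reason about the on-disk format and about what hexdump or xxd output should look like, not as code to paste into dir.c.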

The Writeup

The writeup should be hardcopy, handed in at the beginning of lecture on the due date. This writeup is less free-form than previous ones have been. In particular, please address the following:
  1. Describe the design for your file system modifications. This might include a discussion of other approaches you considered but rejected.
  2. What concurrency-related issues does a file system have to deal with? You probably didn't deal with any of them directly when implementing your extensions, but what did you notice when looking at the rest of the code?
  3. What methodology did you follow in order to test your file system (for functionality)?
  4. Does your implementation work? If not, what parts work and what parts don't? How would you fix it if you had more time?
  5. What do you like best about your design? What do you like least about it? How would you improve your design?
  6. If you did any extra credit, describe what you did. Make sure to describe how you implemented it, and how it solves the problem you set out to fix.

As a basic guideline, 4 pages would be a good total length for your report. The writeup is worth a good chunk of the grade, so don't skimp on it.

Electronic Turnin

You will be turning in three files: a tar.gz file containing all of your modified sources, your compiled mkfs.cse451fs, and your compiled cse451fs.o.

To create the source archive file, run make dist in the top-level directory, as you have done in the past.

Use the turnin(1L) program under project name project4 before class on the day it is due. Note: turnin will not work on coredump/spinlock, so you'll need to use attu.

