Project 5 - File Systems
Administrivia: For this project, you may work with the same
partner you had in project 4. If you'd like to work with someone new, you are welcome
to switch partners, but you need to let Valentin know by email by
Friday, May 30.
Out: Wednesday, May 28
Due: Friday, June 6, 11:59pm
Assignment Goals
- To understand the problems that file system implementations must solve,
and the range of approaches that might be taken
- To practice design (in this case of file systems)
- To experience working in a "more sophisticated" environment:
complex software, a variety of tools, and teams of programmers
- To gain further experience with concurrent software
Overview
The starting point for this assignment is a simplified file system, cse451fs,
the design of which imposes strict limits on both the number of files that can
be stored and the maximum size of any one file. In particular, no matter
how big a disk you might have, this file system can hold only about 8,000
distinct files, no file can be larger than 13KB, and file names cannot be longer
than 30 characters. These restrictions
result from the choice of on-disk data structures used to find files and the
data blocks of a given file, that is, the superblock and inode
representations.
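To see how a limit like 13KB can fall out of an inode layout, here is a minimal arithmetic sketch. The block size, pointer count, and pointer width below are assumptions chosen for illustration, and the function names are hypothetical; check cse451fs.h for the skeleton's actual values.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only (assumptions, not taken from the skeleton);
 * the real constants live in cse451fs.h. */
#define EX_BLOCK_SIZE  1024u  /* bytes per disk block */
#define EX_INODE_PTRS  13u    /* block pointer slots stored in each inode */
#define EX_PTR_SIZE    4u     /* bytes per on-disk block pointer */

/* Largest file when every inode slot points directly at a data block. */
uint64_t ex_max_size_direct(uint32_t inode_ptrs, uint32_t block_size)
{
    return (uint64_t)inode_ptrs * block_size;
}

/* Largest file when the last inode slot is repurposed to point at one
 * block that is itself full of data block pointers (single indirection). */
uint64_t ex_max_size_single_indirect(uint32_t inode_ptrs, uint32_t block_size,
                                     uint32_t ptr_size)
{
    assert(inode_ptrs >= 1 && ptr_size > 0 && block_size >= ptr_size);
    uint64_t ptrs_per_block = block_size / ptr_size;
    return ((uint64_t)(inode_ptrs - 1) + ptrs_per_block) * block_size;
}
```

Under these assumed values, ex_max_size_direct(EX_INODE_PTRS, EX_BLOCK_SIZE) gives 13,312 bytes (13KB), while the single-indirect variant gives 274,432 bytes (268KB). The limit is a property of the on-disk structure, not of the disk size.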
Here are the major steps involved in this assignment:
- Make sure that you, individually, understand the mechanical aspects of
the "development environment" you'll be working in. These
include how to build the file system code, how to configure the raw disk
device provided by VMware to host your file system, and how to run and test
your file system. A description of these mechanical aspects is here.
- Of the three limitations cited above, you will need to improve the
following two:
- Increase the maximum size of files.
- Allow for longer file names.
Design how you want to implement these file system modifications on disk:
how you will represent your directories on disk, how file data is indexed,
etc. There can still be a limit on any of these properties, but your
improvement needs to be more than simply altering a program constant.
- Alter the skeleton code (/cse451/doug/Proj5FS-V1.0.0.tar.gz)
to implement your file system. There are two major
components to this. First, the user-level program mkfs.cse451fs
must be changed to initialize the raw disk device with a valid, empty file
system using your new on-disk data structures. Second, the file system
source itself (fsSource/) must be changed.
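As one possible direction for longer file names, the sketch below packs variable-length directory records into a block, in the spirit of ext2's directory format. It is a user-space illustration, not the skeleton's code: the 5-byte header, the 16-bit inode number, the 255-character cap, and every ex_ name are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_DIR_BLOCK_SIZE 1024u  /* assumed block size */
#define EX_MAX_NAME_LEN   255u   /* assumed cap; fits in one byte */
/* Assumed record header: inode (2 bytes), rec_len (2 bytes), name_len (1 byte).
 * Fields are stored in host byte order here; a real format should fix one. */
#define EX_HDR_SIZE       5u

/* Bytes used by a record holding a name of name_len bytes, rounded up to 4. */
size_t ex_rec_len(size_t name_len)
{
    return (EX_HDR_SIZE + name_len + 3u) & ~(size_t)3u;
}

/* Append a record at byte offset `used`. Returns the new `used` offset,
 * or 0 if the name is invalid or the record does not fit in the block. */
size_t ex_append_entry(unsigned char *block, size_t used,
                       uint16_t inode, const char *name)
{
    size_t name_len = strlen(name);
    size_t rec_len = ex_rec_len(name_len);
    uint16_t rec_len16 = (uint16_t)rec_len;

    assert(block != NULL && used <= EX_DIR_BLOCK_SIZE);
    if (name_len == 0 || name_len > EX_MAX_NAME_LEN ||
        rec_len > EX_DIR_BLOCK_SIZE - used)
        return 0;

    memset(block + used, 0, rec_len);              /* zero the padding too */
    memcpy(block + used, &inode, 2);
    memcpy(block + used + 2, &rec_len16, 2);
    block[used + 4] = (unsigned char)name_len;
    memcpy(block + used + EX_HDR_SIZE, name, name_len); /* no NUL on disk */
    return used + rec_len;
}

/* Walk the records in block[0..used) and return the inode number stored
 * for `name`, or 0 if no record matches (inode 0 marks an unused record). */
uint16_t ex_find_entry(const unsigned char *block, size_t used, const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + EX_HDR_SIZE <= used) {
        uint16_t inode, rec_len;
        unsigned char name_len = block[off + 4];

        memcpy(&inode, block + off, 2);
        memcpy(&rec_len, block + off + 2, 2);
        /* Stop on a corrupt record rather than walking off the block. */
        if (rec_len < EX_HDR_SIZE || rec_len > used - off ||
            (size_t)name_len + EX_HDR_SIZE > rec_len)
            break;
        if (inode != 0 && name_len == want &&
            memcmp(block + off + EX_HDR_SIZE, name, want) == 0)
            return inode;
        off += rec_len;
    }
    return 0;
}
```

Storing a record length in each entry lets short names stay small while long names take only the space they need. The cost is that entries can no longer be located by simple array indexing, so lookup, insertion, and deletion all have to walk the records.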
Details
- While real file systems are very concerned with performance, in your
implementation you can largely ignore it. That is, do not spend a
great deal of effort to produce a faster implementation. (For one
thing, because we're running on virtual machines on top of Windows, it's
unlikely you'd be able to measure much difference.)
- A description of the skeleton version of the cse451fs file system is
here. (necessary reading)
- A description of the ext2 file system and VFS (Virtual File System)
is here. (strongly recommended)
- A description of how dynamically loaded modules are handled in Linux is here.
(not required reading, but may answer odd questions that arise)
Hints/Starting Points
- Large Files:
- An important function for creating and accessing the blocks of a file
(that you will almost certainly need to modify) is
get_block() in super.c.
- Look at cse451_truncate() in file.c in order to handle file truncations
(e.g., deleting a file uses this). You don't need to worry about
this until you are sure that you can create and access your files.
- Long File Names:
- All the functions that you will need to modify for your kernel module
are in dir.c.
- Start with cse451_add_entry() and cse451_readdir() to create
and read directory entries with long file names. If the output of ls
doesn't look correct, check how the directory structure looks on disk
(using hexdump or xxd).
- General:
- If you modify any of the data structures in cse451fs.h, you
will probably need to modify mkfs.cse451fs in addition to the
kernel module sources. Otherwise, you can probably
leave mkfs alone.
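For the large-file hint, the following user-space sketch shows the kind of logical-to-physical block mapping that a get_block()-style function has to perform once an inode has both direct pointers and one single-indirect pointer. It is not the skeleton's code: the in-memory "disk", the bump allocator, the slot counts, and all ex_ names are hypothetical, and the kernel version would read the indirect block through the buffer cache instead of an array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_BLK_SIZE     1024u                 /* assumed block size */
#define EX_NUM_SLOTS    13u                   /* assumed pointer slots per inode */
#define EX_NUM_DIRECT   (EX_NUM_SLOTS - 1u)   /* last slot becomes indirect */
#define EX_PTRS_PER_BLK (EX_BLK_SIZE / sizeof(uint32_t))
#define EX_DISK_BLOCKS  64u                   /* size of the toy disk */

struct ex_inode {
    uint32_t slots[EX_NUM_SLOTS];  /* 0 means "no block allocated" */
};

static unsigned char ex_disk[EX_DISK_BLOCKS][EX_BLK_SIZE];
static uint32_t ex_next_free = 1;  /* block 0 is reserved so 0 can mean "none" */

/* Toy allocator: hands out zeroed blocks in order; returns 0 when full. */
uint32_t ex_alloc_block(void)
{
    if (ex_next_free >= EX_DISK_BLOCKS)
        return 0;
    memset(ex_disk[ex_next_free], 0, EX_BLK_SIZE);
    return ex_next_free++;
}

/* Map a logical block index within a file to a physical block number.
 * If `create` is nonzero, missing blocks are allocated on the way.
 * Returns 0 for a hole, an out-of-range index, or a full disk. */
uint32_t ex_map_block(struct ex_inode *ino, uint32_t logical, int create)
{
    assert(ino != NULL);

    /* Direct range: the slot itself holds the data block number. */
    if (logical < EX_NUM_DIRECT) {
        if (ino->slots[logical] == 0 && create)
            ino->slots[logical] = ex_alloc_block();
        return ino->slots[logical];
    }

    /* Indirect range: the last slot names a block full of block numbers. */
    uint32_t idx = logical - EX_NUM_DIRECT;
    if (idx >= EX_PTRS_PER_BLK)
        return 0;

    uint32_t *ind_slot = &ino->slots[EX_NUM_DIRECT];
    if (*ind_slot == 0) {
        if (!create)
            return 0;
        *ind_slot = ex_alloc_block();
        if (*ind_slot == 0)
            return 0;
    }

    unsigned char *entry = ex_disk[*ind_slot] + (size_t)idx * sizeof(uint32_t);
    uint32_t phys;
    memcpy(&phys, entry, sizeof phys);
    if (phys == 0 && create) {
        phys = ex_alloc_block();
        memcpy(entry, &phys, sizeof phys);
    }
    return phys;
}
```

Truncation has to undo the same mapping in reverse: free each data block named in the indirect block, then free the indirect block itself. That is why the truncate path is easier to tackle once creation and access work.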
Schedule and What to Hand In
By Friday, June 6 at 11:59 PM, please turn in a file that includes the
following.
- Describe the design for your file system modifications.
This might include a discussion of other approaches you considered but
rejected.
- What concurrency-related issues did you encounter? How did you deal with them?
- What methodology did you follow in order to test your file system (for functionality)?
- Does your implementation work? If not, what parts work and what parts don't?
How would you fix it if you had more time?
- What do you like best about your design? What do you like least
about it? How would you improve your design?
The report should be no more than 2 pages long. The acceptable file format is plain
(ASCII) text. Include the usernames of both partners in the
filename, e.g., writeup-doug-valentin.txt.
You will be graded primarily on your write-up. We will be looking at
the clarity and viability of your ideas, the depth of
your implementation, and the completeness of your testing, as described in your report.
However, please do turn in a copy of your code as well, in case we need to refer to it.
Turnin
You will be turning in 4 files: a tar.gz file containing all of your
modified sources, your compiled
mkfs.cse451fs and cse451fs.o files, and a single write-up
file for both partners.
To create the source archive file, use the following command:
tar -cvzf sources.tar.gz Proj5FS/
Use the turnin(1L) program under project name proj5 by
11:59pm on the day it is due. Note: turnin will not work on coredump/spinlock, so
you'll need to use one of the general-purpose machines (sumatra, fiji, ceylon,
or tahiti).