CSE 551 Project Ideas
- "Training" a file system: it is fairly common for a
workload to cause a very predictable set of blocks to be read
from a disk; for example, during the boot sequence of an OS,
nearly the same set of blocks is read as the kernel, the boot
scripts, and the initial daemons get loaded and executed.
Modify the file system for Linux (or your favourite OS) so that
you can "record" a predictable sequence, and have the file
system organize that sequence of blocks sequentially on disk.
One way to do this would be to maintain a linear shadow
structure on disk for this predictable sequence, and to
invalidate elements of the shadow if modifications happen to
those blocks.
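The shadow-structure idea above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the class name ShadowLog and its methods are hypothetical, block numbers are plain integers, and a real implementation would live in the file system's block layer and actually copy the data to the shadow region.

```python
class ShadowLog:
    """Toy model of a linear shadow copy of a recorded block sequence.

    Hypothetical sketch: it tracks only bookkeeping (which block sits in
    which shadow slot, and whether that slot is still valid), not data.
    """

    def __init__(self):
        self.recording = False
        self.order = []    # block numbers, in the order they were first read
        self.slot_of = {}  # block number -> slot index in the shadow region
        self.valid = []    # valid[i] becomes False once order[i] is modified

    def start_recording(self):
        self.recording = True

    def stop_recording(self):
        self.recording = False

    def on_read(self, block):
        """Return the shadow slot to read from, or None to use the home location."""
        slot = self.slot_of.get(block)
        if slot is not None:
            return slot if self.valid[slot] else None
        if self.recording:
            # First sighting while recording: append to the linear shadow.
            self.slot_of[block] = len(self.order)
            self.order.append(block)
            self.valid.append(True)
        return None

    def on_write(self, block):
        """Invalidate the shadow copy when the original block is modified."""
        slot = self.slot_of.get(block)
        if slot is not None:
            self.valid[slot] = False
```

In this sketch a replayed sequence maps to consecutive slots 0, 1, 2, ..., which is what would let the disk serve it with one sequential sweep; an invalidated slot simply falls back to the block's home location.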
- File system measurements: repeat/update the
measurements from the 1991 "Measurements of a Distributed
File System" paper, or the Roselli paper "A Comparison of
File System Workloads", but for a different, modern
environment (such as our undergrad Win2K labs, or a web
server's file system, or ...?)
- User-level file system: do for file systems what
Alpine did for network stacks.
- Port User-Mode Linux to some other operating system,
such as Win2K. (This is probably really hard.)
- Personal archival web search engine: disk space is
cheap; in fact, it's so cheap that you can probably afford to
archive every web page that you ever visit. Implement a
client-side web proxy that does this, and which also builds
a searchable index of these pages (including making each page's
visit timestamp a searchable attribute, so people can issue the
query "show me the page containing the word foo that
I was looking at yesterday afternoon").
(You may want to talk with Soumen Chakrabarti about this one,
http://www.cse.iitb.ernet.in:8000/~soumen)
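One way to picture the timestamp-as-attribute index described above is a small inverted index whose postings carry the visit time. This is a minimal sketch, not a proposed design: the class VisitIndex, its method names, and the crude tokenizer are all assumptions made for illustration, and a real proxy would also have to store the page bodies and persist the index.

```python
import re
from collections import defaultdict
from datetime import datetime


class VisitIndex:
    """Hypothetical inverted index: word -> list of (visit_time, url)."""

    def __init__(self):
        self.postings = defaultdict(list)

    def add_visit(self, url, text, visit_time):
        # Index each distinct lowercase word once per visit.
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            self.postings[word].append((visit_time, url))

    def search(self, word, start, end):
        """URLs of pages containing `word`, visited in [start, end), oldest first."""
        hits = [(t, u) for t, u in self.postings.get(word.lower(), [])
                if start <= t < end]
        return [u for _, u in sorted(hits)]
```

A query such as "the page containing foo that I was looking at yesterday afternoon" would then become search("foo", start, end) with start and end set to the bounds of yesterday afternoon.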
- OS performance metrics via queries: the Berkeley Packet
Filter (BPF) allows one to specify (using a simple declarative query
language) a filter over the stream of packets that flows past
a network card. Add a bunch of instrumentation to various
parts of an OS (such as the page replacement algorithm in the
VM, the process scheduler, the network stack, the file system,
etc.) that would be useful to gather performance statistics
from the OS, then devise and implement a BPF-like mechanism
to issue queries to this set of instrumentation. (Make sure
you look at Andy Begel's BPF+ work from Berkeley.)
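To make the "BPF-like mechanism" above concrete, here is a rough sketch of declarative filtering over a stream of instrumentation events. It is not BPF syntax and not an existing API: the event fields (source, pid, wait_us) and the clause format (field, operator, value) are invented for illustration, and a kernel version would compile queries to something far cheaper than Python closures.

```python
import operator

# Comparison operators allowed in a query clause.
OPS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def compile_query(clauses):
    """Turn [(field, op, value), ...] into a predicate; all clauses must hold."""
    checks = [(field, OPS[op], value) for field, op, value in clauses]

    def matches(event):
        # An event that lacks a queried field never matches.
        return all(field in event and test(event[field], value)
                   for field, test, value in checks)

    return matches


def run_query(events, clauses):
    """Return the events (dicts) that satisfy every clause, in stream order."""
    matches = compile_query(clauses)
    return [e for e in events if matches(e)]
```

For example, run_query(events, [("source", "==", "sched"), ("wait_us", ">", 500)]) would select scheduler events with a long wait, assuming the instrumentation emits dictionaries with those hypothetical fields.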
- General purpose reliable group communication support for
ad-hoc/peer-to-peer mode operation of 802.11 networks:
802.11b wireless Ethernet cards support an "ad-hoc" or
"peer-to-peer" mode of communication, in which cards directly
talk to each other rather than through an access point. Of
course, not every card can see every other card, but it would
still be nice to support message routing to the transitive
closure of visible cards. Build a set of software primitives
(such as reliable message delivery, multicast message delivery,
group membership protocols, etc.) for this environment. You
will want to understand ISIS, Horus, and Ensemble from Cornell,
and probably also get to know what the startup
http://www.synchropoint.com is doing.
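The "transitive closure of visible cards" mentioned above is simply the set of cards reachable over multiple hops. A minimal sketch, assuming each card's direct visibility is already known and given as a dictionary (the function name and data layout are illustrative, and a real protocol would have to discover and maintain this information in a distributed way):

```python
from collections import deque


def reachable(visible, start):
    """Cards reachable from `start` via one or more hops (breadth-first search).

    visible: dict mapping a card to the set of cards it can see directly.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        card = queue.popleft()
        for peer in visible.get(card, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen
```

With visibility A-B and B-C, card A cannot see C directly, but reachable(...) from A includes C, which is the set that message routing would need to cover.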
- Collaborative-filtering-style web performance
measurements: many companies (such as Keynote) provide
services by which they report user-perceived performance
metrics for customers' web sites. Put these companies out
of business by devising a mechanism (perhaps a client-side
proxy) by which a large community of web clients can
collaboratively build a repository of this performance data.
Explore the ramifications of mutual distrust in this
environment.
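As one small illustration of the mutual-distrust problem above, a repository could limit each client to a single counted report per site and summarize with a median rather than a mean, so that a few dishonest clients cannot drag the result arbitrarily far. This is only a sketch of one possible mitigation, not a solution: the function name and the (client_id, latency_ms) report format are assumptions, and it does nothing about clients that forge many identities.

```python
from statistics import median


def site_latency(reports):
    """Median latency for one site, counting only each client's latest report.

    reports: iterable of (client_id, latency_ms) pairs, in arrival order.
    Returns None when there are no reports.
    """
    latest = {}
    for client_id, latency_ms in reports:
        latest[client_id] = latency_ms  # later reports replace earlier ones
    return median(latest.values()) if latest else None
```

In this sketch, a client that floods the repository with inflated numbers still contributes only one value, and the median moves far less than an average would.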