CSE 551 Spring 2005 Project Ideas

[Almost by definition, any idea of your own trumps any idea of ours! Don't feel constrained by these!]

Spyware patterns

Spyware programs interact with the operating system through system calls and extensibility points, and with remote sites over the Internet. See if you can measure the behavior of spyware programs to detect "commonly occurring" system call or network patterns that distinguish spyware from non-spyware. You may have to sub-class spyware into categories like keyloggers, adware, trojan installers, and so forth in order to come up with discriminative patterns.

We have gathered a reasonably large pool of known spyware programs and known-to-be-not-spyware programs that you can use to do your analysis.
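As a starting point, here is a minimal sketch of one way to look for discriminative system call patterns: collect fixed-length windows (n-grams) of consecutive calls from each trace, and keep those that appear in many spyware traces but in no non-spyware trace. The trace format, the function names, the window length, and the `min_support` threshold are all assumptions made for illustration; a real study would likely need something more robust.

```python
from collections import Counter


def ngrams(trace, n=3):
    """Return every window of n consecutive system calls in one trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def discriminative_patterns(spyware_traces, benign_traces, n=3, min_support=0.5):
    """Return n-grams found in at least min_support of the spyware traces
    and in none of the benign traces (a deliberately crude criterion)."""
    benign_seen = set()
    for trace in benign_traces:
        benign_seen.update(ngrams(trace, n))

    support = Counter()
    for trace in spyware_traces:
        # Use a set so each trace contributes at most once per n-gram.
        support.update(set(ngrams(trace, n)))

    threshold = min_support * len(spyware_traces)
    return sorted(
        gram for gram, count in support.items()
        if count >= threshold and gram not in benign_seen
    )


# Hypothetical traces; real ones would come from instrumenting the programs.
spyware = [
    ["open", "read", "connect", "send"],
    ["stat", "open", "read", "connect", "send", "close"],
]
benign = [
    ["open", "read", "close"],
    ["connect", "send", "recv"],
]
patterns = discriminative_patterns(spyware, benign, n=3, min_support=1.0)
```

The same idea applies to network behavior if each trace is a sequence of network events instead of system calls.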

Contact: Alex Moshchuk and Steve Gribble

Code page reference monitor

(The difficulty of this project depends on how easy it will be to slide this functionality into Linux, so you're advised to do some exploration before committing to it.)

Modify Linux so that whenever a code page is demand-paged into virtual memory, a routine that you control (either in the kernel or in a user-level program) is invoked. This routine should be able to inspect the code before it is actually bound into virtual memory. Given this, you can implement a few things:

Compute an MD5 hash of the code page, and keep track of which of the code pages you demand-page in you have seen before and which are "new."
Do some validation on new code pages. For example, look up those code pages in a "known threats" database to determine if the code page is associated with spyware, a worm, or some other malware.
Determine if the code page "belongs" to the process that is demand paging it in. You could imagine that when a product vendor ships software, it also ships a manifest of MD5 hashes of the code pages that are valid.
Use the system for a while, and quantify the rate at which you see "new" code pages as a function of how long you have been using the system. Also, measure the performance cost of doing this monitoring.
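The bookkeeping in the list above can be prototyped in user space before touching the kernel. The sketch below is only an illustration under stated assumptions: the 4096-byte page size, the `PageTracker` class, and the `pages_of` helper are hypothetical names, and the hook that would hand page contents to `observe` is not shown.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size; the real value depends on the architecture


class PageTracker:
    """Classify code pages by MD5 digest as 'threat', 'seen', or 'new'."""

    def __init__(self, known_threats=None):
        self.seen = set()
        self.known_threats = set(known_threats or ())
        self.new_page_log = []  # (timestamp, digest) for each first sighting

    def observe(self, page_bytes, timestamp):
        digest = hashlib.md5(page_bytes).hexdigest()
        if digest in self.known_threats:
            return "threat"
        if digest in self.seen:
            return "seen"
        self.seen.add(digest)
        self.new_page_log.append((timestamp, digest))
        return "new"


def pages_of(image, page_size=PAGE_SIZE):
    """Split a code image into fixed-size pages, zero-padding the last one."""
    for offset in range(0, len(image), page_size):
        yield image[offset:offset + page_size].ljust(page_size, b"\0")


# Hypothetical usage: a made-up "threat" page and a made-up two-page image.
bad_page = b"\x90" * PAGE_SIZE
tracker = PageTracker(known_threats={hashlib.md5(bad_page).hexdigest()})
image = b"A" * PAGE_SIZE + b"B" * 100
first_pass = [tracker.observe(p, t) for t, p in enumerate(pages_of(image))]
second_pass = [tracker.observe(p, t + 10) for t, p in enumerate(pages_of(image))]
```

The timestamps in `new_page_log` are what you would plot to get the rate of "new" pages over time; a vendor manifest check could be added as another set lookup keyed by process.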

Demand paging might not be the only way that new machine code is introduced into a system; for example, a Java JIT will emit code on-the-fly. Figure out how to keep track of these other sources of new code.

Contact: Steve Gribble

Unusable security

Implement usability attacks on common Web security mechanisms and assess how effective they are in practice. Classic end-to-end (E2E) security aims to prevent entities on the path from successfully attacking the confidentiality and authenticity of communications. With HTTPS (SSL), for example, I should be able to use the Web from any public WiFi hotspot without being vulnerable to unscrupulous hotspot operators. Except that none of this probably works in practice for usability reasons!

Consider this attack: the unscrupulous WiFi gateway downgrades HTTPS links in HTTP documents to plain HTTP, completing the HTTPS connection itself to get the content and relaying it to you via plain HTTP. Since you typically reach web sites via a public HTTP link, you have no solid foundation for an E2E chain-of-trust. So you complete your Amazon purchase without realizing you used HTTP instead of HTTPS, and there is no error indication, e.g., no certificate problems. Yet the WiFi gateway has captured your credit card details for its own use.
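When measuring such an attack, it may help to have a way to tell which links in a page were downgraded in transit. The following is a rough sketch, assuming you can obtain one copy of the page over a trusted path and one as delivered through the gateway; the function names and the example URLs are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the values of href, src, and action attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                self.urls.append(value)


def collect_urls(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


def downgraded_links(origin_html, received_html):
    """Return HTTPS URLs in the origin copy whose only counterpart in the
    received copy is the same URL with the scheme changed to plain HTTP."""
    received = set(collect_urls(received_html))
    found = []
    for url in collect_urls(origin_html):
        if url.startswith("https://"):
            plain = "http://" + url[len("https://"):]
            if url not in received and plain in received:
                found.append(url)
    return found


# Hypothetical pages: the second is what a downgrading gateway might deliver.
origin = ('<a href="https://shop.example/checkout">Buy</a>'
          '<a href="http://shop.example/help">Help</a>')
received = ('<a href="http://shop.example/checkout">Buy</a>'
            '<a href="http://shop.example/help">Help</a>')
changed = downgraded_links(origin, received)
```

This only catches rewriting of literal URLs in markup; links built by scripts would need a different approach.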

Implement and test this or another attack. The most difficult part of the project is to determine how to run a real study without abusing the participants; brainstorm this first. If you can solve it, I bet you can obtain a highly visible research result.

Contact: David Wetherall

802.11B Software radio

Software radio is an emerging technology in which most of the communications functions are written in software rather than cast in hardware -- your C code manipulates digitized signal representations that are transmitted or received by an antenna. It has the potential to implement new kinds of radios that are less constrained by traditional hardware implementation strategies and better adapted to their environment. And now it's possible: we have obtained a small number of USRP boards (http://home.ettus.com/usrp/usrp_guide.html) for experimentation.

The project is to develop 802.11B-like PHY and MAC layers as user-level C code that runs on this platform, slowed down by 100X to make the task tractable. The goal is to understand the systems issues (scheduling, resource consumption, division of functionality) and tradeoffs that enable new communication opportunities.
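To give a feel for what "PHY in software" means, here is a small sketch of direct-sequence spreading with the 11-chip Barker sequence used by the lowest 802.11b rates. It is a simplification under stated assumptions: bits map directly to +1/-1 symbols rather than being differentially encoded, there is no pulse shaping or synchronization, and the function names are hypothetical. It is written in Python for brevity; the project itself calls for C.

```python
# 11-chip Barker sequence used for DSSS spreading.
BARKER_11 = [1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1]


def spread(bits):
    """Map each bit to +1/-1 and multiply it by the Barker sequence."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in BARKER_11)
    return chips


def despread(chips):
    """Correlate each 11-chip block with the Barker sequence and take the sign."""
    n = len(BARKER_11)
    bits = []
    for i in range(0, len(chips) - n + 1, n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], BARKER_11))
        bits.append(1 if corr > 0 else 0)
    return bits


data = [1, 0, 1, 1, 0]
tx = spread(data)

# Flip two chips to mimic channel errors; the correlation still has the right sign.
rx = list(tx)
rx[0] = -rx[0]
rx[12] = -rx[12]
decoded = despread(rx)
```

Even this toy version shows where the work goes: every data bit costs eleven multiply-accumulates on receive, which is part of why slowing the radio down makes a software implementation tractable.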

Contact: David Wetherall

Multi-AP Wireless Clients

In 802.11 as it is commonly used, a client associates itself with an access point (AP) and uses that AP as its point of connection to the Internet. There are times, however, when it may be beneficial for a client to bind to multiple APs. This could increase reliability if the access link of one AP fails, improve performance by striping data across multiple Internet access links, or reduce jitter by masking short-term outages, particularly during handoffs as clients move. Modify the MadWifi (Atheros chipset) driver to allow a client to bind to multiple APs and experiment to evaluate the advantages. (A more limited alternative is to use two wireless cards in a laptop.)
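The striping and failover ideas can be explored with a toy scheduler before any driver work. This sketch assumes a hypothetical `Striper` class that assigns each outgoing packet to the next AP in round-robin order and skips APs currently marked as down; how link state is detected, and how packets are reordered at the far end, are left open.

```python
class Striper:
    """Round-robin packet striping across APs, skipping those marked down."""

    def __init__(self, links):
        self.links = list(links)
        self.up = {link: True for link in self.links}
        self.next_index = 0

    def set_up(self, link, is_up):
        """Record whether a link is currently usable."""
        self.up[link] = is_up

    def pick(self):
        """Return the next usable link, or None if every link is down."""
        for _ in range(len(self.links)):
            link = self.links[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.links)
            if self.up[link]:
                return link
        return None


# Hypothetical AP names.
striper = Striper(["ap1", "ap2"])
both_up = [striper.pick() for _ in range(4)]
striper.set_up("ap1", False)          # e.g. ap1's access link has failed
one_down = [striper.pick() for _ in range(3)]
striper.set_up("ap2", False)
none_up = striper.pick()
```

A weighted variant that favors the faster access link would be a natural next step once you have measurements.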

Contact: David Wetherall, Charlie Reis.

Uber-AP Home Gateway

Home wireless/cable/DSL gateway boxes from LinkSys and NetGear run Linux underneath, and there is an increasing body of knowledge on how to hack these boxes to install your own functionality. They are a cheap, disruptive technology. Get some of these boxes and build your own, improved AP of some kind to use at home. (We would prefer you to use the NetGear boxes since the wireless driver is open source.) Charlie Reis has ideas, similar to the multi-AP scenario above, about how to pair neighboring gateways for improved performance or reliability with legacy clients. You might have other ideas.

Contact: David Wetherall, Charlie Reis.

Idea generator

When all else fails, this is how to choose a topic