
University of Washington Department of Computer Science & Engineering


CSE 573 – Artificial Intelligence – Autumn 2004

Mini Project 2


All groups: Please sign up to talk to Dan.

Project2 Reports page

Project 2 will be similar to project 1 in all respects, so please refer to the previous description for overview comments on what is expected. This description is much briefer.

As before, projects will be done in groups of two, but I want everyone to form a different group this time unless you have an exceptional circumstance and get my permission.

For project 2, I am even more open to groups proposing their own ideas. I would also like to encourage groups to meet with me early on to brainstorm about their projects. The suggestions below seem like good ones, however, and if they don't appeal to you, I hope they will at least give you ideas.

  1. Use several machine learning algorithms to learn a SPAM classifier. In 473 last spring, I had the students try three methods for this: decision trees, an ensemble of decision trees, and a Naive Bayes classifier. See problems 4 and 5 here. There are several interesting ways to improve upon the basic assignment. Read the following paper (Jason D. Rennie, Lawrence Shih, Jaime Teevan, David Karger: "Tackling the Poor Assumptions of Naive Bayes Text Classifiers." ICML 2003: 616-623, which is available here) and see if it can lead you to improvements on your classifier. (Note: Dan hasn't thought this through; the paper is high on his stack of papers to read, but he hasn't gotten to it yet.)
  2. The Placelab framework uses WiFi signatures to estimate a user's location in terms of longitude and latitude coordinates. This data is timestamped and logged periodically (e.g. every 2 seconds). Here is some sample data. Can you use ML and Bayesian techniques to predict higher level descriptions of behavior?

    You might use timestamp information and a clustering algorithm such as k-means (AIMA page 845, but see also page 725) to generate symbolic locations. Then perhaps you could learn a Markov model (or dynamic Bayesian network, or hierarchical MM, or hierarchical DBN, or...) to predict the user's behavior. You could also try smoothing using a relational Markov model as described by Sanghai et al. (The abstraction hierarchy might include terms like restaurants > cafes > Starbucks. An instance might be Starbucks-on-42-and-the-ave.) This is a hot area of research. Here are some papers:

  3. For students who have taken or are interested in computer vision, write a program that solves Captchas. These are a special type of Turing test designed to keep software robots from using up the resources of MSN, Yahoo, Gmail, and others. (See this overview news story or, best of all, the official Captcha site. See also problems for the visually impaired.)

    Alternatively, propose your own Captcha. Can you think of one that isn't visual? Can you generate a large number of tests?
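
For suggestion 2, the pipeline of clustering locations and then learning a Markov model might be sketched roughly as follows. This is only an illustration under several assumptions that are not part of the assignment: the trace is a made-up list of (latitude, longitude) readings rather than real Placelab data, plain Euclidean distance on raw degrees stands in for a proper geographic distance, the number of places k is given, and the function names (kmeans, visits, transition_probs) are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Basic k-means on (lat, lon) pairs; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct readings
    for _ in range(iters):
        labels = assign(points, centroids)
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return centroids, assign(points, centroids)


def visits(labels):
    """Collapse runs of identical consecutive labels into single visits."""
    seq = []
    for lab in labels:
        if not seq or seq[-1] != lab:
            seq.append(lab)
    return seq


def transition_probs(seq):
    """First-order Markov model: P(next place | current place) from counts."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(seq, seq[1:]):
        counts[cur][nxt] += 1
    return {cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for cur, c in counts.items()}


# Made-up trace (assumption): readings in time order near two places.
trace = [
    (47.65000, -122.30000), (47.65001, -122.30001), (47.65002, -122.29999),
    (47.66000, -122.31000), (47.66001, -122.31001),
    (47.65001, -122.30000), (47.65000, -122.30002),
    (47.66002, -122.31000), (47.66000, -122.30999), (47.66001, -122.31002),
]

centroids, labels = kmeans(trace, k=2)
places = visits(labels)            # sequence of symbolic locations
model = transition_probs(places)   # learned transition table
print("visit sequence:", places)
print("transition model:", model)
```

Collapsing consecutive readings into visits matters here because readings arrive every couple of seconds, so a model learned on the raw labels would mostly predict "stay where you are". A real project would also need to choose k, handle noisy readings, and use the timestamps (for example, time of day) as additional evidence.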

Deadlines
  1. Monday (November 15) at midnight: Email Miao with the name of your team, the names of the teammates, and a preliminary project plan (which project or direction you are considering).
  2. Between 11/15 and 12/3, I wish to meet with each group to discuss details; see the sign-up sheet.
  3. Monday (December 13) at 9:00am: Final report and code due. Email the code and report (.doc or .pdf) to Miao, and give each of us one printout.

Computer Science & Engineering Department
University of Washington
PO Box 352350

Seattle, WA 98195-2350 USA