Steam-powered Turing Machine University of Washington Department of Computer Science & Engineering
 CSE573 Course Problem Sets
  CSE Home   About Us    Search    Contact Info 

Administrivia
 Home
 Using course email
 Email archive
 Policies
Content
 Topic Overview
 Slides & Reading
Assignments
 Reading Reports
 Problem sets
   
  • Problem Set 1 doc pdf Due 10pm October 11, 2008
    Sample Solutions doc pdf
  • Problem Set 2 doc pdf Due 8am October 28, 2008
    Sample Solutions doc pdf
  • Problem Set 3 doc pdf Due 4:40pm November 10, 2008
    Sample Solutions doc pdf

Mini-project ideas. I will give a few suggestions below, but if you have another idea, please feel free to run it by Dan (by Thursday).

  • One option would be to write a program to solve the counterfeit coin problem on the midterm which we discussed in class. See also the midterm solutions. Just solving that problem as posed on the midterm is ok. For extra credit, it would accept a variable number of coins as input, but the assumption that just on coin is counterfeit (heavy or light) may be built into the program. More extra credit for increased generalization (but no need to go crazy here). For experiments you could compare the run time with different heuristics or with and without a heuristic.
  • Build a DPLL and or a WalkSAT satisfiability solver. The book has plenty of information and page 235 has citations for further reading. For experiments, you could compare DPLL with WalkSAT or plot how they each scale as problems grow harder. You might consider "hard" problems (remember the clause to variable ration for random problems).
  • Build a spam filter using one or more machine learning algorithms. You can implement naive Bayes, decision trees, or look for the Weka ML package http://www.cs.waikato.ac.nz/ml/weka/ and try a few different algorithms. Experiments could compare the effect of different learning algorithms or use cross validation to measure the precision and recall of your filter.For more ideas, see problem 5 in http://www.cs.washington.edu/education/courses/cse473/04sp/hw/hw3/
    Look at assignment 3 below for data: http://www.cs.washington.edu/education/courses/cse473/04sp/hw/ and also click the 'complete assignment' and start reading at "Assignment 4"
    A Web search is likely to turn up better, larger and more comprehensive datasets so I'd probably advise that instead of my very onld one. For example, I found this URL which looks promising: http://mobblog.cs.ucl.ac.uk/2008/01/21/spam-dataset/ And here's a URL for the Fifth Conference on Email and Anti-Spam (CEAS 2008) http://www.ceas.cc/ There's also the 'Web Spam Challenge' which is likely different but might be interesting.
  • Write a program which learns Bayes nets. Start by assuming the structure is fixed (and input) and learn the parameters from data as in the previous problem set. Then try exploring the space of structures using local search. In addition to the text, a good reference is http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-95-06
Hand in a soft copy of your code along with instructions on how to run it. A page of description on how it works would also be appreciated and a page of evaluation (e.g. one or two experiments demonstrating scaling behavior or the effect of various switches). If you end up using third party code in your project, be sure to explain clearly what you used and what you wrote yourself.

Project reports:


CSE logo Department of Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX