Problem Set #6
Due: December 10th (Last Day of Class)
Reading: Chapter 18.
Assignment
Implement a decision tree learning algorithm and apply it to the following dataset: ftp://ftp.ics.uci.edu/pub/machine-learning-databases/mushroom/
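One common way to choose which attribute to split on is information gain. This is a minimal sketch of that idea, not a required design. It assumes each example is a tuple of single-character codes with the class label in position 0. The helper names `entropy` and `information_gain` and the toy rows are made up for illustration and are not taken from the dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, attr_index, label_index=0):
    """Reduction in label entropy obtained by splitting rows on one attribute.

    Assumption: rows are tuples and the class label sits at label_index.
    """
    labels = [row[label_index] for row in rows]
    # Group the class labels by the value the attribute takes in each row.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr_index], []).append(row[label_index])
    # Weighted average entropy of the partitions after the split.
    remainder = sum(
        len(part) / len(rows) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


# Hypothetical toy rows: (label, attribute 1, attribute 2).
toy_rows = [
    ("p", "x", "n"),
    ("p", "x", "y"),
    ("e", "b", "n"),
    ("e", "b", "y"),
]

# Pick the attribute index with the largest gain.
best = max(range(1, 3), key=lambda i: information_gain(toy_rows, i))
```

In the toy rows, attribute 1 separates the two classes perfectly while attribute 2 carries no information about the label, so a greedy learner would split on attribute 1 first.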
Writeup
Hand in a write-up that conforms to the following guidelines, along with your code (printed double-sided, two columns, landscape mode). Be clear and concise: the write-up is limited to one single-sided page in 12-point font, not including extra credit (which may go onto a second page).
- Describe how you handled missing attributes. (1 point)
- What is the termination criterion for your learning process? (1 point)
- Apply your learning algorithm to roughly 3/4 of the mushroom dataset. Clearly describe what is being learned (i.e., a Boolean formula in DNF that corresponds to your decision tree). Also, explain in English one of the rules that was learned. (3 points)
- Test your algorithm on the remaining 1/4 of the data and report the accuracy on the test set. (3 points)
- What was the accuracy on the training data? How does the training data accuracy compare with the test data accuracy? Briefly explain any differences you see. (2 points)
- Describe anything you would like considered for extra credit. For example, you could try the algorithm on another dataset from the UCI FTP site, or enhance the algorithm in various ways and report on experiments showing the impact on performance. (up to 5 EC points)
- Attach your code in the format mentioned above.
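The evaluation steps in the list above (a 3/4 vs. 1/4 split, accuracy, and reading the tree as a DNF formula) can be sketched as follows. This is only an illustration under stated assumptions. The function names are hypothetical. The tree representation (a leaf is a label string, an internal node is an `(attribute_name, {value: subtree})` tuple) is one possible choice. The toy tree is hand-written to show the output format and is not a learned result.

```python
import random


def split_rows(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatable splits
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(classify, rows, label_index=0):
    """Fraction of rows whose predicted label matches the true label."""
    if not rows:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for row in rows if classify(row) == row[label_index])
    return correct / len(rows)


def tree_to_dnf(tree, target, path=()):
    """Collect one conjunction per root-to-leaf path that ends in target.

    Assumed representation: a leaf is a label string; an internal node is
    a tuple (attribute_name, {attribute_value: subtree}).
    """
    if not isinstance(tree, tuple):  # leaf
        return [path] if tree == target else []
    attr_name, branches = tree
    terms = []
    for value, subtree in branches.items():
        terms.extend(tree_to_dnf(subtree, target, path + ((attr_name, value),)))
    return terms


def format_dnf(terms):
    """Render the conjunctions as a single OR of ANDs."""
    return " OR ".join(
        "(" + " AND ".join(f"{attr}={value}" for attr, value in term) + ")"
        for term in terms
    )


# Hand-written toy tree (hypothetical, for illustrating the format only).
toy_tree = ("odor", {
    "f": "p",
    "n": ("spore-print-color", {"r": "p", "k": "e"}),
    "a": "e",
})

formula = format_dnf(tree_to_dnf(toy_tree, "p"))
print(formula)  # (odor=f) OR (odor=n AND spore-print-color=r)
```

Each parenthesized term is one rule: for the toy tree, the second term reads "if the odor value is n and the spore print color value is r, predict class p". Running `accuracy` once on the training rows and once on the held-out rows gives the two numbers the write-up asks you to compare.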