Due Date: 4 June 2003, 6:30 pm
Turn-in procedure: Please email parag@cs.washington.edu before
class on June 4. Word, PostScript, PDF, HTML, and plain-text
formats are all fine.
Please use the subject "CSEP546: HW3 Submission", and in the body
of the message include your name and student ID.
You can also submit a hard copy of your work at the beginning of
class on June 4.
Note: All homework is to be done individually.
 Suppose the instance space is the square 0 < X < 8, 0 < Y < 8,
and you have a training set composed of positive examples at
(2, 2), (2, 4), (2, 6), (6, 2), (6, 4), and (6, 6), and negative
examples at (4, 2), (4, 4) and (4, 6).
 Draw the Voronoi diagram of this training set, using Euclidean
distance.
 Suppose you measure the error rate of the 3-nearest-neighbor
algorithm on this training set using leave-one-out cross-validation
(i.e., you take out each example in turn, and predict its class
using the other examples). What would the measured error rate be?
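A hand computation like this can be sanity-checked with a short script. This is a minimal sketch, assuming Euclidean distance and a simple majority vote; note that several of these points are equidistant from one another, so the measured rate can depend on how distance ties are broken (here, by the sort order of the training list).

```python
# Leave-one-out 3-nearest-neighbor on the nine training points above.
# Ties in distance are broken by Python's stable sort order.
from math import dist

points = [((2, 2), +1), ((2, 4), +1), ((2, 6), +1),
          ((6, 2), +1), ((6, 4), +1), ((6, 6), +1),
          ((4, 2), -1), ((4, 4), -1), ((4, 6), -1)]

def loo_knn_error(data, k=3):
    errors = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]          # leave one example out
        rest.sort(key=lambda p: dist(p[0], x))  # nearest first
        vote = sum(label for _, label in rest[:k])
        pred = +1 if vote > 0 else -1
        errors += (pred != y)
    return errors / len(data)

print(loo_knn_error(points))
```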
 Suppose there is no noise in this domain. What can you say
about the error rate of the nearest-neighbor algorithm in this
domain, in the limit of an infinite training set?
 Suppose you apply the 3-nearest-neighbor algorithm with backward
elimination. Which features would be eliminated (X, Y, both,
or neither)? Why?
 Consider a rule induction algorithm that stops after finding
m rules. Does the bias of this algorithm increase or decrease with m?
And the variance? Why?
 Consider learning a perceptron on nine-dimensional linearly
separable data. How many training examples do you need to
guarantee with 99% confidence that the learned perceptron has
true error of at most 10%?
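A standard starting point here (see, e.g., Chapter 7 of Mitchell's Machine Learning) is the VC-dimension sample-complexity bound, where epsilon is the error bound, delta the failure probability, and VC(H) the VC dimension of the hypothesis class; recall that linear threshold units over R^d have VC dimension d + 1:

```latex
m \;\ge\; \frac{1}{\epsilon}\left( 4\log_2\frac{2}{\delta} \;+\; 8\,\mathrm{VC}(H)\log_2\frac{13}{\epsilon} \right)
```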
 What is the VC dimension of the set of intervals over the real
numbers? (i.e., each possible hypothesis is an interval of the form
a < x < b, where a and b are real numbers.)
 Why must any subset of a frequent itemset also be frequent?
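The property being asked about is support monotonicity: every transaction that contains an itemset also contains each of its subsets, so dropping items can never decrease support. A toy check of this (the transaction database below is made up for illustration):

```python
from itertools import combinations

# Hypothetical toy transaction database (made up for illustration).
transactions = [{"bread", "milk"},
                {"bread", "milk", "eggs"},
                {"milk", "eggs"},
                {"bread", "eggs"}]

def support(itemset):
    """Fraction of transactions containing every item in itemset."""
    s = set(itemset)
    return sum(s <= t for t in transactions) / len(transactions)

item = {"bread", "milk", "eggs"}
# Every proper subset's support is at least the superset's support.
for r in range(1, len(item)):
    for sub in combinations(item, r):
        assert support(sub) >= support(item)
```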
 Han & Kamber, Exercise 6.4.
