Due Date: 4 June 2003, 6:30 pm
Turn-in procedure: Please email parag@cs.washington.edu before
class on June 4. Word, PostScript, PDF, HTML, and plain-text
formats are all fine.
Please use the subject "CSEP546: HW3 Submission", and in the body
of the message include your name and student ID.
You can also submit a hard copy of your work at the beginning of
class on June 4.
Note: All homework is to be done individually.
 Suppose the instance space is the square 0 < X < 8, 0 < Y < 8,
and you have a training set composed of positive examples at
(2, 2), (2, 4), (2, 6), (6, 2), (6, 4), and (6, 6), and negative
examples at (4, 2), (4, 4) and (4, 6).
 Draw the Voronoi diagram of this training set, using Euclidean
distance.
 Suppose you measure the error rate of the 3-nearest-neighbor
algorithm on this training set using leave-one-out cross-validation
(i.e., you take out each example in turn, and predict its class
using the other examples). What would the measured error rate be?
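A hand computation like this can be sanity-checked with a short script. This is a minimal sketch, assuming Euclidean distance and a simple majority vote; note that several of these points are equidistant from one another, so the measured rate can depend on how distance ties are broken (here, by the sort order of the training list).

```python
# Leave-one-out 3-nearest-neighbor on the nine training points above.
# Ties in distance are broken by Python's stable sort order.
from math import dist

points = [((2, 2), +1), ((2, 4), +1), ((2, 6), +1),
          ((6, 2), +1), ((6, 4), +1), ((6, 6), +1),
          ((4, 2), -1), ((4, 4), -1), ((4, 6), -1)]

def loo_knn_error(data, k=3):
    errors = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]          # leave one example out
        rest.sort(key=lambda p: dist(p[0], x))  # nearest first
        vote = sum(label for _, label in rest[:k])
        pred = +1 if vote > 0 else -1
        errors += (pred != y)
    return errors / len(data)

print(loo_knn_error(points))
```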
 Suppose there is no noise in this domain. What can you say
about the error rate of the nearest-neighbor algorithm in this
domain, in the limit of an infinite training set?
 Suppose you apply the 3-nearest-neighbor algorithm with backward
elimination. Which features would be eliminated (X, Y, both,
or neither)? Why?
 Consider a rule induction algorithm that stops after finding
m rules. Does the bias of this algorithm increase or decrease with m?
And the variance? Why?
 Consider learning a perceptron on nine-dimensional linearly
separable data. How many training examples do you need to
guarantee with 99% confidence that the learned perceptron has
true error of at most 10%?
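A standard starting point here (see, e.g., Chapter 7 of Mitchell's Machine Learning) is the VC-dimension sample-complexity bound, where epsilon is the error bound, delta the failure probability, and VC(H) the VC dimension of the hypothesis class; recall that linear threshold units over R^d have VC dimension d + 1:

```latex
m \;\ge\; \frac{1}{\epsilon}\left( 4\log_2\frac{2}{\delta} \;+\; 8\,\mathrm{VC}(H)\log_2\frac{13}{\epsilon} \right)
```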
 What is the VC dimension of the set of intervals over the real
numbers? (i.e., each possible hypothesis is an interval of the form
a < x < b, where a and b are real numbers.)
 Why must any subset of a frequent itemset also be frequent?
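The property being asked about is support monotonicity: every transaction that contains an itemset also contains each of its subsets, so dropping items can never decrease support. A toy check of this (the transaction database below is made up for illustration):

```python
from itertools import combinations

# Hypothetical toy transaction database (made up for illustration).
transactions = [{"bread", "milk"},
                {"bread", "milk", "eggs"},
                {"milk", "eggs"},
                {"bread", "eggs"}]

def support(itemset):
    """Fraction of transactions containing every item in itemset."""
    s = set(itemset)
    return sum(s <= t for t in transactions) / len(transactions)

item = {"bread", "milk", "eggs"}
# Every proper subset's support is at least the superset's support.
for r in range(1, len(item)):
    for sub in combinations(item, r):
        assert support(sub) >= support(item)
```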
 Han & Kamber, Exercise 6.4.
