Due Date: 30 May 2001 6:30pm (The tenth (and last) week of class)
Turnin procedure:
Please email grimes@cs.washington.edu before class on 30 May.
I will accept Word documents in addition to Postscript, PDF, HTML, and plain-text files.
Note that zipping up Postscript files is a good idea if you have large figures or rasterized images.
Please use the subject "CSE592: HW3 Submission", and in the body of the message include
your name and student ID.
1. Which algorithm has higher variance: bagging or boosting? And which one
has higher bias? Why?
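The variance-reduction side of this question can be made concrete with a minimal bagging sketch in Python. The 1-nearest-neighbor base learner and the toy data below are illustrative assumptions, not part of the assignment:

```python
import random

def bagging_predict(train, x_query, n_models=25, seed=0):
    """Average the predictions of base models fit on bootstrap resamples.

    train: list of (x, y) pairs. The base learner here is a 1-nearest-
    neighbor predictor, chosen only because it is unstable (high-variance),
    which is the setting where bagging helps most.
    """
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        # Bootstrap: draw |train| examples with replacement.
        sample = [rng.choice(train) for _ in train]
        # Base model: predict the y of the nearest x in the bootstrap sample.
        nearest = min(sample, key=lambda p: abs(p[0] - x_query))
        preds.append(nearest[1])
    # Averaging the unstable base models reduces variance.
    return sum(preds) / len(preds)

# Hypothetical data, used only to exercise the sketch.
data = [(0, 0), (1, 2), (2, 3), (3, 3), (4, 0), (5, 1), (6, 0)]
print(bagging_predict(data, 2.4))
```

Boosting, by contrast, refits each new model to the residual errors of the ensemble so far, which tends to drive bias down at the cost of variance.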
2. Let the instance space be the rectangle {(x,y): 0 < x < 40, 0 < y < 30},
and let + and − represent positive and negative training examples,
respectively. Suppose you are given the following training set:
[Figure: plot of the training set over the rectangle, x on the horizontal
axis (0 to 40) and y on the vertical axis (0 to 30). Four positive (+)
examples are visible in the source; the exact coordinates and the negative
examples did not survive extraction.]
 a) Draw the Voronoi diagram corresponding to this training set,
using Euclidean distance.
 b) If this Voronoi diagram represents the true concept, and examples are
drawn with uniform probability from the rectangle {(x,y): 0 < x < 40,
0 < y < 30}, what is the probability of drawing a positive example?
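Since the figure did not survive extraction, the coordinates below are placeholders; once the real points are filled in, part (b) can be sanity-checked numerically. A 1-nearest-neighbor classifier labels each query point by the Voronoi cell it falls in, so a Monte Carlo estimate over the rectangle approximates the positive probability:

```python
import random

# Hypothetical training points (the original figure is lost); replace
# these with the actual coordinates from the assignment.
positives = [(10.0, 25.0), (10.0, 15.0), (20.0, 15.0), (10.0, 5.0)]
negatives = [(30.0, 25.0), (30.0, 5.0)]

def is_positive(x, y):
    """1-NN label: a query point takes the label of its Voronoi cell."""
    def d2(p):
        return (p[0] - x) ** 2 + (p[1] - y) ** 2
    return min(d2(p) for p in positives) <= min(d2(p) for p in negatives)

def estimate_positive_probability(n=100_000, seed=0):
    """Monte Carlo estimate over uniform draws from the rectangle."""
    rng = random.Random(seed)
    hits = sum(
        is_positive(rng.uniform(0, 40), rng.uniform(0, 30))
        for _ in range(n)
    )
    return hits / n

print(estimate_positive_probability())
```

The exact answer in part (b) is the total area of the positive Voronoi cells divided by the area of the rectangle (40 × 30 = 1200); the estimate above should converge to it as n grows.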
3. Suppose we have the following training set for y as a function of x,
where examples are in the form (x,y): {(0,0), (1,2), (2,3), (3,3),
(4,0), (5,1), (6,0)}. Suppose we model y = f(x) using locally weighted
linear regression, minimizing the squared error over the 2 nearest
neighbors. Draw a graph of the function y = f(x) implicitly induced.
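One way to sketch the induced function is to sample it on a grid. Note that with only the 2 nearest neighbors receiving weight, the least-squares line passes exactly through those two points, so the per-point weights do not affect the fit here; the sketch below (an illustration, not the required hand-drawn graph) exploits that:

```python
def f(x_query, data):
    """Predict y at x_query by fitting a least-squares line to the two
    nearest neighbors; with exactly two points the line interpolates them."""
    (x1, y1), (x2, y2) = sorted(data, key=lambda p: abs(p[0] - x_query))[:2]
    if x1 == x2:  # degenerate case: both neighbors share the same x
        return (y1 + y2) / 2
    slope = (y2 - y1) / (x2 - x1)
    return y1 + slope * (x_query - x1)

train = [(0, 0), (1, 2), (2, 3), (3, 3), (4, 0), (5, 1), (6, 0)]

# Sample the induced function on a grid to trace out its graph.
for x in [0.0, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.0]:
    print(x, f(x, train))
```

Between consecutive training x-values the prediction is piecewise linear, with jumps where the identity of the 2-nearest-neighbor set changes.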
4. Let H be the set of hypotheses of the form x = x_0 ∨ x = x_1, with
x_0, x_1 ∈ X (i.e., x_0 and x_1 are arbitrary members of the
instance space). What is the VC dimension of H?
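Shattering claims like this can be sanity-checked by brute force on a small finite stand-in for X. The sketch below enumerates every hypothesis (x_0, x_1) and tests whether some size-k point set realizes all 2^k labelings; it is a checking aid under the stated assumptions, not a substitute for the argument the question asks for:

```python
from itertools import combinations

def hypothesis_labels(points, h):
    """Label each point True iff it equals x_0 or x_1 of hypothesis h."""
    x0, x1 = h
    return tuple(p == x0 or p == x1 for p in points)

def shattered(points, X):
    """True iff H = {x = x_0 v x = x_1 : x_0, x_1 in X} shatters points,
    i.e. the hypotheses realize all 2^|points| labelings."""
    realized = {hypothesis_labels(points, (x0, x1)) for x0 in X for x1 in X}
    return len(realized) == 2 ** len(points)

X = list(range(6))  # a small stand-in instance space
for k in range(1, 4):
    some_set_shattered = any(shattered(s, X) for s in combinations(X, k))
    print(k, some_set_shattered)
```

Because every hypothesis labels at most two instances positive, the all-positive labeling of any larger point set can never be realized, which is what the brute-force check confirms.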
5. Question 6.3 in Han & Kamber, with modified part a):
   a) Find all frequent itemsets using Apriori.
   b) Same as Han & Kamber.
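For reference while working part a), here is a compact Apriori sketch in Python. The transaction list is a made-up example (the Han & Kamber table is not reproduced in this handout), so substitute the book's transactions and minimum support:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return all frequent itemsets as a dict {frozenset: support count}."""
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    current = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Candidate generation: join frequent k-itemsets, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        keys = list(level)
        joined = {a | b for a in keys for b in keys if len(a | b) == k + 1}
        current = [c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k))]
        k += 1
    return frequent

# Hypothetical transactions, for illustration only.
T = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
for itemset, count in sorted(apriori(T, min_support=3).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this toy data every singleton and every pair meets the support threshold of 3, while {a, b, c} appears in only 2 transactions and is pruned at the counting step.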
