Theory seminar (Tuesday May 2 1:30 -- 2:30 in CSE 403)


TIME: 11:00am--noon,  May 9, 2006

PLACE: CSE 503

TITLE:  Model free stability results in clustering
		     and
Intransitivity and machine learning -- some open problems

SPEAKER:  Marina Meila,  University of Washington

ABSTRACT:


This is a two part talk. The first (and longest) part will present
some new results on clustering. The second part (cca 15 minutes) will
describe some problems that have captured my interest recently.

Model free stability results in clustering:

If we have found a "good" clustering C of data set X, can we prove
that C is not far from the (unknown) best clustering C* of this data
set? Perhaps surprisingly, the answer to this question is sometimes
yes. We can show bounds on the distance( C, C* ) for two clustering
cost functions: the Normalized Cut and the squared distance cost of
K-means clustering. These bounds exist in the case when the data X
admits a "good" clustering for the given cost.

Intransitivity and machine learning -- some open problems: 

This reasearch deals with making a choice between m>2 alternatives,
like in multiway classification, or in decision making, from a set of
pairwise preferences between objects (or the outputs of a set of
binary classifiers). The graph of preferences is not necessarily
transitive. What is then the "best choice"? This is just one example
of how intransitivity can arise in multiway decision problems. I will
describe some other sources of intransitivity. While these sources
have been long identified, little has been done to construct
algorithms that "invert the models", i.e finding the hidden
combination of causes that produced a set of observed (intransitive)
preferences. [Joint work with Jeff Bilmes]