There are K clusters C1,
, CK with means
m1,
, mK.
The least-squares error is defined as
Out of all possible partitions into K
clusters,
choose the one that minimizes D.
Why dont we just do this?
If we could, would we get
meaningful objects?