K-Means++
Can we prevent arbitrarily bad local minima?
1. Choose first center uniformly at random from the data points.
2. Pick new center p from the data points with prob. proportional to:
   D(p)^2 = squared distance from p to its nearest already-chosen center
   (contribution of p to total error)
3. Repeat step 2 until k centers are chosen.
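
The seeding steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names kmeanspp_seed and sq_dist are hypothetical, points are assumed to be tuples of floats, and the error is assumed to be the usual sum of squared Euclidean distances to the nearest center.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers by D^2 sampling (hypothetical helper name)."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    # Step 1: first center uniformly at random.
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest chosen center,
    # i.e. the contribution of points[i] to the total error.
    d2 = [sq_dist(p, centers[0]) for p in points]
    # Step 3: repeat until k centers are chosen.
    while len(centers) < k:
        if sum(d2) == 0:
            # Assumption: if every point coincides with a center, fall back
            # to a uniform choice so the loop still terminates.
            new = rng.choice(points)
        else:
            # Step 2: sample a point with probability d2[i] / sum(d2).
            new = rng.choices(points, weights=d2, k=1)[0]
        centers.append(new)
        # Only the new center can reduce a point's nearest-center distance.
        d2 = [min(d, sq_dist(p, new)) for p, d in zip(points, d2)]
    return centers

# Usage: two distinct locations, one of them duplicated.
pts = [(0.0, 0.0), (0.0, 0.0), (10.0, 10.0)]
print(kmeanspp_seed(pts, 2, random.Random(0)))
```

Points that already coincide with a chosen center have weight zero, so they are not picked again while any point still has nonzero error. The returned centers are then used as the starting point for ordinary k-means iterations.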
expected error = O(log k) * optimal
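
Written out, the bound from the cited paper is as follows. The symbols are notation assumed here rather than taken from the slide: \phi is the total error (sum of squared distances from each point to its nearest center) after seeding, and \phi_{OPT} is the error of the optimal k centers. The expectation is over the random choices made during seeding.

```latex
\mathbb{E}[\phi] \;\le\; 8\,(\ln k + 2)\,\phi_{\mathrm{OPT}}
```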
Arthur & Vassilvitskii 2007