1
Segmentation and Clustering
  • Today’s Readings
    • Forsyth & Ponce, Chapter 7
    • (plus lots of optional references in the slides)
2
From images to objects
  • What Defines an Object?
    • Subjective problem, but has been well-studied
    • Gestalt Laws seek to formalize this
      • proximity, similarity, continuation, closure, common fate
      • see notes by Steve Joordens, U. Toronto
3
Extracting objects
  • How could this be done?
4
Image Segmentation
  • Many approaches proposed
    • cues:  color, regions, contours
    • automatic vs. user-guided
    • no clear winner
    • we’ll consider several approaches today
5
Intelligent Scissors (demo)
6
Intelligent Scissors [Mortensen 95]
  • Approach answers a basic question
    • Q:  how to find a path from seed to mouse that follows object boundary as closely as possible?


7
Intelligent Scissors
  • Basic Idea
    • Define edge score for each pixel
      • edge pixels have low cost
    • Find lowest cost path from seed to mouse
8
Path Search (basic idea)
  • Graph Search Algorithm
    • Computes minimum cost path from seed to all other pixels
9
How does this really work?
  • Treat the image as a graph
10
Defining the costs
  • Treat the image as a graph
11
Defining the costs
  • The link cost c can be computed using a cross-correlation (edge) filter
    • assume the filter is centered at p

  • Also typically scale c by the link's length
    • set c = (max − |filter response|)
      • where max = maximum |filter response| over all pixels in the image
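The cost above can be sketched in a few lines. This is a minimal illustration only: a plain finite-difference gradient magnitude stands in for the paper's cross-correlation filters, and the test image is an assumption.

```python
import numpy as np

def edge_costs(image):
    """Per-pixel cost c: strong edges get LOW cost, so the min-cost path
    hugs the object boundary. A finite-difference gradient magnitude
    stands in for the paper's cross-correlation filter response here."""
    gy, gx = np.gradient(image.astype(float))   # d/drow, d/dcol
    response = np.hypot(gx, gy)                 # |filter response|
    return response.max() - response            # c = max - |response|

# Vertical step edge: cost along the edge is lower than in flat regions.
img = np.zeros((5, 5))
img[:, 2:] = 1.0
c = edge_costs(img)
assert c[:, 2].mean() < c[:, 0].mean()
```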
13
Dijkstra’s shortest path algorithm
17
Dijkstra’s shortest path algorithm
  • Properties
    • Computes the minimum-cost path from the seed to every node in the graph; this set of minimum paths forms a tree
    • Running time, with N pixels:
      • O(N²) if you use an active list
      • O(N log N) if you use a priority queue (heap)
      • a fraction of a second for a typical (640x480) image
    • Once the tree is computed, we can extract the optimal path from any point back to the seed in O(N) time
      • so it runs in real time as the mouse moves
    • What happens when the user specifies a new seed?
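A minimal sketch of the graph search described above, using a binary heap for the O(N log N) behavior. The graph encoding (a `neighbors` function yielding `(node, cost)` pairs) is an illustrative choice, not the paper's data structure.

```python
import heapq

def dijkstra(seed, neighbors):
    """Min-cost path from the seed to every node; the back-pointers in
    `prev` form the tree of optimal paths."""
    dist = {seed: 0.0}
    prev = {}
    heap = [(0.0, seed)]                    # priority queue keyed on cost
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                        # stale entry, already improved
        for v, w in neighbors(u):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def extract_path(prev, seed, node):
    """Follow back-pointers from any node to the seed -- the cheap step
    that can run in real time as the mouse moves."""
    path = [node]
    while path[-1] != seed:
        path.append(prev[path[-1]])
    return path[::-1]
```

On a toy 4-node graph with link costs, `extract_path` recovers the cheapest seed-to-mouse route from the precomputed tree.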
18
Segmentation by min (s-t) cut [Boykov 2001]
  • Graph
    • node for each pixel, link between neighboring pixels
    • specify a few pixels as foreground and background
      • create an infinite cost link from each bg pixel to the “t” node
      • create an infinite cost link from each fg pixel to the “s” node
    • compute min cut that separates s from t
    • how to define link cost between neighboring pixels?
19
Grabcut    [Rother et al., SIGGRAPH 2004]
20
Is user-input required?
  • Our visual system is proof that automatic methods are possible
    • classical image segmentation methods are automatic






  • Argument for user-directed methods?
    • only user knows desired scale/object of interest
21
Automatic graph cut [Shi & Malik]
  • Fully-connected graph
    • node for every pixel
    • link between every pair of pixels, p,q
    • cost cpq for each link
      • cpq measures similarity
        • similarity decreases as the difference in color and position grows
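As one concrete (assumed) choice for cpq, a Gaussian fall-off in color distance and spatial distance is common; the functional form and the sigma values here are illustrative, not the paper's exact weights.

```python
import numpy as np

def affinity(p_color, q_color, p_pos, q_pos, sigma_c=10.0, sigma_x=5.0):
    """Link cost c_pq between pixels p and q: near 1 for pixels that are
    similar in color and close together, falling toward 0 as either
    distance grows."""
    dc = np.linalg.norm(np.subtract(p_color, q_color))   # color difference
    dx = np.linalg.norm(np.subtract(p_pos, q_pos))       # spatial distance
    return float(np.exp(-dc**2 / (2 * sigma_c**2)) *
                 np.exp(-dx**2 / (2 * sigma_x**2)))
```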
22
Segmentation by Graph Cuts
  • Break Graph into Segments
    • Delete links that cross between segments
    • Easiest to break links that have low cost (similarity)
      • similar pixels should be in the same segments
      • dissimilar pixels should be in different segments

23
Cuts in a graph
  • Link Cut
    • set of links whose removal makes a graph disconnected
    • cost of a cut: the sum of the costs of its links
24
But min cut is not always the best cut...
25
Cuts in a graph
26
Interpretation as a Dynamical System
  • Treat the links as springs and shake the system
    • elasticity proportional to cost
    • vibration “modes” correspond to segments
      • can compute these by solving an eigenvector problem
      • http://www.cis.upenn.edu/~jshi/papers/pami_ncut.pdf
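The eigenvector computation can be sketched as follows. This uses the standard symmetric-normalized-Laplacian reduction of the generalized problem (D − W) y = λ D y so plain numpy applies, and a median threshold as one simple discretization; neither detail is necessarily the paper's.

```python
import numpy as np

def ncut_split(W):
    """Two-way split of a graph with symmetric affinity matrix W, from
    the second-smallest generalized eigenvector of (D - W) y = lambda D y."""
    d = W.sum(axis=1)
    D_isqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = D_isqrt @ (np.diag(d) - W) @ D_isqrt   # normalized Laplacian
    vals, vecs = np.linalg.eigh(L_sym)             # ascending eigenvalues
    y = D_isqrt @ vecs[:, 1]                       # second "vibration mode"
    return y > np.median(y)                        # threshold into 2 segments
```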
28
Color Image Segmentation
29
Extension to Soft Segmentation
30
Histogram-based segmentation
  • Goal
    • Break the image into K regions (segments)
    • Solve this by reducing the number of colors to K and mapping each pixel to the closest color
32
Clustering
  • How to choose the representative colors?
    • This is a clustering problem!
33
Break it down into subproblems
  • Suppose I tell you the cluster centers ci
    • Q:  how to determine which points to associate with each ci?
34
K-means clustering
  • K-means clustering algorithm
    1. Randomly initialize the cluster centers, c1, ..., cK
    2. Given cluster centers, determine the points in each cluster
       • For each point p, find the closest ci; put p into cluster i
    3. Given the points in each cluster, solve for ci
       • Set ci to be the mean of the points in cluster i
    4. If any ci has changed, repeat from Step 2


  • Java demo:  http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html


  • Properties
    • Will always converge to some solution
    • But that solution can be a “local minimum”
      • it does not always find the global minimum of the objective function (the sum of squared distances from each point to its assigned center)
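The steps above can be sketched directly; the seeded random initialization, the empty-cluster guard, and the convergence test are illustrative choices.

```python
import numpy as np

def kmeans(points, k, iters=100, rng=None):
    """K-means as on the slide: assign each point to its nearest center,
    recompute each center as the mean of its points, repeat."""
    if rng is None:
        rng = np.random.default_rng(0)            # seeded for reproducibility
    # Step 1: randomly initialize the cluster centers
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Step 2: put each point into the cluster of its closest center
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # Step 3: set each center to the mean of its cluster's points
        new = np.array([points[labels == i].mean(axis=0)
                        if np.any(labels == i) else centers[i]
                        for i in range(k)])
        # Step 4: stop once no center moves
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```

On two well-separated blobs this converges to the obvious clustering; with an unlucky initialization on harder data it can stop at a local minimum, as the slide warns.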

35
K-Means++
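The usual k-means++ seeding rule (pick the first center uniformly at random, then pick each later center with probability proportional to D², the squared distance to the nearest center chosen so far) can be sketched as:

```python
import numpy as np

def kmeanspp_init(points, k, rng=None):
    """k-means++ seeding: spread-out initial centers for k-means,
    sampled proportionally to squared distance D^2."""
    if rng is None:
        rng = np.random.default_rng(0)
    centers = [points[rng.integers(len(points))]]  # first center: uniform
    for _ in range(k - 1):
        # D^2 to the nearest already-chosen center, per point
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        idx = rng.choice(len(points), p=d2 / d2.sum())
        centers.append(points[idx])
    return np.array(centers)
```

These centers then replace the purely random initialization in Step 1 of k-means, which provably improves the expected objective.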
36
Probabilistic clustering
  • Basic questions
    • what’s the probability that a point x is in cluster m?
    • what’s the shape of each cluster?
  • K-means doesn’t answer these questions


  • Basic idea
    • instead of treating the data as a bunch of points, assume that they are all generated by sampling a continuous function
    • This function is called a generative model
      • defined by a vector of parameters θ

37
Mixture of Gaussians
  • One generative model is a mixture of Gaussians (MOG)
    • K Gaussian blobs with means μ_b, covariance matrices V_b, dimension d
      • blob b is defined by the Gaussian density
        P(x | μ_b, V_b) = exp(−(x − μ_b)ᵀ V_b⁻¹ (x − μ_b) / 2) / ((2π)^(d/2) |V_b|^(1/2))
    • blob b is selected with probability α_b
    • the likelihood of observing x is a weighted mixture of Gaussians:
        P(x | θ) = Σ_b α_b P(x | μ_b, V_b)
    • where θ = (α_1, μ_1, V_1, ..., α_K, μ_K, V_K) collects all the blob parameters
38
Expectation maximization (EM)
  • Goal
    • find blob parameters θ that maximize the likelihood function L(θ) = ∏ᵢ P(xᵢ | θ) over the observed points xᵢ

  • Approach:
    • E step:  given current guess of blobs, compute ownership of each point
    • M step:  given ownership probabilities, update blobs to maximize likelihood function
    • repeat until convergence


39
EM details
  • E-step
    • compute probability that point x is in blob i, given current guess of θ




  • M-step
    • compute probability that blob b is selected



    • mean of blob b



    • covariance of blob b
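Putting the E and M steps together for a 1-D mixture, as a minimal sketch. The quantile-based initialization is an assumption added for robustness, not part of the algorithm, and a real implementation would also test for convergence rather than run a fixed number of iterations.

```python
import numpy as np

def gaussian(x, mu, var):
    """1-D Gaussian density."""
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def em_mog_1d(x, k=2, iters=50):
    """EM for a 1-D mixture of Gaussians."""
    pi = np.full(k, 1.0 / k)                       # selection probs alpha_b
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))  # spread-out initial means
    var = np.full(k, x.var())                      # broad initial variances
    for _ in range(iters):
        # E-step: ownership q[i, b] = P(blob b | point x_i, current theta)
        q = pi * gaussian(x[:, None], mu, var)
        q /= q.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances from ownerships
        Nb = q.sum(axis=0)
        pi = Nb / len(x)
        mu = (q * x[:, None]).sum(axis=0) / Nb
        var = (q * (x[:, None] - mu) ** 2).sum(axis=0) / Nb
    return pi, mu, var
```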
40
EM demo




  • http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html


41
Applications of EM
  • Turns out this is useful for all sorts of problems
    • any clustering problem
    • any model estimation problem
    • missing data problems
    • finding outliers
    • segmentation problems
      • segmentation based on color
      • segmentation based on motion
      • foreground/background separation
    • ...


42
Problems with EM
  • Local minima
    • EM only finds a local maximum of the likelihood; even k-means (a special case) is NP-hard to solve optimally for k = 2


  • Need to know number of segments
    • solutions: AIC, BIC, Dirichlet process mixture


  • Need to choose generative model




43
Finding Modes in a Histogram
  • How Many Modes Are There?
    • Easy to see, hard to compute


44
Mean Shift [Comaniciu & Meer]
  • Iterative Mode Search
    1. Initialize a random seed point and window W
    2. Calculate the center of gravity (the “mean”) of the points in W
    3. Translate the search window to the mean
    4. Repeat Steps 2-3 until convergence
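The mode search above can be sketched with a flat (uniform) kernel; the radius value and the flat kernel are illustrative assumptions (the method supports general kernels), and the sketch assumes the window always contains at least one point.

```python
import numpy as np

def mean_shift_mode(points, seed, radius=2.0, iters=100):
    """Iterative mode search: move a window of the given radius to the
    mean of the points inside it until it stops moving."""
    center = np.asarray(seed, dtype=float)
    for _ in range(iters):
        in_window = np.linalg.norm(points - center, axis=1) < radius
        mean = points[in_window].mean(axis=0)   # center of gravity of W
        if np.allclose(mean, center):           # converged to a mode
            break
        center = mean                           # translate window to mean
    return center
```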
45
Mean-Shift
  • Approach
    • Initialize a window around each point
    • Track where it shifts; the mode it converges to determines which segment the point is in
    • Multiple points will shift to the same segment
46
Mean-shift for image segmentation
  • Useful to take into account spatial information
    • instead of (R, G, B), run in (R, G, B, x, y) space
    • D. Comaniciu and P. Meer, “Mean shift analysis and applications,” Proc. 7th International Conference on Computer Vision (ICCV), Kerkyra, Greece, September 1999, pp. 1197-1203.
      • http://www.caip.rutgers.edu/riul/research/papers/pdf/spatmsft.pdf
47
Choosing Exemplars (Medoids)
48
Taxonomy of Segmentation Methods
49
References

    • Mortensen and Barrett, “Intelligent Scissors for Image Composition,” Proc. SIGGRAPH 1995.
    • Boykov and Jolly, “Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D images,” Proc. ICCV, 2001.
    • Shi and Malik, “Normalized Cuts and Image Segmentation,” Proc. CVPR 1997.
    • Comaniciu and Meer, “Mean shift analysis and applications,” Proc. ICCV 1999.