|
1
|
- Today’s Readings
- Forsyth & Ponce, Chapter 7
- (plus lots of optional references in the slides)
|
|
2
|
- What Defines an Object?
- Subjective problem, but has been well-studied
- Gestalt Laws seek to formalize this
- proximity, similarity, continuation, closure, common fate
- see notes by Steve Joordens, U. Toronto
|
|
4
|
- Many approaches proposed
- cues: color, regions, contours
- automatic vs. user-guided
- no clear winner
- we’ll consider several approaches today
|
|
6
|
- Approach answers a basic question
- Q: how to find a path from seed to mouse that follows object boundary as closely as possible?
|
|
7
|
- Basic Idea
- Define an edge cost for each pixel
- edge pixels have low cost
- Find lowest cost path from seed to mouse
|
|
8
|
- Graph Search Algorithm
- Computes minimum cost path from seed to all other pixels
|
|
9
|
- Treat the image as a graph
|
|
11
|
- c can be computed using a cross-correlation filter
- assume it is centered at p
- also typically scale c by the link's length
- set c = (max - |filter response|)
- where max = maximum |filter response| over all pixels in the image
|
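As an illustration of the cost definition above, here is a minimal sketch that uses a gradient-magnitude stand-in for the cross-correlation edge filter (the actual filter may differ; the function name is illustrative):

```python
import numpy as np

def edge_cost(image):
    """Per-pixel edge cost: low where the filter response is high.
    Central differences serve as a simple proxy for an edge filter;
    diagonal links would additionally be scaled by sqrt(2)."""
    gy, gx = np.gradient(image.astype(float))
    response = np.hypot(gx, gy)           # |filter response|
    return response.max() - response      # c = max - |filter response|
```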
|
17
|
- Properties
- It computes the minimum cost path from the seed to every node in the graph. This set of minimum-cost paths is represented as a tree
- Running time, with N pixels:
- O(N²) time if you use an active list
- O(N log N) if you use an active priority queue (heap)
- takes a fraction of a second for a typical (640x480) image
- Once this tree is computed, we can extract the optimal path from any point back to the seed in O(N) time
- it runs in real time as the mouse moves
- What happens when the user specifies a new seed?
|
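The graph search above can be sketched as Dijkstra's algorithm on the pixel grid. A minimal sketch, assuming a 4-connected grid and a per-pixel entry cost (names are illustrative):

```python
import heapq

def min_cost_paths(cost, seed):
    """Dijkstra from `seed` over a 4-connected grid.
    cost[r][c] is the cost of entering pixel (r, c).  Returns distances
    and the predecessor map, which encodes the tree of minimum paths."""
    H, W = len(cost), len(cost[0])
    dist = {seed: 0.0}
    prev = {seed: None}
    heap = [(0.0, seed)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale heap entry
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < H and 0 <= nc < W:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist, prev

def extract_path(prev, p):
    """Walk predecessor links from p back to the seed (the O(N) step)."""
    path = []
    while p is not None:
        path.append(p)
        p = prev[p]
    return path[::-1]
```

Once `prev` is computed for a seed, `extract_path` can be re-run for every mouse position in real time.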
|
18
|
- Graph
- node for each pixel, link between neighboring pixels
- specify a few pixels as foreground and background
- create an infinite cost link from each background pixel to the “t” node
- create an infinite cost link from each foreground pixel to the “s” node
- compute the min cut that separates s from t
- how to define link cost between neighboring pixels?
|
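The s-t min cut above can be found with any max-flow algorithm. A self-contained Edmonds-Karp sketch on a dict-of-dicts capacity graph (a toy illustration, not the specialized fast solvers interactive graph-cut systems use):

```python
from collections import deque

def min_cut(cap, s, t):
    """Edmonds-Karp max flow; by max-flow/min-cut duality the saturated
    edges leaving the returned source side form the minimum cut."""
    # residual graph with explicit reverse edges
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u in cap:
        for v in cap[u]:
            res.setdefault(v, {}).setdefault(u, 0.0)
    flow = 0.0
    while True:
        # BFS for a shortest augmenting path
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        # bottleneck capacity along the path
        b, v = float("inf"), t
        while parent[v] is not None:
            b = min(b, res[parent[v]][v])
            v = parent[v]
        # augment along the path
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= b
            res[v][u] += b
            v = u
        flow += b
    # nodes still reachable from s lie on the source side of the cut
    return flow, set(parent)
```

Seed pixels get effectively infinite-capacity links to s or t, so the cut can never separate them from their terminal.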
|
20
|
- Our visual system is proof that automatic methods are possible
- classical image segmentation methods are automatic
- Argument for user-directed methods?
- only user knows desired scale/object of interest
|
|
21
|
- Fully-connected graph
- node for every pixel
- link between every pair of pixels, p,q
- cost cpq for each link
- cpq measures similarity
- similarity is inversely proportional to difference in color and position
|
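One common concrete choice for c_pq (the Gaussian affinity used by normalized cuts, among others) multiplies a color term and a position term; the sigma values below are illustrative, not tuned:

```python
import numpy as np

def affinity(p, q, sigma_color=10.0, sigma_pos=20.0):
    """Gaussian affinity c_pq: near 1 for pixels with similar color and
    nearby position, decaying toward 0 as either difference grows.
    Each pixel is a (color, position) pair of sequences."""
    color_p, pos_p = p
    color_q, pos_q = q
    dc = np.sum((np.asarray(color_p, float) - np.asarray(color_q, float)) ** 2)
    dx = np.sum((np.asarray(pos_p, float) - np.asarray(pos_q, float)) ** 2)
    return np.exp(-dc / (2 * sigma_color ** 2)) * np.exp(-dx / (2 * sigma_pos ** 2))
```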
|
22
|
- Break Graph into Segments
- Delete links that cross between segments
- Easiest to break links that have low cost (similarity)
- similar pixels should be in the same segment
- dissimilar pixels should be in different segments
|
|
23
|
- Link Cut
- set of links whose removal makes a graph disconnected
- cost of a cut: the sum of the costs of the links that are cut
|
|
26
|
- Treat the links as springs and shake the system
- elasticity proportional to cost
- vibration “modes” correspond to segments
- can compute these by solving an eigenvector problem
- http://www.cis.upenn.edu/~jshi/papers/pami_ncut.pdf
|
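For a small affinity matrix W the eigenvector computation above can be sketched with dense linear algebra (real systems use sparse eigensolvers; the two-way split by sign is the simplest recursion step):

```python
import numpy as np

def ncut_bipartition(W):
    """Two-way normalized cut: solve (D - W) y = lambda D y via the
    symmetric normalized Laplacian and split on the sign of the
    second-smallest eigenvector (the lowest nontrivial "mode")."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.diag(d) - W                    # graph Laplacian
    M = D_inv_sqrt @ L @ D_inv_sqrt       # normalized Laplacian
    vals, vecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    y = D_inv_sqrt @ vecs[:, 1]           # second-smallest eigenvector
    return y > 0                          # boolean segment labels
```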
|
30
|
- Goal
- Break the image into K regions (segments)
- Solve this by reducing the number of colors to K and mapping each pixel to the closest color
|
|
32
|
- How to choose the representative colors?
- This is a clustering problem!
|
|
33
|
- Suppose I tell you the cluster centers ci
- Q: how to determine which points to associate with each ci?
|
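The answer is to assign each point to its nearest center, which can be sketched in a couple of lines of NumPy:

```python
import numpy as np

def assign_clusters(points, centers):
    """Each point joins the cluster of its nearest center (Euclidean).
    points: (n, d) array; centers: (k, d) array; returns n labels."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)
```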
|
34
|
- K-means clustering algorithm
- 1. Randomly initialize the cluster centers, c1, ..., cK
- 2. Given cluster centers, determine the points in each cluster
- For each point p, find the closest ci. Put p into cluster i
- 3. Given the points in each cluster, solve for ci
- Set ci to be the mean of the points in cluster i
- 4. If any ci has changed, repeat Step 2
- Java demo: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
- Properties
- Will always converge to some solution
- Can be a “local minimum”
- does not always find the global minimum of the objective function
|
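The algorithm above (Lloyd's iteration) as a minimal NumPy sketch; the random initialization and stopping test are the simplest choices, not the only ones:

```python
import numpy as np

def kmeans(points, k, iters=100, seed=0):
    """K-means: assign points to nearest center, re-estimate means,
    repeat until the centers stop moving.  Converges, but possibly
    to a local minimum of the objective."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Step 2: put each point into the cluster of its closest center
        d = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 3: set each center to the mean of its points
        new = np.array([points[labels == i].mean(axis=0)
                        if np.any(labels == i) else centers[i]
                        for i in range(k)])
        # Step 4: stop once no center has changed
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```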
|
36
|
- Basic questions
- what’s the probability that a point x is in cluster m?
- what’s the shape of each cluster?
- K-means doesn’t answer these questions
- Basic idea
- instead of treating the data as a bunch of points, assume that they are all generated by sampling a continuous function
- This function is called a generative model
- defined by a vector of parameters θ
|
|
37
|
- One generative model is a mixture of Gaussians (MOG)
- K Gaussian blobs with means μ_b, covariance matrices V_b, dimension d
- blob b is selected with probability α_b
- the likelihood of observing x is a weighted mixture of Gaussians:
- P(x | θ) = Σ_b α_b N(x; μ_b, V_b), where θ = {α_b, μ_b, V_b}
|
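The mixture likelihood can be written directly from the definition; a minimal sketch for a d-dimensional point x (lists of per-blob parameters, names illustrative):

```python
import numpy as np

def mog_likelihood(x, weights, means, covs):
    """P(x | theta) = sum_b alpha_b * N(x | mu_b, V_b) for a
    mixture of K Gaussian blobs in d dimensions."""
    x = np.asarray(x, float)
    d = x.shape[0]
    p = 0.0
    for a, mu, V in zip(weights, means, covs):
        diff = x - mu
        norm = 1.0 / np.sqrt(((2 * np.pi) ** d) * np.linalg.det(V))
        p += a * norm * np.exp(-0.5 * diff @ np.linalg.inv(V) @ diff)
    return p
```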
|
38
|
- Goal
- find blob parameters θ that maximize the likelihood function
- Approach:
- E step: given current guess of blobs, compute ownership of each point
- M step: given ownership probabilities, update blobs to maximize likelihood function
- repeat until convergence
|
|
39
|
- E-step
- compute the probability that point x is in blob b, given the current guess of θ
- M-step
- compute the probability that blob b is selected
- mean of blob b
- covariance of blob b
|
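The E and M steps above, sketched for a 1-D mixture; the quantile-based initialization and fixed iteration count are illustrative simplifications:

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """EM for a 1-D mixture of Gaussians.
    E step: ownership q[i, b] = P(blob b | x_i).
    M step: re-estimate mixing weights, means, variances."""
    mu = np.quantile(x, np.linspace(0, 1, k))   # spread initial means
    var = np.full(k, np.var(x))
    alpha = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E step: per-blob weighted Gaussian density at each point
        pdf = (alpha / np.sqrt(2 * np.pi * var) *
               np.exp(-0.5 * (x[:, None] - mu) ** 2 / var))
        q = pdf / pdf.sum(axis=1, keepdims=True)
        # M step
        nb = q.sum(axis=0)
        alpha = nb / len(x)                     # P(blob b is selected)
        mu = (q * x[:, None]).sum(axis=0) / nb  # mean of blob b
        var = (q * (x[:, None] - mu) ** 2).sum(axis=0) / nb  # variance
    return alpha, mu, var
```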
|
40
|
- http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html
|
|
41
|
- Turns out this is useful for all sorts of problems
- any clustering problem
- any model estimation problem
- missing data problems
- finding outliers
- segmentation problems
- segmentation based on color
- segmentation based on motion
- foreground/background separation
- ...
|
|
42
|
- Local minima
- k-means is NP-hard even with k=2
- Need to know number of segments
- solutions: AIC, BIC, Dirichlet process mixture
- Need to choose generative model
|
|
43
|
- How Many Modes Are There?
- Easy to see, hard to compute
|
|
44
|
- Iterative Mode Search
- Initialize random seed, and window W
- Calculate center of gravity (the “mean”) of W:
- Translate the search window to the mean
- Repeat Step 2 until convergence
|
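The mode-search loop above, sketched with a flat (uniform) kernel; the window radius is an illustrative parameter:

```python
import numpy as np

def mean_shift_mode(points, start, radius=2.0, iters=100):
    """Iterative mode search: repeatedly translate the window W to the
    center of gravity (the mean) of the points inside it, until it
    stops shifting.  Assumes the start lies near some data."""
    x = np.asarray(start, float)
    for _ in range(iters):
        in_window = np.linalg.norm(points - x, axis=1) < radius
        mean = points[in_window].mean(axis=0)   # Step 2: mean of W
        if np.allclose(mean, x):                # converged
            break
        x = mean                                # Step 3: translate window
    return x
```

For segmentation in (R, G, B, x, y), the same loop is run in the joint color-position feature space.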
|
45
|
- Approach
- Initialize a window around each point
- See where it shifts—this determines which segment it’s in
- Multiple points will shift to the same segment
|
|
46
|
- Useful to take into account spatial information
- instead of (R, G, B), run in (R, G, B, x, y) space
- D. Comaniciu and P. Meer, “Mean Shift Analysis and Applications,” Proc. 7th International Conference on Computer Vision (ICCV), Kerkyra, Greece, September 1999, pp. 1197-1203.
- http://www.caip.rutgers.edu/riul/research/papers/pdf/spatmsft.pdf
|
|
49
|
- Mortensen and Barrett, “Intelligent Scissors for Image Composition,” Proc. SIGGRAPH 1995.
- Boykov and Jolly, “Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images,” Proc. ICCV 2001.
- Shi and Malik, “Normalized Cuts and Image Segmentation,” Proc. CVPR 1997.
- Comaniciu and Meer, “Mean Shift Analysis and Applications,” Proc. ICCV 1999.
|