1
- Project 2 due tomorrow night
- Project 3 out Wednesday
- Jiun-Hung will take photos beginning of class
20
- Today
- skin detection
- eigenfaces
- face detection with AdaBoost
21
- How to tell if a face is present?
22
- Skin pixels have a distinctive range of colors
- Corresponds to region(s) in RGB color space
- for visualization, only R and G components are shown above
23
- Learn the skin region from examples
- Manually label pixels in one or more “training images” as skin or not skin
- Plot the training data in RGB space
- skin pixels shown in orange, non-skin pixels shown in blue
- some skin pixels may be outside the region, non-skin pixels inside. Why?
25
- Basic probability
- X is a random variable
- P(X) is the probability that X achieves a certain value
- Conditional probability: P(X | Y)
- probability of X given that we already know Y
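Conditional probability can be illustrated with a small counting example (all counts below are invented for illustration):

```python
# Toy example: estimate P(X | Y) from counts of joint outcomes.
# Suppose we observe 100 pixels: 30 are skin, and of those, 24 are "reddish";
# of the 70 non-skin pixels, 14 are reddish.
n_total = 100
n_skin = 30
n_reddish_and_skin = 24
n_reddish = 24 + 14

p_skin = n_skin / n_total                           # P(skin) = 0.3
p_reddish_given_skin = n_reddish_and_skin / n_skin  # P(reddish | skin) = 0.8

# Conditional probability: restrict attention to the outcomes where Y holds.
p_skin_given_reddish = n_reddish_and_skin / n_reddish  # P(skin | reddish) ~ 0.63
```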
26
- Now we can model uncertainty
- Each pixel has a probability of being skin or not skin
27
- We can calculate P(R | skin) from a set of training images
- It is simply a histogram over the pixels in the training images
- each bin Ri contains the proportion of skin pixels with color Ri
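A sketch of building this histogram, assuming the red values of the skin-labeled training pixels have already been collected (the data here is synthetic):

```python
import numpy as np

# Estimate P(R | skin) as a normalized histogram over the red channel of
# pixels labeled "skin" in the training images. Synthetic stand-in data:
rng = np.random.default_rng(0)
skin_red = rng.integers(120, 256, size=5000)  # fake red values of skin pixels

n_bins = 32  # quantize red values 0..255 into 32 bins R_i
hist, _ = np.histogram(skin_red, bins=n_bins, range=(0, 256))
p_r_given_skin = hist / hist.sum()  # each bin: proportion of skin pixels there
```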
30
- Bayesian estimation
- Goal is to choose the label (skin or ~skin) that maximizes the posterior
- this is called Maximum A Posteriori (MAP) estimation
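A minimal sketch of MAP classification for a single pixel, with made-up likelihood tables and prior standing in for the histograms learned from training images:

```python
import numpy as np

# Choose the label maximizing P(label | r), which by Bayes' rule is
# proportional to P(r | label) * P(label). Tables below are made up.
p_r_given_skin = np.array([0.05, 0.15, 0.30, 0.50])     # 4 coarse color bins
p_r_given_notskin = np.array([0.40, 0.30, 0.20, 0.10])
p_skin = 0.2                                            # prior P(skin)

def map_label(bin_index):
    # Unnormalized posteriors; the shared denominator P(r) cancels out.
    post_skin = p_r_given_skin[bin_index] * p_skin
    post_not = p_r_given_notskin[bin_index] * (1 - p_skin)
    return "skin" if post_skin > post_not else "not skin"
```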
32
- This same procedure applies in more general circumstances
- More than two classes
- More than one dimension
33
- Classification can be expensive
- Must either search (e.g., nearest neighbors) or store large PDFs
36
- Suppose each data point is N-dimensional
- Same procedure applies:
- The eigenvectors of A define a new coordinate system
- eigenvector with largest eigenvalue captures the most variation among training vectors x
- eigenvector with smallest eigenvalue has least variation
- We can compress the data by only using the top few eigenvectors
- corresponds to choosing a “linear subspace”
- represent points on a line, plane, or “hyper-plane”
- these eigenvectors are known as the principal components
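A minimal PCA sketch along these lines, using NumPy and random data in place of real training vectors:

```python
import numpy as np

# PCA sketch: eigenvectors of the data covariance matrix A define a new
# coordinate system; keeping the top-k eigenvectors projects onto a
# k-dimensional linear subspace.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 5))   # 200 training vectors, N = 5
x = x - x.mean(axis=0)          # center the data
A = x.T @ x / len(x)            # covariance matrix (N x N)

eigvals, eigvecs = np.linalg.eigh(A)  # eigh: A is symmetric
order = np.argsort(eigvals)[::-1]     # sort by decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2
components = eigvecs[:, :k]     # top-k principal components
coeffs = x @ components         # compressed representation (200 x 2)
x_approx = coeffs @ components.T  # reconstruction within the subspace
```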
37
- An image is a point in a high dimensional space
- An N x M image is a point in R^(NM)
- We can define vectors in this space as we did in the 2D case
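For example, flattening a toy image into a vector (the sizes are arbitrary):

```python
import numpy as np

# A 4 x 3 image becomes a point in R^12 by flattening; vector operations
# (differences, dot products, norms) then apply exactly as in the 2D case.
img_a = np.zeros((4, 3))
img_b = np.ones((4, 3))

vec_a = img_a.ravel()                 # point in R^12
vec_b = img_b.ravel()
dist = np.linalg.norm(vec_a - vec_b)  # Euclidean distance between the images
```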
38
- The set of faces is a “subspace” of the set of images
- Suppose it is K dimensional
- We can find the best subspace using PCA
- This is like fitting a “hyper-plane” to the set of faces
- spanned by vectors v1, v2, ..., vK
- any face can be approximated as a linear combination of these vectors (plus the mean face)
39
- PCA extracts the eigenvectors of A
- Gives a set of vectors v1, v2, v3, ...
- Each one of these vectors is a direction in face space
40
- The eigenfaces v1, ..., vK span the space of faces
- A face x is converted to eigenface coordinates by projecting onto the eigenfaces: ai = vi · (x − x̄), where x̄ is the mean face
41
- Algorithm
- Process the image database (set of images with labels)
- Run PCA—compute eigenfaces
- Calculate the K coefficients for each image
- Given a new image (to be recognized) x, calculate K coefficients
- Detect if x is a face
- If it is a face, who is it?
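The algorithm above can be sketched on synthetic data; the face/not-face threshold, dimensions, and data are arbitrary stand-ins for a real labeled image database:

```python
import numpy as np

# Eigenface pipeline sketch on synthetic "images" (flattened to vectors).
rng = np.random.default_rng(2)
faces = rng.normal(size=(50, 64))  # 50 training images, 64 pixels each

mean_face = faces.mean(axis=0)
centered = faces - mean_face
# PCA via SVD: rows of vt are the eigenfaces (principal components).
_, _, vt = np.linalg.svd(centered, full_matrices=False)
K = 10
eigenfaces = vt[:K]                     # top-K eigenfaces

train_coeffs = centered @ eigenfaces.T  # K coefficients per training image

def project(x):
    """Compute the K eigenface coefficients of image x."""
    return (x - mean_face) @ eigenfaces.T

def recognize(x, face_threshold=100.0):
    """Return the index of the matching training image, or None if not a face."""
    a = project(x)
    # Distance from the face subspace answers "is x a face?"
    recon = mean_face + a @ eigenfaces
    if np.linalg.norm(x - recon) > face_threshold:
        return None
    # Nearest neighbor in coefficient space answers "who is it?"
    return int(np.argmin(np.linalg.norm(train_coeffs - a, axis=1)))
```

A training image matches itself exactly in coefficient space, so `recognize(faces[7])` returns `7` here.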
42
- How many eigenfaces to use?
- Look at the decay of the eigenvalues
- the eigenvalue tells you the amount of variance “in the direction” of that eigenface
- ignore eigenfaces with low variance
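One common heuristic, sketched with made-up eigenvalues: keep just enough eigenfaces to explain some fixed fraction of the variance (the 95% cutoff is an arbitrary choice for illustration):

```python
import numpy as np

# Made-up eigenvalues, sorted in decreasing order as PCA produces them.
eigvals = np.array([9.0, 4.0, 2.0, 0.5, 0.3, 0.15, 0.05])

explained = np.cumsum(eigvals) / eigvals.sum()  # cumulative variance fraction
K = int(np.searchsorted(explained, 0.95)) + 1   # smallest K reaching 95%
```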
43
- What’s the best way to compare images?
- need to define appropriate features
- depends on goal of recognition task
44
- Lots more feature types that we haven’t mentioned
- moments, statistics
- metrics: Earth mover’s distance, ...
- edges, curves
- metrics: Hausdorff, shape context, ...
- 3D: surfaces, spin images
- ...
46
- Generative methods
- model the “shape” of each class
- histograms, PCA, mixtures of Gaussians
- graphical models (HMMs, belief networks, etc.)
- ...
- Discriminative methods
- model boundaries between classes
- perceptrons, neural networks
- support vector machines (SVMs)
48
- What if your space isn’t flat?
49
- Case study: Viola-Jones face detector
- Next few slides adapted from Grauman & Leibe’s tutorial
- http://www.vision.ee.ethz.ch/~bleibe/teaching/tutorial-aaai08/
- Also see Paul Viola’s talk (video)
- http://www.cs.washington.edu/education/courses/577/04sp/contents.html#DM
52
- Want to select the single rectangle feature and threshold that best separates positive (faces) and negative (non-faces) training examples, in terms of weighted error.
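A sketch of this selection for a single feature, written as a decision stump over ±1 labels (the real Viola-Jones round scans all rectangle features; the feature values and weights here are synthetic):

```python
import numpy as np

def best_stump(values, labels, weights):
    """Find the (threshold, polarity) minimizing weighted error for the rule:
    predict +1 (face) if polarity * value < polarity * threshold."""
    best = (None, 1, np.inf)
    for t in np.unique(values):
        for polarity in (1, -1):
            pred = np.where(polarity * values < polarity * t, 1, -1)
            err = weights[pred != labels].sum()  # weighted misclassification
            if err < best[2]:
                best = (t, polarity, err)
    return best

# Synthetic responses of one rectangle feature on 6 training examples.
values = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 0.9])
labels = np.array([1, 1, 1, -1, -1, -1])  # faces = +1, non-faces = -1
weights = np.full(6, 1 / 6)               # uniform weights in round one
t, pol, err = best_stump(values, labels, weights)
```

Here the data is perfectly separable, so the chosen threshold (0.6, polarity +1) achieves zero weighted error; in AdaBoost the weights are then increased on misclassified examples before the next round.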