1
|
- Readings
- C. Bishop, “Neural Networks for Pattern Recognition”, Oxford University Press, 1998, Chapter 1.
- Forsyth and Ponce, Chapter 22.3 (through 22.3.2, eigenfaces)
|
2
|
|
3
|
- What is it?
- Who is it?
- What are they doing?
- All of these are classification problems
- Choose one class from a list of possible candidates
|
4
|
- How to tell if a face is present?
|
5
|
- Skin pixels have a distinctive range of colors
- Corresponds to region(s) in RGB color space
- for visualization, only R and G components are shown above
|
6
|
- Learn the skin region from examples
- Manually label pixels in one or more “training images” as skin or not skin
- Plot the training data in RGB space
- skin pixels shown in orange, non-skin pixels shown in blue
- some skin pixels may be outside the region, non-skin pixels inside. Why?
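A minimal sketch of this labeling-and-plotting step (the toy data, function name, and use of matplotlib are assumptions for illustration, not from the slides):

```python
# Collect manually labeled training pixels and plot them in the R-G plane.
import numpy as np
import matplotlib.pyplot as plt

def collect_training_pixels(image, skin_mask):
    """image: H x W x 3 uint8 RGB array; skin_mask: H x W boolean array (True = skin).
    Returns (skin_pixels, nonskin_pixels), each an N x 3 array of RGB values."""
    pixels = image.reshape(-1, 3)
    labels = skin_mask.reshape(-1)
    return pixels[labels], pixels[~labels]

# Toy data standing in for a manually labeled training image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
skin_mask = rng.random((64, 64)) < 0.3     # pretend ~30% of pixels were labeled skin

skin, nonskin = collect_training_pixels(image, skin_mask)
plt.scatter(nonskin[:, 0], nonskin[:, 1], s=2, c="blue", label="non-skin")
plt.scatter(skin[:, 0], skin[:, 1], s=2, c="orange", label="skin")
plt.xlabel("R"); plt.ylabel("G"); plt.legend(); plt.show()
```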
|
7
|
|
8
|
- Basic probability
- X is a random variable
- P(X) is the probability that X achieves a certain value
- Conditional probability: P(X | Y)
- probability of X given that we already know Y
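For reference, the standard definition behind this (not spelled out in the text above):

```latex
P(X \mid Y) \;=\; \frac{P(X, Y)}{P(Y)}
```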
|
9
|
- Now we can model uncertainty
- Each pixel has a probability of being skin or not skin
|
10
|
- We can calculate P(R | skin) from a set of training images
- It is simply a histogram over the pixels in the training images
- each bin Ri contains the proportion of skin pixels with color Ri
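A minimal sketch of this histogram estimate (the bin count, names, and toy data are assumptions for illustration):

```python
# Estimate P(R | skin) as a normalized histogram over the red values of
# the pixels that were labeled "skin" in the training images.
import numpy as np

def color_likelihood(skin_pixels, bins=32):
    """skin_pixels: N x 3 array of RGB values labeled as skin.
    Returns (hist, edges), where hist[i] is the fraction of skin pixels whose
    R value falls in bin i, i.e. an estimate of P(R in bin i | skin)."""
    hist, edges = np.histogram(skin_pixels[:, 0], bins=bins, range=(0, 256))
    return hist / hist.sum(), edges

# Toy "skin" pixels standing in for real training data.
rng = np.random.default_rng(1)
skin = rng.integers(120, 256, size=(1000, 3))
p_r_given_skin, edges = color_likelihood(skin)
```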
|
11
|
|
12
|
|
13
|
- Bayesian estimation
- Goal is to choose the label (skin or ~skin) that maximizes the posterior
- this is called Maximum A Posteriori (MAP) estimation
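Written out with standard Bayes’ rule (the symbols below are shorthand, not taken from the slide), the posterior and the MAP decision for a pixel with red value R are:

```latex
P(\text{skin} \mid R) \;=\; \frac{P(R \mid \text{skin})\, P(\text{skin})}{P(R)},
\qquad
\hat{\ell}(R) \;=\; \arg\max_{\ell \,\in\, \{\text{skin},\ \sim\text{skin}\}} P(R \mid \ell)\, P(\ell)
```

The denominator P(R) is the same for both labels, so it drops out of the comparison.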
|
14
|
|
15
|
|
16
|
- This same procedure applies in more general circumstances
- More than two classes
- More than one dimension
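For example, with classes c1, ..., cn and a feature vector x of any dimension, the same MAP rule reads (notation assumed here):

```latex
\hat{c}(\mathbf{x}) \;=\; \arg\max_{c \,\in\, \{c_1, \dots, c_n\}} P(c \mid \mathbf{x})
\;=\; \arg\max_{c}\; P(\mathbf{x} \mid c)\, P(c)
```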
|
17
|
- Classification can be expensive
- Big search problem (e.g., nearest neighbors) or large PDFs to store
|
18
|
|
19
|
|
20
|
- Suppose each data point is N-dimensional
- Same procedure applies:
- The eigenvectors of the covariance matrix A define a new coordinate system
- eigenvector with largest eigenvalue captures the most variation among training vectors x
- eigenvector with smallest eigenvalue has least variation
- We can compress the data by only using the top few eigenvectors
- corresponds to choosing a “linear subspace”
- represent points on a line, plane, or “hyper-plane”
- these eigenvectors are known as the principal components
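A minimal sketch of this procedure (assuming A is the covariance matrix of the centered training vectors; the names are illustrative):

```python
# PCA by eigen-decomposition of the covariance matrix of the training vectors.
import numpy as np

def pca(X, K):
    """X: (num_samples, N) array of training vectors x.
    Returns (mean, components, eigvals): components is (K, N) with rows ordered
    by decreasing eigenvalue; eigvals are the K corresponding eigenvalues."""
    mean = X.mean(axis=0)
    Xc = X - mean                              # center the data
    A = Xc.T @ Xc / len(X)                     # N x N covariance matrix
    eigvals, eigvecs = np.linalg.eigh(A)       # eigh since A is symmetric
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    return mean, eigvecs[:, order[:K]].T, eigvals[order[:K]]
```

For images, N can be very large; a common trick is to obtain the same top eigenvectors from the much smaller (num_samples x num_samples) matrix Xc @ Xc.T, but the small-scale version above shows the idea.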
|
21
|
- An image is a point in a high dimensional space
- An N x M image is a point in R^NM
- We can define vectors in this space as we did in the 2D case
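A tiny illustration of this flattening (the sizes are chosen arbitrarily):

```python
import numpy as np

img = np.zeros((64, 48))   # an N x M grayscale image, N = 64, M = 48
x = img.reshape(-1)        # the same image as a point (vector) in R^NM
print(x.shape)             # (3072,)
```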
|
22
|
- The set of faces is a “subspace” of the set of images
- Suppose it is K dimensional
- We can find the best subspace using PCA
- This is like fitting a “hyper-plane” to the set of faces
- spanned by vectors v1, v2, ..., vK
- any face in this subspace can be written approximately as a combination of these vectors (see the sketch below)
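In symbols (the coefficient names a1, ..., aK and the mean face, written x-bar, are assumptions of this sketch rather than text from the slide):

```latex
\mathbf{x} \;\approx\; \bar{\mathbf{x}} + a_1 \mathbf{v}_1 + a_2 \mathbf{v}_2 + \cdots + a_K \mathbf{v}_K
```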
|
23
|
- PCA extracts the eigenvectors of A
- Gives a set of vectors v1, v2, v3, ...
- Each one of these vectors is a direction in face space
|
24
|
- The eigenfaces v1, ..., vK span the space of faces
- A face is converted to eigenface coordinates by projecting it onto each eigenface (see the sketch below)
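A minimal sketch of the conversion and the corresponding reconstruction (mean subtraction and unit-norm eigenfaces stored as rows are assumptions here):

```python
import numpy as np

def to_eigenface_coords(x, mean_face, eigenfaces):
    """x: flattened face vector (length NM); eigenfaces: (K, NM) array of
    unit-norm eigenfaces. Returns the K coefficients a_i = v_i . (x - mean_face)."""
    return eigenfaces @ (x - mean_face)

def from_eigenface_coords(a, mean_face, eigenfaces):
    """Approximate reconstruction: mean_face + sum_i a_i * v_i."""
    return mean_face + eigenfaces.T @ a
```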
|
25
|
- Algorithm
- Process the image database (set of images with labels)
- Run PCA—compute eigenfaces
- Calculate the K coefficients for each image
- Given a new image (to be recognized) x, calculate K coefficients
- Detect if x is a face
- If it is a face, who is it?
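A minimal sketch of the last two steps, using reconstruction error (“distance from face space”) for detection and nearest-neighbor distance in coefficient space for identification; the thresholds and database format are illustrative assumptions:

```python
import numpy as np

def recognize(x, mean_face, eigenfaces, db_coeffs, db_labels,
              face_thresh, id_thresh):
    a = eigenfaces @ (x - mean_face)               # K coefficients of the new image
    x_hat = mean_face + eigenfaces.T @ a           # project back into face space
    if np.linalg.norm(x - x_hat) > face_thresh:    # far from face space: not a face
        return None
    dists = np.linalg.norm(db_coeffs - a, axis=1)  # compare to every stored face
    i = int(np.argmin(dists))
    return db_labels[i] if dists[i] < id_thresh else "unknown face"
```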
|
26
|
- How many eigenfaces to use?
- Look at the decay of the eigenvalues
- the eigenvalue tells you the amount of variance “in the direction” of that eigenface
- ignore eigenfaces with low variance
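One common way to make this concrete (a sketch; the 95% variance target is an arbitrary assumption):

```python
import numpy as np

def choose_k(eigvals, target=0.95):
    """eigvals: all eigenvalues, sorted in decreasing order. Returns the smallest K
    whose eigenfaces cover at least `target` of the total variance."""
    frac = np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.searchsorted(frac, target)) + 1
```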
|
27
|
- This is just the tip of the iceberg
- We’ve talked about using pixel color as a feature
- Many other features can be used:
- edges
- motion
- object size
- SIFT
- ...
- Classical object recognition techniques recover 3D information as well
- given an image and a database of 3D models, determine which model(s) appears in that image
- often recover 3D pose of the object as well
- Recognition is a very active research area right now
|