1
|
- Project 1 artifact winners
- not enough votes—please vote today!
|
2
|
- Readings
- C. Bishop, “Neural Networks for Pattern Recognition”, Oxford University Press, 1998, Chapter 1.
- Forsyth and Ponce, 22.3 (eigenfaces)
|
3
|
- Readings
- C. Bishop, “Neural Networks for Pattern Recognition”, Oxford University Press, 1998, Chapter 1.
- Forsyth and Ponce, 22.3 (eigenfaces)
|
4
|
- What is it?
- Who is it?
- What are they doing?
- All of these are classification problems
- Choose one class from a list of possible candidates
|
5
|
- How to tell if a face is present?
|
6
|
- Skin pixels have a distinctive range of colors
- Corresponds to region(s) in RGB color space
- for visualization, only R and G components are shown above
|
7
|
- Learn the skin region from examples
- Manually label pixels in one or more “training images” as skin or not skin
- Plot the training data in RGB space
- skin pixels shown in orange, non-skin pixels shown in blue
- some skin pixels may be outside the region, non-skin pixels inside. Why?
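A minimal sketch of this labeling-and-plotting step, assuming the hand-labeled training data is available as numpy RGB images plus boolean skin masks (the names train_images and train_masks are hypothetical, not from the slides):

    import numpy as np
    import matplotlib.pyplot as plt

    def collect_labeled_pixels(images, skin_masks):
        """Gather (R, G, B) values of skin and non-skin pixels from labeled images."""
        skin, not_skin = [], []
        for img, mask in zip(images, skin_masks):
            pixels = img.reshape(-1, 3).astype(float)
            labels = mask.reshape(-1).astype(bool)
            skin.append(pixels[labels])
            not_skin.append(pixels[~labels])
        return np.vstack(skin), np.vstack(not_skin)

    # train_images / train_masks: hand-labeled training data (assumed to exist).
    skin_px, not_skin_px = collect_labeled_pixels(train_images, train_masks)

    # Plot the training data (R and G components only, as on the slide).
    plt.scatter(not_skin_px[:, 0], not_skin_px[:, 1], s=1, c='blue', label='not skin')
    plt.scatter(skin_px[:, 0], skin_px[:, 1], s=1, c='orange', label='skin')
    plt.xlabel('R'); plt.ylabel('G'); plt.legend(); plt.show()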
|
8
|
|
9
|
- Basic probability
- X is a random variable
- P(X) is the probability that X achieves a certain value
- Conditional probability: P(X | Y)
- probability of X given that we already know Y
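For reference, the standard definitions behind these bullets (these formulas are not on the slide, but they are what the skin-detection slides that follow rely on):

    P(X \mid Y) = \frac{P(X,\,Y)}{P(Y)}                 % definition of conditional probability
    P(X \mid Y) = \frac{P(Y \mid X)\,P(X)}{P(Y)}        % Bayes' rule, used for the MAP step later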
|
10
|
- Now we can model uncertainty
- Each pixel has a probability of being skin or not skin
|
11
|
- We can calculate P(R | skin) from a set of training images
- It is simply a histogram over the pixels in the training images
- each bin Ri contains the proportion of skin pixels with color Ri
|
12
|
- We can calculate P(R | skin) from a set of training images
- It is simply a histogram over the pixels in the training images
- each bin Ri contains the proportion of skin pixels with color Ri
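One way this estimate might be computed, assuming skin_px is the array of labeled skin pixels from the earlier sketch and color values are 8-bit:

    import numpy as np

    # Histogram of the R channel over the labeled skin pixels.
    counts, _ = np.histogram(skin_px[:, 0], bins=256, range=(0, 256))

    # Normalize so each bin holds the proportion of skin pixels with that R value,
    # i.e. an estimate of the likelihood P(R | skin).
    p_r_given_skin = counts / counts.sum()

The same computation over the non-skin pixels gives P(R | ~skin).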
|
13
|
|
14
|
- Bayesian estimation
- Goal is to choose the label (skin or ~skin) that maximizes the posterior
- this is called Maximum A Posteriori (MAP) estimation
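A sketch of the resulting per-pixel decision rule, assuming the likelihood histograms from the previous sketch (p_r_given_skin and its non-skin counterpart) and a prior P(skin) chosen or estimated beforehand; the 0.3 prior below is purely illustrative:

    import numpy as np

    def classify_skin_map(image, p_r_given_skin, p_r_given_not_skin, prior_skin=0.3):
        """Label each pixel skin / ~skin by maximizing the (unnormalized) posterior."""
        # P(skin | R) is proportional to P(R | skin) * P(skin); the shared
        # denominator P(R) does not change which label wins, so it is dropped.
        r = image[..., 0].astype(int)                  # R value of every pixel
        post_skin = p_r_given_skin[r] * prior_skin
        post_not = p_r_given_not_skin[r] * (1.0 - prior_skin)
        return post_skin > post_not                    # boolean skin mask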
|
15
|
|
16
|
- This same procedure applies in more general circumstances
- More than two classes
- More than one dimension
|
17
|
- Classification is still expensive
- Must either search (e.g., nearest neighbors) or store large PDFs
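For a sense of the cost, a brute-force nearest-neighbor classifier over raw feature vectors looks like this; every query touches every training vector in every dimension, which is what motivates the dimensionality reduction that follows (names here are illustrative):

    import numpy as np

    def nearest_neighbor_label(x, train_X, train_y):
        """Return the label of the training vector closest to x (O(num_train * dim) per query)."""
        dists = np.linalg.norm(train_X - x, axis=1)    # distance to every training vector
        return train_y[np.argmin(dists)]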
|
18
|
|
19
|
|
20
|
- Suppose each data point is N-dimensional
- Same procedure applies:
- The eigenvectors of A define a new coordinate system
- eigenvector with largest eigenvalue captures the most variation among training vectors x
- eigenvector with smallest eigenvalue has least variation
- We can compress the data by only using the top few eigenvectors
- corresponds to choosing a “linear subspace”
- represent points on a line, plane, or “hyper-plane”
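A compact numpy sketch of this procedure, taking A to be the covariance matrix of the centered training vectors (consistent with how A is used on the later slides); note that this direct N x N eigen-decomposition is only practical when N is modest:

    import numpy as np

    def pca(X, k):
        """X: (num_samples, N) data matrix. Returns the mean, top-k eigenvectors, eigenvalues."""
        mean = X.mean(axis=0)
        Xc = X - mean                             # center the training vectors
        A = Xc.T @ Xc / len(X)                    # N x N covariance matrix
        eigvals, eigvecs = np.linalg.eigh(A)      # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1][:k]     # keep the k largest-variance directions
        return mean, eigvecs[:, order], eigvals[order]

    # Compression: represent each point by its k coordinates in the eigenvector basis.
    # mean, V, _ = pca(X, k);  coords = (X - mean) @ V    # shape (num_samples, k)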
|
21
|
- An image is a point in a high dimensional space
- An N x M image is a point in R^NM
- We can define vectors in this space as we did in the 2D case
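Concretely, treating an image as a point just means flattening it into one long vector, e.g.:

    import numpy as np

    img = np.zeros((100, 80))      # a 100 x 80 grayscale image (sizes chosen arbitrarily)
    x = img.reshape(-1)            # the same image as a single point in R^8000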
|
22
|
- The set of faces is a “subspace” of the set of images
- Suppose it is K dimensional
- We can find the best subspace using PCA
- This is like fitting a “hyper-plane” to the set of faces
- spanned by vectors v1, v2, ..., vK
- any face can then be approximated as a combination of these vectors (see the equation sketch below)
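Written out, under the usual eigenface convention of measuring faces relative to the mean face (the mean term and the coefficient names a_1, ..., a_K are assumptions, since the slide's equation image is not included here):

    x \approx \bar{x} + a_1 v_1 + a_2 v_2 + \cdots + a_K v_K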
|
23
|
- PCA extracts the eigenvectors of A
- Gives a set of vectors v1, v2, v3, ...
- Each one of these vectors is a direction in face space
|
24
|
- The eigenfaces v1, ..., vK span the space of faces
- A face is converted to eigenface coordinates by projecting it onto each eigenface (sketched below)
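In the standard eigenface formulation (assumed here, since the slide's formula image is not included), each coordinate is a dot product with an eigenface after subtracting the mean face:

    a_i = v_i^T (x - \bar{x}), \qquad i = 1, \dots, K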
|
25
|
- Algorithm
- Process the image database (set of images with labels)
- Run PCA—compute eigenfaces
- Calculate the K coefficients for each image
- Given a new image (to be recognized) x, calculate K coefficients
- Detect if x is a face
- If it is a face, who is it?
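A sketch of the whole pipeline under common eigenface assumptions: detection by reconstruction error against the face subspace, identification by nearest neighbor in coefficient space. The Gram-matrix trick, the threshold value, and the helper names are not from the slides:

    import numpy as np

    def train_eigenfaces(face_images, k):
        """face_images: (num_faces, N*M) matrix of flattened, labeled training faces."""
        mean = face_images.mean(axis=0)
        Xc = face_images - mean
        # Gram-matrix trick: eigenvectors of the small (num_faces x num_faces) matrix
        # yield the eigenfaces without forming the huge (N*M x N*M) covariance matrix.
        eigvals, U = np.linalg.eigh(Xc @ Xc.T)
        order = np.argsort(eigvals)[::-1][:k]
        V = Xc.T @ U[:, order]                     # (N*M, k) eigenfaces
        V /= np.linalg.norm(V, axis=0)             # normalize each eigenface
        coeffs = Xc @ V                            # the K coefficients of every training image
        return mean, V, coeffs

    def recognize(x, mean, V, coeffs, labels, face_threshold=1e4):
        """Return the label of the closest training face, or None if x is not a face."""
        a = (x - mean) @ V                         # K coefficients of the new image
        recon = mean + V @ a                       # project back into image space
        if np.linalg.norm(x - recon) > face_threshold:
            return None                            # far from the face subspace: not a face
        return labels[np.argmin(np.linalg.norm(coeffs - a, axis=1))]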
|
26
|
- PCA attempts to fit a hyperplane to the data
- can be interpreted as fitting a Gaussian, where A is the covariance matrix
- this is not a good model for some data
- If you know the model in advance, don’t use PCA
- regression techniques to fit parameters of a model
- Several alternatives/improvements to PCA have been developed
- LLE: http://www.cs.toronto.edu/~roweis/lle/
- isomap: http://isomap.stanford.edu/
- kernel PCA: http://www.cs.ucsd.edu/classes/fa01/cse291/kernelPCA_article.pdf
- For a survey of such methods applied to object recognition, see:
- Moghaddam, B., “Principal Manifolds and Probabilistic Subspaces for Visual Recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 24, no. 6, pp. 780-788, June 2002. http://www.merl.com/papers/TR2002-13/
|
27
|
- This is just the tip of the iceberg
- We’ve talked about using pixel color as a feature
- Many other features can be used:
- edges
- motion (e.g., optical flow)
- object size
- ...
- Classical object recognition techniques recover 3D information as well
- given an image and a database of 3D models, determine which model(s) appears in that image
- often recover the 3D pose of the object as well
- new work (e.g., Linda Shapiro’s group at UW) seeks to recognize 3D objects (meshes) by training on 3D scan data
|