Announcements
    • Project 1 artifact winners
      • not enough votes—please vote today!
Recognition
  • Readings
    • C. Bishop, “Neural Networks for Pattern Recognition”, Oxford University Press, 1998, Chapter 1.
    • Forsyth and Ponce, 22.3 (eigenfaces)
Recognition problems
  • What is it?
    • Object detection

  • Who is it?
    • Recognizing identity

  • What are they doing?
    • Activities


  • All of these are classification problems
    • Choose one class from a list of possible candidates
Face detection
  • How to tell if a face is present?
One simple method:  skin detection
  • Skin pixels have a distinctive range of colors
    • Corresponds to region(s) in RGB color space
      • for visualization, only R and G components are shown above
Skin detection
  • Learn the skin region from examples
    • Manually label pixels in one or more “training images” as skin or not skin
    • Plot the training data in RGB space
      • skin pixels shown in orange, non-skin pixels shown in blue
      • some skin pixels may be outside the region, non-skin pixels inside.  Why?
Skin classification techniques
Probability
  • Basic probability
    • X is a random variable
    • P(X) is the probability that X achieves a certain value









    •                                     or



    • Conditional probability:   P(X | Y)
      • probability of X given that we already know Y
Probabilistic skin classification
  • Now we can model uncertainty
    • Each pixel has a probability of being skin or not skin
Learning conditional PDFs
  • We can calculate P(R | skin) from a set of training images
    • It is simply a histogram over the pixels in the training images
      • each bin Ri contains the proportion of skin pixels with color Ri
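The histogram estimate of P(R | skin) can be sketched as follows. A minimal sketch in NumPy, assuming hand-labeled skin pixels are available as an array of R components; the bin count and sample values are illustrative:

```python
import numpy as np

def learn_likelihood(values, n_bins=8):
    """Estimate P(R | class) as a normalized histogram over pixel values.

    values: 1-D array of R components (0..255) for pixels labeled as one class.
    Returns the bin edges and the proportion of the class's pixels per bin,
    so the returned proportions form a valid discrete PDF (they sum to 1).
    """
    counts, edges = np.histogram(values, bins=n_bins, range=(0, 256))
    return edges, counts / counts.sum()

# Hypothetical training data: R components of eight hand-labeled skin pixels.
skin_r = np.array([180, 190, 175, 200, 185, 170, 195, 188])
edges, p_r_given_skin = learn_likelihood(skin_r)
# Bin 5 covers R in [160, 192): six of the eight samples land there.
```

With more bins the histogram is finer but needs more training pixels per bin to be a reliable estimate.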
Bayes rule
  • In terms of our problem:
    • P(skin | R) = P(R | skin) P(skin) / P(R)
      • P(skin | R) is the posterior (what we want)
      • P(R | skin) is the likelihood (measured from the training data)
      • P(skin) is the prior (how common skin pixels are overall)
      • P(R) is the evidence (the same for both classes, so it cancels when comparing them)
Bayesian estimation
  • Bayesian estimation
    • Goal is to choose the label (skin or ~skin) that maximizes the posterior
      • this is called Maximum A Posteriori (MAP) estimation
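MAP estimation for the two-class skin problem reduces to comparing P(R | skin) P(skin) against P(R | ~skin) P(~skin), since the evidence P(R) is the same on both sides. A minimal sketch, assuming the likelihoods are 8-bin histograms over R (the tables, prior, and bin width are illustrative):

```python
import numpy as np

def map_classify(r, p_r_given_skin, p_r_given_notskin, prior_skin=0.5):
    """Label a pixel by Maximum A Posteriori (MAP) estimation.

    Compares P(R|skin)P(skin) to P(R|~skin)P(~skin); the shared evidence
    term P(R) cancels, so it never needs to be computed.
    Likelihood tables are assumed to be histograms with 8 bins over 0..255.
    """
    b = min(int(r) // 32, 7)                       # histogram bin for this R value
    post_skin = p_r_given_skin[b] * prior_skin
    post_not = p_r_given_notskin[b] * (1.0 - prior_skin)
    return "skin" if post_skin > post_not else "not skin"

# Hypothetical likelihood histograms (each sums to 1):
p_skin = np.array([0.0, 0.0, 0.0, 0.05, 0.1, 0.35, 0.4, 0.1])
p_not = np.array([0.3, 0.25, 0.2, 0.1, 0.08, 0.04, 0.02, 0.01])
print(map_classify(185, p_skin, p_not))   # bin 5: 0.35 vs 0.04 -> skin
```

Changing the prior shifts the decision boundary: a smaller P(skin) makes the classifier more conservative about declaring skin.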
Skin detection results
General classification
  • This same procedure applies in more general circumstances
    • More than two classes
    • More than one dimension

Linear subspaces
  • Classification is still expensive
    • Must either search (e.g., nearest neighbors) or store large PDFs
Dimensionality reduction
Linear subspaces
Principal component analysis
  • Suppose each data point is N-dimensional
    • Same procedure applies:
      • subtract the mean x̄, then form the covariance matrix A = Σ (x − x̄)(x − x̄)^T over the training vectors x
    • The eigenvectors of A define a new coordinate system
      • eigenvector with largest eigenvalue captures the most variation among training vectors x
      • eigenvector with smallest eigenvalue has least variation
    • We can compress the data by only using the top few eigenvectors
      • corresponds to choosing a “linear subspace”
        • represent points on a line, plane, or “hyper-plane”
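The procedure above can be sketched in a few lines of NumPy. This is a sketch under illustrative assumptions (synthetic data that is nearly rank-2, keeping the top two eigenvectors); `np.linalg.eigh` is used because the covariance matrix A is symmetric:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical training vectors: 100 points in N=5 dimensions whose
# variation lies almost entirely along two directions (plus tiny noise).
X = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 5)) \
    + rng.normal(scale=0.01, size=(100, 5))

mean = X.mean(axis=0)
A = np.cov(X - mean, rowvar=False)     # covariance matrix of the data
evals, evecs = np.linalg.eigh(A)       # eigh: A is symmetric
order = np.argsort(evals)[::-1]        # sort by decreasing eigenvalue
top2 = evecs[:, order[:2]]             # keep top-2 eigenvectors: the "linear subspace"

coeffs = (X - mean) @ top2             # compress: 5 numbers -> 2 per point
X_hat = mean + coeffs @ top2.T         # reconstruct from the subspace
err = np.abs(X - X_hat).max()          # small, since the data is nearly planar
```

The reconstruction error is governed by the eigenvalues that were discarded, which is why keeping the largest-eigenvalue directions loses the least variation.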
The space of faces
  • An image is a point in a high dimensional space
    • An N x M image is a point in R^(NM)
    • We can define vectors in this space as we did in the 2D case
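The image-as-vector view amounts to flattening the pixel grid, which can be sketched as (the array sizes are illustrative):

```python
import numpy as np

# A tiny 3 x 4 "image" viewed as a single point in R^(3*4) = R^12.
img = np.arange(12, dtype=float).reshape(3, 4)
x = img.ravel()               # row-major flattening: one long vector
img_back = x.reshape(3, 4)    # the mapping is invertible, so no information is lost
```

Sums, differences, and dot products of these vectors then behave exactly as in the 2D case, just in a much higher dimension.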
Dimensionality reduction
  • The set of faces is a “subspace” of the set of images
    • Suppose it is K dimensional
    • We can find the best subspace using PCA
    • This is like fitting a “hyper-plane” to the set of faces
      • spanned by vectors v1, v2, ..., vK
      • any face x can then be approximated as x ≈ x̄ + a1v1 + a2v2 + ... + aKvK
Eigenfaces
  • PCA extracts the eigenvectors of A
    • Gives a set of vectors v1, v2, v3, ...
    • Each one of these vectors is a direction in face space
      • what do these look like?
Projecting onto the eigenfaces
  • The eigenfaces v1, ..., vK span the space of faces
    • A face x is converted to eigenface coordinates by
      • ai = vi · (x − x̄)   for i = 1, ..., K
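The conversion to eigenface coordinates, and the reconstruction back from them, can be sketched as a pair of dot products. A minimal sketch, assuming the eigenfaces are stored as rows of an orthonormal matrix (the 4-pixel "faces" are illustrative):

```python
import numpy as np

def to_eigenface_coords(x, mean_face, eigenfaces):
    """a_i = v_i . (x - mean): dot the centered face with each eigenface."""
    return eigenfaces @ (x - mean_face)

def from_eigenface_coords(a, mean_face, eigenfaces):
    """x_hat = mean + sum_i a_i v_i: rebuild the face from K coefficients."""
    return mean_face + a @ eigenfaces

# Toy 4-pixel "faces" with K=2 orthonormal eigenfaces (illustrative values).
eigenfaces = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0, 0.0]])
mean_face = np.zeros(4)
x = np.array([3.0, -1.0, 0.0, 0.0])
a = to_eigenface_coords(x, mean_face, eigenfaces)        # two coefficients
x_hat = from_eigenface_coords(a, mean_face, eigenfaces)  # exact here: x lies in the subspace
```

For real faces x_hat is only an approximation; the residual x − x_hat is what the discarded eigenvectors would have captured.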
Recognition with eigenfaces
  • Algorithm
    • Process the image database (set of images with labels)
      • Run PCA to compute the eigenfaces
      • Calculate the K coefficients for each image
    • Given a new image (to be recognized) x, calculate its K coefficients
    • Detect if x is a face
      • project x onto the eigenface subspace and check that the reconstruction error ||x − x̂|| is below a threshold
    • If it is a face, who is it?
      • find the closest labeled face in the database (nearest neighbor in the K-dimensional coefficient space)
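The whole recognition step can be sketched in one function. This is a sketch, not a full implementation: the threshold `tau`, the toy eigenfaces, and the database values are all illustrative, and the eigenfaces are assumed to be orthonormal rows:

```python
import numpy as np

def recognize(x, mean_face, eigenfaces, db_coeffs, db_labels, tau=1.0):
    """Eigenface recognition sketch.

    1. Project x to K coefficients.
    2. Face test: if the reconstruction error ||x - x_hat|| exceeds tau,
       x is not well explained by the face subspace, so reject it.
    3. Otherwise return the label of the nearest database face in
       coefficient space (nearest neighbor).
    """
    a = eigenfaces @ (x - mean_face)        # K coefficients
    x_hat = mean_face + a @ eigenfaces      # projection back into the subspace
    if np.linalg.norm(x - x_hat) > tau:
        return None                         # not a face
    dists = np.linalg.norm(db_coeffs - a, axis=1)
    return db_labels[int(np.argmin(dists))] # nearest neighbor's identity

# Toy setup: 4-pixel images, K=2 orthonormal eigenfaces, two known faces.
eigenfaces = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0, 0.0]])
mean_face = np.zeros(4)
db_coeffs = np.array([[1.0, 2.0], [5.0, 5.0]])
db_labels = ["alice", "bob"]
```

Note that the nearest-neighbor search runs in the K-dimensional coefficient space rather than the full image space, which is the payoff of the dimensionality reduction.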
Limits of PCA
  • Attempts to fit a hyperplane to the data
    • can be interpreted as fitting a Gaussian, where A is the covariance matrix
    • this is not a good model for some data


  • If you know the model in advance, don’t use PCA
    • regression techniques to fit parameters of a model


  • Several alternatives/improvements to PCA have been developed
    • LLE:  http://www.cs.toronto.edu/~roweis/lle/
    • isomap:  http://isomap.stanford.edu/
    • kernel PCA:  http://www.cs.ucsd.edu/classes/fa01/cse291/kernelPCA_article.pdf
    • For a survey of such methods applied to object recognition
      • Moghaddam, B., "Principal Manifolds and Probabilistic Subspaces for Visual Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Vol. 24, No. 6, pp. 780–788, June 2002.
        http://www.merl.com/papers/TR2002-13/
Object recognition
  • This is just the tip of the iceberg
    • We’ve talked about using pixel color as a feature
    • Many other features can be used:
      • edges
      • motion (e.g., optical flow)
      • object size
      • ...
    • Classical object recognition techniques recover 3D information as well
      • given an image and a database of 3D models, determine which model(s) appears in that image
      • often recover 3D pose of the object as well
      • new work (e.g., from Linda Shapiro’s group at UW) seeks to recognize 3D objects (meshes) by training on 3D scan data