Announcements

Artifact due Thursday

everything must be in by Friday (regardless of late days)

Final exam: Tuesday, March 19, 2:30-4:20, MGH 228

comprehensive, but emphasis on material since midterm

closed notes

will review course topics on Friday

Recognition

Readings

C. Bishop, “Neural Networks for Pattern Recognition”, Oxford University Press, 1998, Chapter 1. (handout)

Forsyth and Ponce, pp. 723-729 (eigenfaces)

Recognition problems

What is it?

Object detection

Who is it?

Recognizing identity

What are they doing?

Activities

All of these are classification problems

Choose one class from a list of possible candidates

Face detection

How to tell if a face is present?

One simple method: skin detection

Skin pixels have a distinctive range of colors

Corresponds to region(s) in RGB color space

for visualization, only R and G components are shown above

Skin detection

Learn the skin region from examples

Manually label pixels in one or more “training images” as skin or not skin

Plot the training data in RGB space

skin pixels shown in orange, non-skin pixels shown in blue

some skin pixels may be outside the region, non-skin pixels inside. Why?

Skin classification techniques

Probability

Basic probability

X is a random variable

P(X) is the probability that X achieves a certain value

P(X) is normalized: $\int P(X)\,dX = 1$ (continuous X) or $\sum_X P(X) = 1$ (discrete X)

Conditional probability: P(X | Y)

probability of X given that we already know Y

Probabilistic skin classification

Now we can model uncertainty

Each pixel has a probability of being skin or not skin

Learning conditional PDFs

We can calculate P(R | skin) from a set of training images

It is simply a histogram over the pixels in the training images

each bin R_i contains the proportion of skin pixels with color R_i
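A minimal sketch of this histogram estimate, assuming 8-bit red values grouped into a small number of bins. The function name `learn_conditional_pdf`, the bin count, and the sample pixel values are hypothetical and chosen only for illustration.

```python
from collections import Counter

def learn_conditional_pdf(red_values, num_bins=8, max_value=256):
    """Estimate P(R | class) as a normalized histogram over training pixels."""
    bin_width = max_value // num_bins
    counts = Counter(r // bin_width for r in red_values)
    total = len(red_values)
    # Each bin holds the proportion of training pixels that fall into it.
    return [counts[i] / total for i in range(num_bins)]

# Hypothetical red-channel values of pixels hand-labeled as skin.
skin_reds = [200, 210, 190, 220, 205, 180, 230, 215]
p_r_given_skin = learn_conditional_pdf(skin_reds)
print(p_r_given_skin)
```

The same function applied to the pixels labeled not skin would give P(R | ~skin). Fewer bins give a smoother estimate; more bins need more training pixels.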

Bayes rule

In terms of our problem: $P(\text{skin} \mid R) = \dfrac{P(R \mid \text{skin})\,P(\text{skin})}{P(R)}$

Bayesian estimation

Goal is to choose the label (skin or ~skin) that maximizes the posterior

this is called Maximum A Posteriori (MAP) estimation
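MAP estimation can be sketched as follows, assuming the conditional PDFs are stored as per-bin histograms and the priors are known. The likelihood tables, the priors, and the names `map_label` and `posterior` are made-up illustrations, not values from the lecture.

```python
def map_label(r_bin, likelihoods, priors):
    """Return the label with the largest posterior P(label | R)."""
    # P(label | R) is proportional to P(R | label) * P(label). The evidence
    # P(R) is the same for every label, so it does not change the argmax.
    scores = {label: likelihoods[label][r_bin] * priors[label] for label in priors}
    return max(scores, key=scores.get)

def posterior(r_bin, likelihoods, priors):
    """Return the full posterior over labels using Bayes rule."""
    evidence = sum(likelihoods[label][r_bin] * priors[label] for label in priors)
    return {label: likelihoods[label][r_bin] * priors[label] / evidence
            for label in priors}

# Hypothetical 4-bin histograms P(R | label) and priors P(label).
likelihoods = {
    "skin":     [0.05, 0.15, 0.30, 0.50],
    "not_skin": [0.40, 0.30, 0.20, 0.10],
}
priors = {"skin": 0.2, "not_skin": 0.8}

bright_red = map_label(3, likelihoods, priors)   # strongly red pixel
medium_red = map_label(2, likelihoods, priors)   # moderately red pixel
print(bright_red, medium_red, posterior(3, likelihoods, priors))
```

In bin 2 the skin likelihood is the larger one, yet the label comes out as not skin because the prior favors non-skin pixels.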

Skin detection results

General classification

This same procedure applies in more general circumstances

More than two classes

More than one dimension

Linear subspaces

Classification is still expensive

Must either search (e.g., nearest neighbors) or store large PDFs
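To make the search cost concrete, here is a minimal nearest-neighbor classifier with more than two classes and more than one dimension. The RGB training points and the class names are hypothetical.

```python
import math

def nearest_neighbor_label(query, training_set):
    """Return the label of the training point closest to `query`."""
    # Linear scan: every query touches every training point, and each
    # distance costs time proportional to the number of dimensions.
    _, label = min(training_set, key=lambda item: math.dist(query, item[0]))
    return label

# Hypothetical labeled RGB pixels, one per class for brevity.
training_set = [
    ((200, 120, 100), "skin"),
    ((30, 40, 200), "sky"),
    ((40, 160, 50), "grass"),
]

first = nearest_neighbor_label((190, 130, 110), training_set)
second = nearest_neighbor_label((50, 150, 60), training_set)
print(first, second)
```

For whole images the vectors have one dimension per pixel, which is what motivates the dimensionality reduction that follows.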

Dimensionality reduction

Linear subspaces

Principal component analysis

Suppose each data point is N-dimensional

Same procedure applies: with $\bar{x}$ the mean of the training vectors, form $A = \sum_x (x - \bar{x})(x - \bar{x})^T$

The eigenvectors of A define a new coordinate system

eigenvector with largest eigenvalue captures the most variation among training vectors x

eigenvector with smallest eigenvalue has least variation

We can compress the data by only using the top few eigenvectors

corresponds to choosing a “linear subspace”

represent points on a line, plane, or “hyper-plane”
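A small sketch of the procedure, assuming A is the scatter matrix of the mean-centered training vectors and using power iteration to find the eigenvector with the largest eigenvalue. Power iteration is a stand-in chosen to keep the example dependency-free; the function names and the sample points are hypothetical.

```python
import math

def mean_vector(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def scatter_matrix(points):
    """A = sum over training points of (x - mean)(x - mean)^T."""
    mu = mean_vector(points)
    dim = len(mu)
    A = [[0.0] * dim for _ in range(dim)]
    for p in points:
        d = [p[i] - mu[i] for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                A[i][j] += d[i] * d[j]
    return A

def top_eigenvector(A, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue.

    Assumes the starting vector is not orthogonal to that eigenvector
    and that A is not the zero matrix.
    """
    dim = len(A)
    v = [1.0 / (i + 1) for i in range(dim)]
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(A[i][j] * v[j] for j in range(dim))
                     for i in range(dim))
    return v, eigenvalue

# Hypothetical 2D points lying close to the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
direction, variation = top_eigenvector(scatter_matrix(points))
print(direction, variation)
```

Keeping only the coordinate along `direction` stores one number per point instead of two, which is the compression described above.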

The space of faces

An image is a point in a high dimensional space

An N x M image is a point in R^{NM}

We can define vectors in this space as we did in the 2D case

Dimensionality reduction

The set of faces is a “subspace” of the set of images

Suppose it is K dimensional

We can find the best subspace using PCA

This is like fitting a “hyper-plane” to the set of faces

spanned by vectors v₁, v₂, ..., v_K

any face x ≈ a₁v₁ + a₂v₂ + ... + a_Kv_K

Eigenfaces

PCA extracts the eigenvectors of A

Gives a set of vectors v₁, v₂, v₃, ...

Each one of these vectors is a direction in face space

what do these look like?

Projecting onto the eigenfaces

The eigenfaces v₁, ..., v_K span the space of faces

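A face x is converted to eigenface coordinates by projecting it onto each eigenface: $a_i = v_i \cdot x$ for $i = 1, \dots, K$ (with x measured relative to the mean face)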
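The projection can be sketched as below, assuming the eigenfaces are orthonormal and the mean face is subtracted before projecting. The 4-pixel "images", the two basis vectors, and the function names are toy values made up for illustration.

```python
def to_coefficients(x, mean, eigenfaces):
    """Project a mean-centered image onto each eigenface: a_i = v_i . (x - mean)."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(vi * ci for vi, ci in zip(v, centered)) for v in eigenfaces]

def reconstruct(coeffs, mean, eigenfaces):
    """Rebuild an image from its coefficients: mean + a_1 v_1 + ... + a_K v_K."""
    image = list(mean)
    for a, v in zip(coeffs, eigenfaces):
        for i, vi in enumerate(v):
            image[i] += a * vi
    return image

# Hypothetical orthonormal eigenfaces for 4-pixel images (K = 2).
eigenfaces = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
mean_face = [10.0, 10.0, 10.0, 10.0]

face = [14.0, 10.0, 14.0, 10.0]
coeffs = to_coefficients(face, mean_face, eigenfaces)
rebuilt = reconstruct(coeffs, mean_face, eigenfaces)
print(coeffs, rebuilt)
```

This toy face lies exactly in the span of the two eigenfaces, so it is rebuilt without error. A real face is only approximated, and the error shrinks as K grows.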

Recognition with eigenfaces

Algorithm

Process the image database (set of images with labels)

Run PCA—compute eigenfaces

Calculate the K coefficients for each image

Given a new image (to be recognized) x, calculate K coefficients

Detect if x is a face

If it is a face, who is it?
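The steps above can be sketched end to end. The slides do not spell out the face test or the identity match, so this sketch assumes two common choices: x counts as a face if it lies within a threshold distance of its reconstruction from the K coefficients, and identity is the nearest labeled neighbor in coefficient space. The threshold, the labels, and the toy data are hypothetical.

```python
import math

def project(x, mean, eigenfaces):
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(vi * ci for vi, ci in zip(v, centered)) for v in eigenfaces]

def reconstruct(coeffs, mean, eigenfaces):
    image = list(mean)
    for a, v in zip(coeffs, eigenfaces):
        for i, vi in enumerate(v):
            image[i] += a * vi
    return image

def recognize(x, mean, eigenfaces, database, face_threshold):
    """Return the label of the closest database face, or None if x is not a face."""
    coeffs = project(x, mean, eigenfaces)
    # Assumed face test: distance between x and its projection onto face space.
    residual = math.dist(x, reconstruct(coeffs, mean, eigenfaces))
    if residual > face_threshold:
        return None
    # Assumed identity test: nearest neighbor among the stored K coefficients.
    _, label = min(database, key=lambda entry: math.dist(coeffs, entry[0]))
    return label

# Toy setup: 4-pixel images, K = 2 orthonormal eigenfaces, two known people.
eigenfaces = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
mean_face = [10.0, 10.0, 10.0, 10.0]
database = [([4.0, 4.0], "alice"), ([-4.0, 2.0], "bob")]

who = recognize([14.0, 10.0, 13.0, 10.0], mean_face, eigenfaces, database, 2.0)
not_a_face = recognize([10.0, 20.0, 10.0, 0.0], mean_face, eigenfaces, database, 2.0)
print(who, not_a_face)
```

Only K numbers per database image are stored and compared, rather than every pixel.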

Object recognition

This is just the tip of the iceberg

We’ve talked about using pixel color as a feature

Many other features can be used:

edges

motion (e.g., optical flow)

object size

...

Classical object recognition techniques recover 3D information as well

given an image and a database of 3D models, determine which model(s) appear in that image

often recover 3D pose of the object as well

Summary

Things to take away from this lecture

Classifiers

Probabilistic classification

decision boundaries

learning PDFs from training images

Bayesian estimation

Principal component analysis

Eigenfaces algorithm

