Project 1 due Tuesday
Today’s Readings
Trucco & Verri, 8.3 – 8.4 (skip 8.3.3, read only top half of p. 199)
Supplemental:
J. R. Bergen, P. Anandan, K. J. Hanna, and R. Hingorani. Hierarchical model-based motion estimation. European Conf. on Computer Vision (ECCV), 1992
http://www.cs.washington.edu/education/courses/576/03sp/readings/bergen_eccv92.pdf
Numerical Recipes (Newton-Raphson), 9.4 (first four pages)
http://www.ulib.org/webRoot/Books/Numerical_Recipes/bookcpdf/c9-4.pdf
Lots of uses
Track object behavior
Correct for camera jitter (stabilization)
Align images (mosaics)
3D shape reconstruction
Special effects
Problem definition: optical flow
How to estimate pixel motion from image H to image I?
Optical flow constraints (grayscale images)
Let’s look at these constraints more closely
Combining these two equations
Q: how many unknowns and equations per pixel?
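The two constraints and their combination can be written out as below. This is a sketch of the standard derivation, not a copy of the slide equations; the symbols I_x, I_y (spatial derivatives of I) and I_t (the difference I - H) are notation assumed here.

```latex
\begin{align}
H(x, y) &= I(x + u,\, y + v)
  && \text{(brightness constancy)} \\
I(x + u,\, y + v) &\approx I(x, y) + I_x u + I_y v
  && \text{(small motion, first-order Taylor expansion)} \\
0 &\approx \underbrace{I(x, y) - H(x, y)}_{I_t} + I_x u + I_y v
  && \text{(combining the two)} \\
0 &\approx I_t + \nabla I \cdot (u, v)^{T}
\end{align}
```

Under this form each pixel contributes one equation in the two unknowns (u, v), so a single pixel cannot determine its own flow.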
Basic idea: assume motion field is smooth
Horn & Schunck: add smoothness term
Lucas & Kanade: assume locally constant motion
pretend the pixel’s neighbors have the same (u,v)
If we use a 5x5 window, that gives us 25 equations per pixel!
works better in practice than Horn & Schunck
Many other methods exist. Here’s an overview:
Barron, J.L., Fleet, D.J., and Beauchemin, S., Performance of optical flow techniques, International Journal of Computer Vision, 12(1):43-77, 1994.
http://www.cs.washington.edu/education/courses/576/03sp/readings/barron92performance.pdf
How to get more equations for a pixel?
Basic idea: impose additional constraints
most common is to assume that the flow field is smooth locally
one method: pretend the pixel’s neighbors have the same (u,v)
If we use a 5x5 window, that gives us 25 equations per pixel!
How to get more equations for a pixel?
Basic idea: impose additional constraints
most common is to assume that the flow field is smooth locally
one method: pretend the pixel’s neighbors have the same (u,v)
If we use a 5x5 window on a three-channel (RGB) image, that gives us 25*3 equations per pixel!
Problem: we have more equations than unknowns
Optimal (u, v) satisfies Lucas-Kanade equation
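A minimal sketch of the least-squares solve for one window, assuming the usual normal-equations form (A^T A) d = A^T b, where each row of A is [I_x, I_y] at one pixel and b holds -I_t. The function name `lucas_kanade_window`, the use of `np.gradient` for derivatives, and the window parameter `half` are illustrative choices, not taken from the slides.

```python
import numpy as np

def lucas_kanade_window(H, I, row, col, half=2):
    """Estimate (u, v) at pixel (row, col) from a (2*half+1)^2 window.

    Assumes H(x, y) = I(x + u, y + v) with small (sub-pixel) motion.
    """
    H = H.astype(np.float64)
    I = I.astype(np.float64)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (cols, x).
    Iy, Ix = np.gradient(I)
    It = I - H  # temporal difference
    win = (slice(row - half, row + half + 1),
           slice(col - half, col + half + 1))
    # One equation per window pixel: Ix*u + Iy*v = -It.
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    # Solve the 2x2 normal equations (A^T A) d = A^T b.
    # This raises LinAlgError if A^T A is singular (e.g. a flat region).
    return np.linalg.solve(A.T @ A, A.T @ b)  # array [u, v]

# Usage: a Gaussian blob shifted by a sub-pixel amount.
X, Y = np.meshgrid(np.arange(21.0), np.arange(21.0))
H_img = np.exp(-((X - 10) ** 2 + (Y - 10) ** 2) / 32.0)
I_img = np.exp(-((X - 10.3) ** 2 + (Y - 9.8) ** 2) / 32.0)  # u = 0.3, v = -0.2
u_est, v_est = lucas_kanade_window(H_img, I_img, 10, 10)
```

With a 5x5 window, A has 25 rows and 2 columns, so the system is overdetermined and the solve returns the least-squares (u, v).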
This is a two image problem BUT
Can measure sensitivity by just looking at one of the images!
This tells us which pixels are easy to track, which are hard
very useful later on when we do feature tracking...
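One way to see this: the matrix A^T A is built only from the spatial gradients of a single image. The sketch below scores each pixel by the smaller eigenvalue of its windowed A^T A; using the smaller eigenvalue as the score is one common choice assumed here, and the name `min_eigenvalue_map` is hypothetical.

```python
import numpy as np

def min_eigenvalue_map(I, half=2):
    """Smaller eigenvalue of the windowed A^T A at every interior pixel."""
    I = I.astype(np.float64)
    Iy, Ix = np.gradient(I)  # derivatives along rows (y), then columns (x)
    rows, cols = I.shape
    lam_min = np.zeros((rows, cols))
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            win = (slice(r - half, r + half + 1),
                   slice(c - half, c + half + 1))
            gx, gy = Ix[win].ravel(), Iy[win].ravel()
            ATA = np.array([[gx @ gx, gx @ gy],
                            [gx @ gy, gy @ gy]])
            # eigvalsh returns eigenvalues in ascending order.
            lam_min[r, c] = np.linalg.eigvalsh(ATA)[0]
    return lam_min

# Usage: a bright quadrant whose corner sits at (8, 8).
img = np.zeros((20, 20))
img[8:, 8:] = 1.0
score = min_eigenvalue_map(img)
```

In this toy image the score is large at the corner, near zero along the straight edge (gradients all point the same way), and zero in flat regions, which matches the intuition about which pixels are easy or hard to track.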
What are the potential causes of errors in this procedure?
Suppose A^T A is easily invertible
Suppose there is not much noise in the image
Recall our small motion assumption
Revisiting the small motion assumption
Is this motion small enough?
Probably not: it’s much larger than one pixel (2nd order terms dominate)
How might we solve this problem?
Coarse-to-fine optical flow estimation
L-K minimizes a sum-of-squares error metric
least squares techniques are overly sensitive to outliers
Robust Horn & Schunck
Robust Lucas & Kanade
Suppose we have more than two images
How to track a point through all of the images?
Feature tracking
Compute optical flow for that feature for each consecutive pair of images H, I
L-K requires small motion
If the motion is much more than a pixel, use discrete search instead
Given feature window W in H, find best matching window in I
Minimize the sum of squared differences (SSD) of pixels in the window
Solve by doing a search over a specified range of (u,v) values
this (u,v) range defines the search window
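The discrete search can be sketched as a brute-force loop over integer offsets. This is an illustrative version only: the function name `ssd_search` and the parameters `half` (window half-size) and `search` (maximum offset in each direction) are assumptions, not names from the slides.

```python
import numpy as np

def ssd_search(H, I, row, col, half=2, search=4):
    """Find the integer (u, v) minimizing the SSD between the window W
    centered at (row, col) in H and the window at (row + v, col + u) in I."""
    H = H.astype(np.float64)
    I = I.astype(np.float64)
    W = H[row - half:row + half + 1, col - half:col + half + 1]
    best_ssd, best_uv = np.inf, (0, 0)
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            r, c = row + v, col + u
            # Skip candidate windows that fall outside image I.
            if (r - half < 0 or c - half < 0 or
                    r + half + 1 > I.shape[0] or c + half + 1 > I.shape[1]):
                continue
            cand = I[r - half:r + half + 1, c - half:c + half + 1]
            ssd = float(np.sum((cand - W) ** 2))
            if ssd < best_ssd:
                best_ssd, best_uv = ssd, (u, v)
    return best_uv, best_ssd

# Usage: shift a random texture by 2 rows and 3 columns.
rng = np.random.default_rng(0)
H_img = rng.random((30, 30))
I_img = np.roll(H_img, shift=(2, 3), axis=(0, 1))
uv, ssd = ssd_search(H_img, I_img, 15, 15)
```

The cost grows with the square of the search range, which is why the range is kept small and, where possible, centered on a predicted position.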
Feature tracking with m frames
1. Select features in first frame
2. Given feature in frame i, compute position in i+1
3. Select more features if needed
4. i = i + 1
5. If i < m, go to step 2
Idea
Can get better performance if we know something about the way points move
Most approaches assume constant velocity
or constant acceleration
Use above to predict position in next frame, initialize search
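As a small illustration of the prediction step, the two motion models reduce to simple extrapolation from the last two or three tracked positions. The function names below are hypothetical, and positions are assumed to be (x, y) tuples sampled at equal time steps.

```python
def predict_constant_velocity(p_prev, p_curr):
    """Next position assuming the last frame-to-frame velocity repeats."""
    return tuple(2 * c - p for p, c in zip(p_prev, p_curr))

def predict_constant_acceleration(p0, p1, p2):
    """Next position assuming the change in velocity repeats.

    velocity = p2 - p1, acceleration = (p2 - p1) - (p1 - p0),
    so next = p2 + velocity + acceleration = 3*p2 - 3*p1 + p0.
    """
    return tuple(3 * c - 3 * b + a for a, b, c in zip(p0, p1, p2))

# Usage: the prediction becomes the center of the search window.
cv = predict_constant_velocity((0, 0), (1, 2))
ca = predict_constant_acceleration((0, 0), (1, 1), (4, 4))
```

Starting the search at the predicted position lets the search window stay small even when the motion between frames is large.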
http://www.toulouse.ca/?/CamTracker/?/CamTracker/FeatureTracking.html
Goal: estimate single (u,v) translation for entire image
Easier subcase: solvable by pyramid-based Lucas-Kanade
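A rough sketch of this subcase, assuming a simple pyramid built by 2x2 averaging, bilinear warping, and a few Lucas-Kanade iterations per level over all interior pixels. All function names and the parameters `levels`, `iters`, and `margin` are illustrative assumptions; a practical implementation would typically use a smoother pyramid filter and handle image borders more carefully.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (a crude pyramid level)."""
    r, c = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:r, :c]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] +
                   img[0::2, 1::2] + img[1::2, 1::2])

def warp(img, u, v):
    """Sample img at (x + u, y + v) with bilinear interpolation (clamped)."""
    rows, cols = img.shape
    y, x = np.mgrid[0:rows, 0:cols].astype(np.float64)
    xs = np.clip(x + u, 0, cols - 1)
    ys = np.clip(y + v, 0, rows - 1)
    x0, y0 = np.floor(xs).astype(int), np.floor(ys).astype(int)
    x1, y1 = np.minimum(x0 + 1, cols - 1), np.minimum(y0 + 1, rows - 1)
    fx, fy = xs - x0, ys - y0
    return ((1 - fx) * (1 - fy) * img[y0, x0] + fx * (1 - fy) * img[y0, x1] +
            (1 - fx) * fy * img[y1, x0] + fx * fy * img[y1, x1])

def global_translation(H, I, levels=3, iters=5, margin=2):
    """Estimate one (u, v) with H(x, y) ~ I(x + u, y + v), coarse to fine."""
    pyr = [(H.astype(np.float64), I.astype(np.float64))]
    for _ in range(levels - 1):
        h, i = pyr[-1]
        pyr.append((downsample(h), downsample(i)))
    u = v = 0.0
    inner = (slice(margin, -margin), slice(margin, -margin))
    for h, i in reversed(pyr):       # coarsest level first
        u, v = 2 * u, 2 * v          # rescale the estimate to this level
        for _ in range(iters):
            iw = warp(i, u, v)       # undo the motion estimated so far
            Iy, Ix = np.gradient(iw)
            It = iw - h
            A = np.stack([Ix[inner].ravel(), Iy[inner].ravel()], axis=1)
            du, dv = np.linalg.solve(A.T @ A, -A.T @ It[inner].ravel())
            u, v = u + du, v + dv    # only a small residual motion remains
    return u, v

# Usage: two smooth blobs translated by (u, v) = (5, -3) pixels.
X, Y = np.meshgrid(np.arange(64.0), np.arange(64.0))
def blobs(x, y):
    return (np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (2 * 8.0 ** 2)) +
            0.5 * np.exp(-((x - 24) ** 2 + (y - 40) ** 2) / (2 * 7.0 ** 2)))
H_img, I_img = blobs(X, Y), blobs(X - 5.0, Y + 3.0)
u_est, v_est = global_translation(H_img, I_img)
```

At the coarsest of three levels a 5-pixel motion shrinks to about 1.25 pixels, which brings it back within reach of the small motion assumption; each finer level then only has to correct a small residual.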