| Project1 due Tuesday |
| Today’s Readings | |||
| Trucco & Verri, 8.3 – 8.4 (skip 8.3.3, read only top half of p. 199) | |||
| Supplemental: | |||
| J. R. Bergen, P. Anandan, K. J. Hanna, and R. Hingorani. Hierarchical model-based motion estimation. European Conf. on Computer Vision (ECCV), 1992 | |||
| http://www.cs.washington.edu/education/courses/576/03sp/readings/bergen_eccv92.pdf | |||
| Numerical Recipes (Newton-Raphson), 9.4 (first four pages) | |||
| http://www.ulib.org/webRoot/Books/Numerical_Recipes/bookcpdf/c9-4.pdf | |||
| Lots of uses | ||
| Track object behavior | ||
| Correct for camera jitter (stabilization) | ||
| Align images (mosaics) | ||
| 3D shape reconstruction | ||
| Special effects | ||
Problem definition: optical flow
| How to estimate pixel motion from image H to image I? |
Optical flow constraints (grayscale images)
| Let’s look at these constraints more closely |
| Combining these two equations |
| Q: how many unknowns and equations per pixel? |
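The equations on these slides did not survive as text. The following is a sketch of the standard derivation, assuming the usual notation: H is the first image, I is the second, and (u, v) is the displacement of the pixel at (x, y). The symbols I_x, I_y, and I_t (spatial and temporal derivatives) are assumed names, not taken from the slides.

```latex
% Brightness constancy: a pixel keeps its intensity as it moves
H(x, y) = I(x + u,\, y + v)

% Small motion: first-order Taylor expansion of I around (x, y)
I(x + u,\, y + v) \approx I(x, y) + I_x\, u + I_y\, v

% Combining the two, with I_t = I(x, y) - H(x, y):
0 \approx I_t + I_x\, u + I_y\, v \;=\; I_t + \nabla I \cdot \begin{bmatrix} u & v \end{bmatrix}^{T}
```

Under these assumptions each pixel contributes one equation in two unknowns (u, v), so the problem is underconstrained at a single pixel.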
| Basic idea: assume motion field is smooth | |||
| Horn & Schunck: add smoothness term | |||
| Lucas & Kanade: assume locally constant motion | |||
| pretend the pixel’s neighbors have the same (u,v) | |||
| If we use a 5x5 window, that gives us 25 equations per pixel! | |||
| works better in practice than Horn & Schunck | |||
| Many other methods exist. Here’s an overview: | |||
| Barron, J. L., Fleet, D. J., and Beauchemin, S. S. Performance of optical flow techniques. International Journal of Computer Vision, 12(1):43-77, 1994. | |||
| http://www.cs.washington.edu/education/courses/576/03sp/readings/barron92performance.pdf | |||
| How to get more equations for a pixel? | ||||
| Basic idea: impose additional constraints | ||||
| most common is to assume that the flow field is smooth locally | ||||
| one method: pretend the pixel’s neighbors have the same (u,v) | ||||
| If we use a 5x5 window, that gives us 25 equations per pixel! | ||||
| RGB version: each pixel in the window contributes one equation per color channel | ||||
| If we use a 5x5 window, that gives us 25*3 = 75 equations per pixel! | ||||
| Problem: we have more equations than unknowns |
| Optimal (u, v) satisfies Lucas-Kanade equation | ||
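A minimal NumPy sketch of the windowed least-squares solve. It assumes the usual setup in which each row of A holds the gradient (I_x, I_y) at one window pixel and b holds the negated temporal differences -I_t, so that the optimal d = (u, v) solves (A^T A) d = A^T b. The function name `lucas_kanade_window` and its parameters are illustrative and do not come from the lecture.

```python
import numpy as np

def lucas_kanade_window(H, I, row, col, half=2):
    """Estimate (u, v) at (row, col) from a (2*half+1) x (2*half+1) window.

    Convention assumed here: H(x, y) = I(x + u, y + v), with x along
    columns and y along rows.
    """
    H = H.astype(float)
    I = I.astype(float)
    # Spatial gradients (central differences) and temporal difference.
    Iy, Ix = np.gradient(I)
    It = I - H
    win = (slice(row - half, row + half + 1),
           slice(col - half, col + half + 1))
    # One equation per window pixel: Ix*u + Iy*v = -It
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    # Normal equations; np.linalg.solve raises LinAlgError if A^T A is singular.
    d = np.linalg.solve(A.T @ A, A.T @ b)
    return d[0], d[1]
```

With `half=2` this uses the 5x5 window from the slides, giving 25 equations for the 2 unknowns.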
| This is a two image problem BUT | |||
| Can measure sensitivity by just looking at one of the images! | |||
| This tells us which pixels are easy to track, which are hard | |||
| very useful later on when we do feature tracking... | |||
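The point that sensitivity can be measured from one image follows from A^T A depending only on spatial gradients. As a hedged illustration, the sketch below scores a pixel by the smaller eigenvalue of A^T A. This is one common choice and is assumed here, since the slides do not name a specific score. The function name `min_eigenvalue` is hypothetical.

```python
import numpy as np

def min_eigenvalue(image, row, col, half=2):
    """Smaller eigenvalue of A^T A for the window centered at (row, col)."""
    Iy, Ix = np.gradient(image.astype(float))
    win = (slice(row - half, row + half + 1),
           slice(col - half, col + half + 1))
    gx, gy = Ix[win].ravel(), Iy[win].ravel()
    # A^T A is the 2x2 matrix of summed gradient products over the window.
    ATA = np.array([[gx @ gx, gx @ gy],
                    [gx @ gy, gy @ gy]])
    # eigvalsh returns eigenvalues in ascending order for symmetric matrices.
    return np.linalg.eigvalsh(ATA)[0]

# Three toy patches: flat, a vertical edge, and a corner.
flat = np.zeros((11, 11))
edge = np.zeros((11, 11)); edge[:, 6:] = 1.0
corner = np.zeros((11, 11)); corner[6:, 6:] = 1.0
```

In this sketch the flat patch and the edge both score near zero (no gradient, or gradient in only one direction), while the corner scores well above zero, which matches the idea that corners are easy to track and edges are not.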
| What are the potential causes of errors in this procedure? | ||
| Suppose A^T A is easily invertible | ||
| Suppose there is not much noise in the image | ||
| Recall our small motion assumption |
Revisiting the small motion assumption
| Is this motion small enough? | ||
| Probably not—it’s much larger than one pixel (2nd order terms dominate) | ||
| How might we solve this problem? | ||
Coarse-to-fine optical flow estimation
| L-K minimizes a sum-of-squares error metric | ||
| least squares techniques overly sensitive to outliers | ||
| Robust Horn & Schunck | |
| Robust Lucas & Kanade |
| Suppose we have more than two images | ||
| How to track a point through all of the images? | ||
| Feature tracking | ||
| Compute optical flow for that feature for each consecutive H, I | ||
| L-K requires small motion | |||
| If the motion is much more than a pixel, use discrete search instead | |||
| Given feature window W in H, find best matching window in I | |||
| Minimize sum squared difference (SSD) of pixels in window | |||
| Solve by doing a search over a specified range of (u,v) values | |||
| this (u,v) range defines the search window | |||
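The discrete search described above can be sketched as a brute-force loop over integer displacements. The function name `ssd_search` and the parameters `half` (window half-width) and `radius` (search range) are assumptions for illustration.

```python
import numpy as np

def ssd_search(H, I, row, col, half=2, radius=4):
    """Return the integer (u, v) in [-radius, radius]^2 minimizing the SSD
    between the window around (row, col) in H and the shifted window in I."""
    template = H[row - half:row + half + 1,
                 col - half:col + half + 1].astype(float)
    best_ssd, best_uv = np.inf, (0, 0)
    for v in range(-radius, radius + 1):        # vertical displacement (rows)
        for u in range(-radius, radius + 1):    # horizontal displacement (cols)
            r, c = row + v, col + u
            # Skip candidates whose window would leave the image.
            if (r - half < 0 or c - half < 0 or
                    r + half >= I.shape[0] or c + half >= I.shape[1]):
                continue
            cand = I[r - half:r + half + 1,
                     c - half:c + half + 1].astype(float)
            ssd = np.sum((cand - template) ** 2)
            if ssd < best_ssd:
                best_ssd, best_uv = ssd, (u, v)
    return best_uv
```

The cost grows with the square of `radius`, which is why the search window is usually kept small or seeded with a prediction.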
| Feature tracking with m frames | ||
| 1. Select features in first frame | ||
| 2. Given feature in frame i, compute position in i+1 | ||
| 3. Select more features if needed | ||
| 4. i = i + 1 | ||
| 5. If i < m, go to step 2 | ||
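The loop above can be sketched as follows. The per-feature tracker `track_one` and the optional `select_more` callback are hypothetical placeholders for whatever method is used (for example Lucas-Kanade or a discrete SSD search); neither name comes from the lecture.

```python
def track_sequence(frames, features, track_one, select_more=None):
    """Track features through a list of m frames.

    frames:      list of images
    features:    list of (x, y) positions selected in frames[0]
    track_one:   callable (prev_frame, next_frame, xy) -> new xy, or None if lost
    select_more: optional callable (frame, current_features) -> list of new (x, y)
    Returns one list of feature positions per frame.
    """
    tracks = [list(features)]
    for i in range(len(frames) - 1):
        current = []
        for xy in tracks[-1]:
            new_xy = track_one(frames[i], frames[i + 1], xy)
            if new_xy is not None:          # drop features that were lost
                current.append(new_xy)
        if select_more is not None:         # top up the feature set if needed
            current.extend(select_more(frames[i + 1], current))
        tracks.append(current)
    return tracks
```

This sketch keeps only positions per frame; a fuller tracker would also carry an identity for each feature so that a dropped feature does not shift the others.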
| Idea | |||
| Can get better performance if we know something about the way points move | |||
| Most approaches assume constant velocity | |||
| or constant acceleration | |||
| Use above to predict position in next frame, initialize search | |||
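A small sketch of the two motion models, assuming positions are sampled at equal time steps. The function names are illustrative. The predicted position would then be used as the center of the search window in the next frame.

```python
def predict_constant_velocity(track):
    """Extrapolate from the last two positions: p_next = p1 + (p1 - p0)."""
    (x0, y0), (x1, y1) = track[-2], track[-1]
    return (2 * x1 - x0, 2 * y1 - y0)

def predict_constant_acceleration(track):
    """Extrapolate from the last three positions, keeping the change in
    velocity fixed: p_next = 3*p2 - 3*p1 + p0."""
    (x0, y0), (x1, y1), (x2, y2) = track[-3], track[-2], track[-1]
    return (3 * x2 - 3 * x1 + x0, 3 * y2 - 3 * y1 + y0)
```

For example, a point seen at (0, 0) and then (1, 2) is predicted at (2, 4) under constant velocity.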
| http://www.toulouse.ca/?/CamTracker/?/CamTracker/FeatureTracking.html | |
| Goal: estimate single (u,v) translation for entire image | ||
| Easier subcase: solvable by pyramid-based Lucas-Kanade | ||
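A hedged sketch of coarse-to-fine estimation for this single-translation case. It makes several simplifying assumptions that are not from the lecture: the pyramid uses 2x2 block averaging instead of Gaussian filtering, warping uses bilinear interpolation with border clamping, and every interior pixel contributes one equation to a single least-squares solve. All function names are illustrative.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] +
                   img[0::2, 1::2] + img[1::2, 1::2])

def warp(img, u, v):
    """Return W with W(x, y) = img(x + u, y + v), bilinear, clamped at borders."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    x = np.clip(xs + u, 0, w - 1)
    y = np.clip(ys + v, 0, h - 1)
    x0 = np.minimum(np.floor(x).astype(int), w - 2)
    y0 = np.minimum(np.floor(y).astype(int), h - 2)
    fx, fy = x - x0, y - y0
    return ((1 - fx) * (1 - fy) * img[y0, x0] + fx * (1 - fy) * img[y0, x0 + 1] +
            (1 - fx) * fy * img[y0 + 1, x0] + fx * fy * img[y0 + 1, x0 + 1])

def lk_translation(H, I, border):
    """One Lucas-Kanade step for a global (u, v), ignoring a border strip."""
    Iy, Ix = np.gradient(I)
    It = I - H
    s = (slice(border, -border), slice(border, -border))
    A = np.stack([Ix[s].ravel(), Iy[s].ravel()], axis=1)
    d, *_ = np.linalg.lstsq(A, -It[s].ravel(), rcond=None)
    return d

def coarse_to_fine(H, I, levels=3, iters=3):
    """Estimate (u, v) such that H(x, y) ~ I(x + u, y + v)."""
    pyramid = [(H.astype(float), I.astype(float))]
    for _ in range(levels - 1):
        Hl, Il = pyramid[-1]
        pyramid.append((downsample(Hl), downsample(Il)))
    d = np.zeros(2)
    for Hl, Il in reversed(pyramid):       # coarsest level first
        d = 2.0 * d                        # carry the estimate to the finer level
        for _ in range(iters):
            border = int(np.ceil(np.abs(d).max())) + 2   # skip clamped pixels
            warped = warp(Il, d[0], d[1])  # undo the motion estimated so far
            d = d + lk_translation(Hl, warped, border)   # add the residual motion
    return d
```

The design choice is that each level only has to recover a residual motion of about a pixel or less, which is where the first-order approximation behind Lucas-Kanade is reasonable.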