Image Alignment and Stitching

Computer Vision
CSE576, Spring 2005
Richard Szeliski
Today’s lecture

Image alignment and stitching
- motion models
- cylindrical and spherical warping
- point-based alignment
- global alignment
- automated stitching (recognizing panoramas)
- ghost and parallax removal
- compositing and blending
Readings

- Szeliski & Shum, SIGGRAPH’97 (Sections 1-4)
- Szeliski, Image Alignment and Stitching, MSR-TR-2004-92 (Sections 2, 4, 5)
- Brown & Lowe, Recognizing Panoramas, ICCV’2003
Motion models

What happens when we take two images with a camera and try to align them?
- translation?
- rotation?
- scale?
- affine?
- perspective?
- … see interactive demo (VideoMosaic)
Homographies

- Perspective projection of a plane
- Lots of names for this: homography, texture-map, collineation, planar projective map
- Modeled as a 2D warp using homogeneous coordinates
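As a minimal sketch of the 2D warp, a homography can be applied by lifting each point to homogeneous coordinates, multiplying by the 3×3 matrix, and dividing by the last coordinate (the matrix `H` below is a made-up pure translation chosen only for illustration):

```python
import numpy as np

def apply_homography(H, pts):
    """Warp 2D points by a 3x3 homography using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    ones = np.ones((pts.shape[0], 1))
    homog = np.hstack([pts, ones])          # lift each point to (x, y, 1)
    warped = homog @ H.T                    # apply the 3x3 projective map
    return warped[:, :2] / warped[:, 2:3]   # divide by w to return to 2D

# A pure translation is the simplest homography: identity plus offsets.
H = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, -2.0],
              [0.0, 0.0, 1.0]])
out = apply_homography(H, [[0.0, 0.0], [10.0, 10.0]])
```

With a general `H` (nonzero bottom row), the divide by `w` is what produces the perspective foreshortening an affine warp cannot express.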
Plane perspective mosaics

- 8-parameter generalization of affine motion
- works for pure rotation or planar surfaces
- Limitations:
  - local minima
  - slow convergence
  - difficult to control interactively
Rotational mosaics

- Directly optimize rotation and focal length
- Advantages:
  - ability to build full-view panoramas
  - easier to control interactively
  - more stable and accurate estimates
3D → 2D Perspective Projection
3D Rotation Model

Projection equations:
- Project from the image to a 3D ray:
  (x0, y0, z0) = (u0 - uc, v0 - vc, f)
- Rotate the ray by the camera motion:
  (x1, y1, z1) = R01 (x0, y0, z0)
- Project back into the new (source) image:
  (u1, v1) = (f x1/z1 + uc, f y1/z1 + vc)
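The three steps above can be sketched directly in code; the pixel coordinates, focal length, and image center values used here are made-up example numbers:

```python
import numpy as np

def rotate_ray_warp(u0, v0, R01, f, uc, vc):
    """Map pixel (u0, v0) through the 3D rotation model."""
    # 1. Project from the image to a 3D ray.
    ray = np.array([u0 - uc, v0 - vc, f])
    # 2. Rotate the ray by the camera motion.
    x1, y1, z1 = R01 @ ray
    # 3. Project back into the new image.
    return f * x1 / z1 + uc, f * y1 / z1 + vc

# With the identity rotation, each pixel maps back to itself.
u1, v1 = rotate_ray_warp(120.0, 80.0, np.eye(3), f=500.0, uc=320.0, vc=240.0)
```

Note that only R01 and f appear as unknowns, which is why this model has far fewer parameters than a general homography per image pair.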
Image Mosaics (Stitching)

[Szeliski & Shum, SIGGRAPH’97]
[Szeliski, MSR-TR-2004-92]

Blend together several overlapping images into one seamless mosaic (composite)
Mosaics for Video Coding

Convert masked images into a background sprite for content-based coding
Establishing correspondences

- Direct method: use a generalization of the affine motion model [Szeliski & Shum ’97]
- Feature-based method:
  - compute feature-based correspondences [Lowe ICCV’99; Schmid ICCV’98; Brown & Lowe ICCV’2003]
  - compute R from the correspondences (absolute orientation)
Stitching demo
Panoramas

What if you want a 360° field of view?
Cylindrical panoramas

Steps:
- Reproject each image onto a cylinder
- Blend
- Output the resulting mosaic
Cylindrical Panoramas

- Map each image to cylindrical or spherical coordinates
- needs a known focal length
Cylindrical projection

Map a 3D point (X, Y, Z) onto the cylinder
Cylindrical warping

Given the focal length f and image center (xc, yc)
Spherical warping

Given the focal length f and image center (xc, yc)
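The warp equations themselves were on the slide images; one common form, sketched here under the assumption that (x, y) are pixel coordinates, maps each pixel to an angle θ around the cylinder (plus a height h) or to two angles (θ, φ) on the sphere, then scales by f to unroll:

```python
import math

def cylindrical_coords(x, y, f, xc, yc):
    """Map pixel (x, y) onto the unrolled cylinder (one common formulation)."""
    theta = math.atan2(x - xc, f)               # angle around the cylinder axis
    h = (y - yc) / math.hypot(x - xc, f)        # height along the cylinder
    return f * theta + xc, f * h + yc

def spherical_coords(x, y, f, xc, yc):
    """Map pixel (x, y) onto the unrolled sphere via angles (theta, phi)."""
    theta = math.atan2(x - xc, f)
    phi = math.atan2(y - yc, math.hypot(x - xc, f))
    return f * theta + xc, f * phi + yc

# The image center maps to itself under both warps.
cx, cy = cylindrical_coords(320.0, 240.0, 500.0, 320.0, 240.0)
sx, sy = spherical_coords(320.0, 240.0, 500.0, 320.0, 240.0)
```

This is why the focal length must be known: f sets both the ray direction per pixel and the scale of the unrolled surface.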
3D rotation

Rotate the image before placing it on the unrolled sphere
Radial distortion

Correct for “bending” in wide field-of-view lenses
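A common low-order model for this bending (not specified on the slide, but standard) scales each normalized image point by a polynomial in its squared radius; correction then inverts the map, e.g. by fixed-point iteration. The κ values below are made-up example coefficients:

```python
def distort(xn, yn, k1, k2):
    """Apply the polynomial radial distortion model to normalized coords."""
    r2 = xn * xn + yn * yn
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return xn * scale, yn * scale

def undistort(xd, yd, k1, k2, iters=10):
    """Invert the distortion by fixed-point iteration (converges for small k)."""
    xn, yn = xd, yd
    for _ in range(iters):
        r2 = xn * xn + yn * yn
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        xn, yn = xd / scale, yd / scale
    return xn, yn

xd, yd = distort(0.3, -0.2, k1=0.1, k2=0.01)
xu, yu = undistort(xd, yd, k1=0.1, k2=0.01)
```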
Fisheye lens

Extreme “bending” in ultra-wide fields of view
Inverse Warping

Get each pixel I0(u0) from its corresponding location u1 = h(u0) in I1(u1)
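A minimal sketch of inverse warping: loop over the *output* pixels, map each through h, and sample the source image (nearest-neighbor here; bilinear interpolation is the usual refinement). The shift warp and 5×5 image are made-up test data:

```python
import numpy as np

def inverse_warp(I1, h, out_shape):
    """Fill each output pixel I0(u0) by sampling I1 at u1 = h(u0)."""
    H, W = out_shape
    I0 = np.zeros((H, W), dtype=I1.dtype)
    for v0 in range(H):
        for u0 in range(W):
            u1, v1 = h(u0, v0)
            u1, v1 = int(round(u1)), int(round(v1))
            if 0 <= u1 < I1.shape[1] and 0 <= v1 < I1.shape[0]:
                I0[v0, u0] = I1[v1, u1]   # pixels mapping outside I1 stay 0
    return I0

# Shift warp: output pixel (u0, v0) samples I1 at (u0 + 2, v0 + 1).
I1 = np.arange(25, dtype=float).reshape(5, 5)
I0 = inverse_warp(I1, lambda u, v: (u + 2, v + 1), (5, 5))
```

Iterating over the destination (rather than forward-mapping the source) guarantees every output pixel gets exactly one value, with no holes or overlaps.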
Image Stitching

- Align the images over each other
  - camera pan ↔ translation on the cylinder!
- Blend the images together (demo)
Project 2 – image stitching

- Take pictures on a tripod (or handheld)
- Warp the images to spherical coordinates
- Extract features
- Align neighboring pairs using RANSAC
- Write out the list of neighboring translations
- Correct for drift
- Read in the warped images and blend them
- Crop the result and import it into a viewer
Matching features
RAndom SAmple Consensus
Least squares fit
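For the translation-on-cylinder motion model, RANSAC is especially simple: a minimal sample is a single match. The sketch below (with made-up match data, the last pair a gross outlier) repeatedly hypothesizes a translation from one match, counts inliers, and finishes with a least-squares fit over the best inlier set:

```python
import random

def ransac_translation(matches, n_iters=200, thresh=2.0, seed=0):
    """Estimate a 2D translation from ((x, y), (x', y')) match pairs."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        (x, y), (xp, yp) = rng.choice(matches)   # minimal sample: 1 match
        tx, ty = xp - x, yp - y
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - tx) < thresh
                   and abs(m[1][1] - m[0][1] - ty) < thresh]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Least-squares fit (the mean offset) over the inliers only.
    tx = sum(b[0] - a[0] for a, b in best_inliers) / len(best_inliers)
    ty = sum(b[1] - a[1] for a, b in best_inliers) / len(best_inliers)
    return (tx, ty), best_inliers

matches = [((0, 0), (10, 5)), ((1, 2), (11, 7)),
           ((3, 1), (13, 6)), ((5, 5), (15, 10)),
           ((2, 2), (40, -3))]                    # last match is an outlier
(tx, ty), inliers = ransac_translation(matches)
```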
Assembling the panorama

Stitch the pairs together, blend, then crop
Problem: Drift

Error accumulation:
- small (vertical) errors accumulate over time
- apply a correction so that the sum = 0 (for a 360° pan)
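The sum-to-zero correction can be sketched as subtracting an equal share of the accumulated error from each pairwise offset (the offset values below are made-up examples):

```python
def correct_drift(dys):
    """Distribute the loop-closure error across per-pair vertical offsets
    so that the corrected offsets sum to zero (for a 360-degree pan)."""
    drift = sum(dys)                     # total error after the full circle
    n = len(dys)
    return [dy - drift / n for dy in dys]

dys = [1.0, 2.0, -0.5, 1.5]              # hypothetical pairwise offsets
fixed = correct_drift(dys)
```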
Full-view (360° spherical) panoramas
Global alignment

- Register all pairwise overlapping images
- Use a 3D rotation model (one R per image)
- Use feature-based registration of the unwarped images
- Discover which images overlap other images using feature selection (RANSAC)
- Chain together the inter-frame rotations
- Optimize all R estimates together (next time)
3D Rotation Model

Projection equations:
- Project from the image to a 3D ray:
  (x0, y0, z0) = (u0 - uc, v0 - vc, f)
- Rotate the ray by the camera motion:
  (x1, y1, z1) = R01 (x0, y0, z0)
- Project back into the new (source) image:
  (u1, v1) = (f x1/z1 + uc, f y1/z1 + vc)
Absolute orientation

[Arun et al., PAMI 1987] [Horn et al., JOSA A 1988]
Procrustes Algorithm [Golub & Van Loan]

Given two sets of matching points, compute R:
- pi’ = R pi, with 3D rays pi = (xi, yi, zi) = (ui - uc, vi - vc, f)
- A = Σi pi pi’T = (Σi pi piT) RT = U S VT = (U S UT) RT
- so VT = UT RT
- hence R = V UT
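The slide's recipe translates almost line-for-line into code: accumulate A, take its SVD, and read off R = V Uᵀ. The three-ray test data and the reflection guard are additions for illustration:

```python
import numpy as np

def absolute_orientation(P, Pp):
    """Compute rotation R with Pp ≈ R P (3D rays as columns), via SVD."""
    A = P @ Pp.T                      # A = sum_i p_i p_i'^T
    U, S, Vt = np.linalg.svd(A)       # A = U S V^T
    R = Vt.T @ U.T                    # R = V U^T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R

# Recover a known rotation about the z-axis from three rays.
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
P = np.array([[1.0, 0.0, 0.3],
              [0.0, 1.0, -0.2],
              [1.0, 1.0, 1.0]])       # three rays as columns
R_est = absolute_orientation(P, R_true @ P)
```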
Stitching demo
Texture Mapped Model (sphere)
Texture Mapped Model (cubical)
Recognizing Panoramas

Matthew Brown & David Lowe, ICCV’2003
Finding the panoramas
Fully automated 2D stitching
Get your own copy!
System components

- Feature detection and description
  - more uniform point density
- Fast matching (hash table)
- RANSAC filtering of matches
- Intensity-based verification
- Incremental bundle adjustment

[Brown, Szeliski, Winder, CVPR’05]
Probabilistic Feature Matching
RANSAC motion model
Probabilistic model for verification
How well does this work?

Test on 100s of examples…

…still too many failures (5-10%) for a consumer application
Matching Mistakes: False Positive
Matching Mistakes: False Negative

Moving objects: large areas of disagreement
Matching Mistakes

- Accidental alignment:
  - repeated / similar regions
- Failed alignments:
  - moving objects / parallax
  - low overlap
  - “feature-less” regions (more variety?)
- No 100% reliable algorithm?
How can we fix these?

- Tune the feature detector
- Tune the feature matcher (cost metric)
- Tune the RANSAC stage (motion model)
- Tune the verification stage
- Use “higher-level” knowledge, e.g., typical camera motions
- → Sounds like a big “learning” problem
  - Need a large training/test data set (panoramas)
Deghosting and blending

Local alignment (deghosting)

Use local optic flow to compensate for small motions [Shum & Szeliski, ICCV’98]

Local alignment (deghosting)

Use local optic flow to compensate for radial distortion [Shum & Szeliski, ICCV’98]
Image feathering

- Weight each image proportional to its distance from the edge (distance map [Danielsson, CVGIP 1980])
- Cut out the appropriate region from each image and then blend together
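A minimal sketch of feathering, using a simple two-pass Manhattan distance transform as a stand-in for Danielsson's algorithm; the 1×5 images and masks are made-up test data:

```python
import numpy as np

def edge_distance(mask):
    """Manhattan distance from each valid pixel to the nearest invalid one
    (a simple two-pass stand-in for a proper distance map)."""
    d = np.where(mask, np.inf, 0.0)
    H, W = d.shape
    for i in range(H):                       # forward chamfer pass
        for j in range(W):
            if i > 0:
                d[i, j] = min(d[i, j], d[i - 1, j] + 1)
            if j > 0:
                d[i, j] = min(d[i, j], d[i, j - 1] + 1)
    for i in range(H - 1, -1, -1):           # backward chamfer pass
        for j in range(W - 1, -1, -1):
            if i < H - 1:
                d[i, j] = min(d[i, j], d[i + 1, j] + 1)
            if j < W - 1:
                d[i, j] = min(d[i, j], d[i, j + 1] + 1)
    return d

def feather_blend(images, masks):
    """Weight each image by its distance from the edge of its valid region."""
    weights = [edge_distance(m) for m in masks]
    wsum = np.maximum(sum(weights), 1e-12)
    return sum(w * im for w, im in zip(weights, images)) / wsum

im1 = np.full((1, 5), 10.0)
im2 = np.full((1, 5), 20.0)
m1 = np.array([[True, True, True, False, False]])
m2 = np.array([[False, False, True, True, True]])
blended = feather_blend([im1, im2], [m1, m2])
```

Each image's weight ramps down to zero at its boundary, so the transition across the overlap is gradual rather than a hard seam.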
Region-based de-ghosting

Select only one image in regions-of-difference using weighted vertex cover [Uyttendaele et al., CVPR’01]
Cutout-based de-ghosting

- Select only one image per output pixel, using spatial continuity
- Blend across the seams using gradient continuity (“Poisson blending”) [Agarwala et al., SG’2004]
Cutout-based compositing

Photomontage [Agarwala et al., SG’2004]

Interactively blend different images:
- group portraits
- focus settings
- people’s faces
Final thought: What is a “panorama”?

- Tracking a subject
- Repeated (best) shots
- Multiple exposures
- “Infer” what the photographer wants?