Global Alignment and Structure from Motion
CSE576, Spring 2009
Sameer Agarwal

Overview
Global refinement for Image stitching
Camera calibration
Pose estimation and Triangulation
Structure from Motion

Readings
Chapter 3, Noah Snavely’s thesis
Supplementary readings:
Hartley & Zisserman, Multiple View Geometry in Computer Vision, Appendices 5 and 6.
Brown & Lowe, Recognizing Panoramas, ICCV 2003

Problem: Drift

Global optimization
Minimize a global energy function:
What are the variables?
The translation tj = (xj, yj) for each image
What is the objective function?
We have a set of matched features pi,j = (ui,j, vi,j)
For each point match (pi,j, pi,j+1):
pi,j+1 – pi,j = tj+1 – tj

Ambiguity in global location
Each of these solutions has the same error
Called the gauge ambiguity
Solution: fix the position of one image (e.g., make the origin of the 1st image (0,0))
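The translation-only alignment above is a linear least-squares problem once one image is pinned. The following is a minimal sketch, assuming each match is supplied as a tuple (j, k, d) where d is the measured feature offset p_k - p_j; the function name and input format are hypothetical and not taken from the course code.

```python
import numpy as np

def solve_translations(n_images, matches):
    """Least-squares image translations with image 0 pinned at (0, 0).

    matches: list of (j, k, d), where d = p_k - p_j is the measured offset
    of one matched feature between images j and k (hypothetical format).
    """
    rows, rhs = [], []
    for j, k, d in matches:
        row = np.zeros(n_images)
        row[k] += 1.0   # coefficient of t_k
        row[j] -= 1.0   # coefficient of t_j
        rows.append(row)
        rhs.append(d)
    # Dropping column 0 enforces t_0 = (0, 0) and removes the gauge freedom.
    A = np.array(rows)[:, 1:]
    b = np.array(rhs, dtype=float)
    t_rest, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.vstack([np.zeros((1, 2)), t_rest])

# Four images with a loop closure (0, 3); offsets are exact in this toy case.
t_true = np.array([[0.0, 0.0], [10.0, 1.0], [20.0, -1.0], [30.0, 0.0]])
pairs = [(0, 1), (1, 2), (2, 3), (0, 3)]
matches = [(j, k, t_true[k] - t_true[j]) for j, k in pairs]
t_est = solve_translations(4, matches)
```

Without the pinned image the system matrix is rank deficient, because adding the same offset to every translation leaves all residuals unchanged.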

Solving for rotations

Parameterizing rotations
How do we parameterize R and ΔR?
Euler angles: bad idea (singularities, order-dependent)
quaternions: 4-vectors on unit sphere
Axis-angle representation (Rodrigues' formula)
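As an illustration of the axis-angle parameterization, the sketch below converts a 3-vector (unit axis times angle in radians) into a rotation matrix with Rodrigues' formula, R = I + sin(theta) N + (1 - cos(theta)) N^2, where N is the cross-product matrix of the unit axis. The function name is a hypothetical choice.

```python
import numpy as np

def rodrigues(omega):
    """Rotation matrix for the axis-angle vector omega (axis * angle)."""
    omega = np.asarray(omega, dtype=float)
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)  # negligible rotation
    n = omega / theta
    # Cross-product (skew-symmetric) matrix of the unit axis.
    N = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(theta) * N + (1.0 - np.cos(theta)) * (N @ N)

# A 90-degree rotation about the z axis maps the x axis onto the y axis.
R90 = rodrigues([0.0, 0.0, np.pi / 2])
```

A small update ΔR can be built the same way from a small vector, which keeps the incremental rotation at three parameters.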

Nonlinear Least Squares

Camera Calibration

Camera calibration
Determine camera parameters from known 3D points or calibration object(s)
internal or intrinsic parameters such as focal length, optical center, aspect ratio:
what kind of camera?
external or extrinsic (pose) parameters:
where is the camera?
How can we do this?

Camera calibration – approaches
Possible approaches:
linear regression (least squares)
non-linear optimization
vanishing points
multiple planar patterns
panoramas (rotational motion)

Image formation equations

Calibration matrix
Is this form of K good enough?
non-square pixels (digital video)
skew
radial distortion

Camera matrix
Fold intrinsic calibration matrix K and extrinsic pose parameters (R, t) together into a camera matrix
M = K [R | t ]
(set the lower right-hand entry to 1, leaving 11 degrees of freedom)

Camera matrix calibration
Directly estimate 11 unknowns in the M matrix using known 3D points (Xi,Yi,Zi) and measured feature positions (ui,vi)

Camera matrix calibration
Linear regression:
Bring denominator over, solve set of (over-determined) linear equations.  How?
Least squares (pseudo-inverse)
Is this good enough?
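A minimal sketch of the linear approach: multiplying each projection equation by its denominator and fixing the lower right-hand entry of M to 1 gives two linear equations per point in the remaining 11 unknowns, which can be solved by least squares. The function name and the synthetic camera are assumptions made for the example.

```python
import numpy as np

def calibrate_linear(X, uv):
    """Estimate the 3x4 camera matrix M (last entry fixed to 1).

    X: (N, 3) known 3D points, uv: (N, 2) measured pixels, N >= 6.
    """
    rows, rhs = [], []
    for (x, y, z), (u, v) in zip(X, uv):
        # u * (m20 x + m21 y + m22 z + 1) = m00 x + m01 y + m02 z + m03
        rows.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z])
        # v * (m20 x + m21 y + m22 z + 1) = m10 x + m11 y + m12 z + m13
        rows.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z])
        rhs += [u, v]
    A = np.array(rows, dtype=float)
    b = np.array(rhs, dtype=float)
    m, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(m, 1.0).reshape(3, 4)

# Synthetic check: project random points with a known camera, then recover it.
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
M_true = K @ Rt
M_true = M_true / M_true[2, 3]
X = rng.uniform(-1.0, 1.0, size=(10, 3))
proj = (M_true @ np.column_stack([X, np.ones(10)]).T).T
uv = proj[:, :2] / proj[:, 2:3]
M_est = calibrate_linear(X, uv)
```

With noisy measurements this solution minimizes an algebraic residual, so it is usually treated as a starting point for non-linear refinement.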

Camera matrix calibration
Advantages:
very simple to formulate and solve
can recover K [R | t] from M using QR (RQ) decomposition [Golub & Van Loan 96]
Disadvantages:
doesn't compute internal parameters directly
can give poor results, since it minimizes an algebraic error rather than image-plane error
more unknowns than true degrees of freedom
need a separate camera matrix for each new view
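To illustrate recovering K, R and t from an estimated camera matrix, the sketch below factors the left 3x3 block of M into an upper-triangular matrix times an orthogonal matrix. NumPy only provides QR, so the triangular-times-orthogonal (RQ) order is obtained here by a row-reversal trick; this is one possible construction, not necessarily the one used in the course, and the function name is hypothetical.

```python
import numpy as np

def decompose_camera_matrix(M):
    """Split M ~ K [R | t] into K (upper triangular), R (rotation) and t."""
    A = M[:, :3]
    P = np.flipud(np.eye(3))           # row-reversal permutation
    Q, U = np.linalg.qr((P @ A).T)     # (P A)^T = Q U
    K = P @ U.T @ P                    # upper triangular factor
    R = P @ Q.T                        # orthogonal factor, so A = K R
    # Make the diagonal of K positive; D is its own inverse.
    D = np.diag(np.sign(np.diag(K)))
    K, R = K @ D, D @ R
    t = np.linalg.solve(K, M[:, 3])
    if np.linalg.det(R) < 0:           # M was scaled by a negative factor
        R, t = -R, -t
    return K / K[2, 2], R, t

# Synthetic camera: rotation about the y axis plus a translation.
a = 0.3
K0 = np.array([[800.0, 0.0, 320.0], [0.0, 780.0, 240.0], [0.0, 0.0, 1.0]])
R0 = np.array([[np.cos(a), 0.0, np.sin(a)],
               [0.0, 1.0, 0.0],
               [-np.sin(a), 0.0, np.cos(a)]])
t0 = np.array([0.5, -0.2, 4.0])
M = K0 @ np.column_stack([R0, t0])
M = M / M[2, 3]                        # same normalization as the slides
K_est, R_est, t_est = decompose_camera_matrix(M)
```

The overall scale and sign of M are arbitrary, which is why K is normalized by its last entry and the sign of R is checked through its determinant.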

Multi-plane calibration
Use several images of planar target held at unknown orientations [Zhang 99]
Compute plane homographies
Solve for K^{-T} K^{-1} from the H_k's
1 plane if only f unknown
2 planes if (f,uc,vc) unknown
3+ planes for full K
Code available from Zhang and OpenCV

Pose estimation and triangulation

Pose estimation
Use inter-point distance constraints
[Quan 99][Ameller 00]
Solve set of polynomial equations in x_i^{2p}
Recover R, t using Procrustes analysis.

Triangulation
Problem:  Given some points in correspondence across two or more images (taken from calibrated cameras), {(uj,vj)}, compute the 3D location X

Triangulation
Method I: intersect viewing rays in 3D, minimize \sum_j \| C_j + s_j V_j - X \|^2, where:
X is the unknown 3D point
Cj is the optical center of camera j
Vj is the viewing ray for pixel (uj,vj)
sj is unknown distance along Vj
Advantage: geometrically intuitive
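Method I has a closed-form solution. For a fixed X, the best distance along a unit-length ray is s_j = V_j . (X - C_j), so the residual is the component of X - C_j perpendicular to the ray, and setting the gradient to zero gives a 3x3 linear system. The sketch below assumes the rays are given as (center, direction) pairs; the function name is hypothetical.

```python
import numpy as np

def triangulate_rays(centers, directions):
    """Point minimizing the summed squared distance to the rays C_j + s_j V_j."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for C, V in zip(centers, directions):
        C = np.asarray(C, dtype=float)
        v = np.asarray(V, dtype=float)
        v = v / np.linalg.norm(v)
        # Projector onto the plane perpendicular to the ray direction.
        P = np.eye(3) - np.outer(v, v)
        A += P
        b += P @ C
    return np.linalg.solve(A, b)

# Three cameras looking at the same point; the rays intersect exactly.
X_true = np.array([1.0, 2.0, 5.0])
centers = [np.array([0.0, 0.0, 0.0]),
           np.array([1.0, 0.0, 0.0]),
           np.array([0.0, 1.0, 0.0])]
directions = [X_true - C for C in centers]
X_est = triangulate_rays(centers, directions)
```

The system is singular when all rays are parallel, which matches the geometric intuition that such rays have no well-defined intersection.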

Triangulation
Method II: solve linear equations in X
advantage: very simple
Method III: non-linear minimization
advantage: most accurate (image plane error)
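A minimal sketch of Method II: each observation (u, v) in a camera with matrix M gives two equations that are linear in the homogeneous point, u (m_3 . X) = m_1 . X and v (m_3 . X) = m_2 . X, where m_k are the rows of M. Stacking them and taking the smallest right singular vector gives the point. The function names and the two synthetic cameras are assumptions for the example.

```python
import numpy as np

def triangulate_linear(cameras, pixels):
    """Linear triangulation from 3x4 camera matrices and (u, v) observations."""
    rows = []
    for M, (u, v) in zip(cameras, pixels):
        rows.append(u * M[2] - M[0])
        rows.append(v * M[2] - M[1])
    # The homogeneous solution is the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(np.array(rows))
    Xh = Vt[-1]
    return Xh[:3] / Xh[3]

def project(M, X):
    """Project a 3D point with a 3x4 camera matrix."""
    x = M @ np.append(X, 1.0)
    return x[:2] / x[2]

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
X_est = triangulate_linear([M1, M2], [project(M1, X_true), project(M2, X_true)])
```

The result of this linear step is a common starting point for Method III, which refines X by minimizing error in the image plane.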

Structure from Motion

Structure from motion
Given many points in correspondence across several images, {(uij,vij)}, simultaneously compute the 3D location xi and camera (or motion) parameters (K, Rj, tj)
Two main variants: calibrated, and uncalibrated (sometimes associated with Euclidean and projective reconstructions)

Orthographic SFM
[Tomasi & Kanade, IJCV 92]

Extensions
Paraperspective
 [Poelman & Kanade, PAMI 97]
Sequential Factorization
 [Morita & Kanade, PAMI 97]
Factorization under perspective
 [Christy & Horaud, PAMI 96]
 [Sturm & Triggs, ECCV 96]
Factorization with Uncertainty
 [Anandan & Irani, IJCV 2002]

SfM objective function
Given point x and rotation and translation R, t
Minimize sum of squared reprojection errors:
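One way to write this objective in code is sketched below, assuming a single shared K, a pinhole projection K (R X + t) followed by division by depth, and observations stored in a dictionary keyed by (point index, camera index). All names and the storage layout are hypothetical choices for the example.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of a 3D point X into pixel coordinates."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

def reprojection_error(K, Rs, ts, points, observations):
    """Sum of squared reprojection errors.

    observations: dict mapping (i, j) to the measured pixel of point i
    in camera j; unobserved pairs are simply absent.
    """
    total = 0.0
    for (i, j), uv in observations.items():
        r = project(K, Rs[j], ts[j], points[i]) - np.asarray(uv, dtype=float)
        total += float(r @ r)
    return total

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
Rs = [np.eye(3), np.eye(3)]
ts = [np.zeros(3), np.array([-1.0, 0.0, 0.0])]
points = np.array([[0.0, 0.0, 4.0], [1.0, 0.0, 5.0], [0.0, 1.0, 6.0]])
observations = {(i, j): project(K, Rs[j], ts[j], points[i])
                for i in range(3) for j in range(2)}
err_exact = reprojection_error(K, Rs, ts, points, observations)

# Shifting one observation by (3, 4) pixels adds 3^2 + 4^2 to the objective.
noisy = dict(observations)
noisy[(0, 0)] = observations[(0, 0)] + np.array([3.0, 4.0])
err_noisy = reprojection_error(K, Rs, ts, points, noisy)
```

Bundle adjustment minimizes this quantity jointly over the points and the camera parameters with a non-linear least-squares solver.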

Scene reconstruction

Feature detection
Detect features using SIFT [Lowe, IJCV 2004]

Feature matching
Match features between each pair of images

Reconstruction
1. Choose two or three views to seed the reconstruction.
2. Add 3D points via triangulation.
3. Add cameras using pose estimation.
4. Run bundle adjustment.
5. Go to step 2.

Two-view structure from motion
Simpler case: can consider motion independent of structure
Let’s first consider the case where K is known
Each image point (ui,j, vi,j, 1) can be multiplied by K^{-1} to form a 3D ray
We call this the calibrated case

Notes on two-view geometry
How can we express the epipolar constraint?
Answer: there is a 3x3 matrix E such that
                  p'^T E p = 0
E is called the essential matrix

Properties of the essential matrix
p'^T E p = 0
Ep is the epipolar line associated with p
e and e' are called epipoles: Ee = 0 and E^T e' = 0
E can be solved for with 5 point matches
see Nistér, An efficient solution to the five-point relative pose problem, PAMI 2004.
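These properties can be checked numerically. The sketch below assumes the common convention that a point X in the first camera frame maps to X' = R X + t in the second, in which case E = [t]_x R, where [t]_x is the cross-product matrix of t. The slides do not state this convention, so treat it as an assumption of the example, along with the function names.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_from_pose(R, t):
    """Essential matrix for the relative pose X' = R X + t."""
    return skew(t) @ R

a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.2])
E = essential_from_pose(R, t)

# Calibrated image points (rays with unit depth) of three 3D points.
X = np.array([[0.3, -0.2, 4.0], [-1.0, 0.5, 6.0], [0.8, 0.9, 5.0]])
p = X / X[:, 2:3]
X2 = X @ R.T + t
p2 = X2 / X2[:, 2:3]
residuals = np.einsum('ni,ij,nj->n', p2, E, p)   # p'^T E p for each match

# Epipoles: each is the other camera's center seen in this camera's frame.
e = -R.T @ t
e2 = t
```

Here `residuals` holds the epipolar constraint evaluated on each correspondence, and `E @ e` and `E.T @ e2` are the two epipole conditions.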

The Fundamental matrix
If K is not known, then we use a related matrix called the Fundamental matrix, F
Called the uncalibrated case
F can be solved for linearly with eight points, or non-linearly with six or seven points
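The linear eight-point solution can be sketched as follows. Each match contributes one equation p'^T F p = 0 that is linear in the nine entries of F; the stacked system is solved by SVD and the result is projected to rank 2. The coordinate normalization step (Hartley's) is an addition not mentioned in the slides, included here because it makes the linear solve better conditioned. Function names and the synthetic scene are hypothetical.

```python
import numpy as np

def fundamental_eight_point(p1, p2):
    """Estimate F such that [p2, 1]^T F [p1, 1] = 0.

    p1, p2: (N, 2) matched pixel coordinates in images 1 and 2, N >= 8.
    """
    def normalize(p):
        # Translate to the centroid and scale to mean distance sqrt(2).
        mean = p.mean(axis=0)
        scale = np.sqrt(2.0) / np.mean(np.linalg.norm(p - mean, axis=1))
        T = np.array([[scale, 0.0, -scale * mean[0]],
                      [0.0, scale, -scale * mean[1]],
                      [0.0, 0.0, 1.0]])
        ph = np.column_stack([p, np.ones(len(p))])
        return (T @ ph.T).T, T

    q1, T1 = normalize(p1)
    q2, T2 = normalize(p2)
    # One row per match: kron(q2, q1) . vec(F) = q2^T F q1 (row-major vec).
    A = np.array([np.kron(b, a) for a, b in zip(q1, q2)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    return T2.T @ F @ T1               # undo the normalization

# Synthetic scene: 12 points seen by two cameras sharing the same K.
rng = np.random.default_rng(1)
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-1.0, 0.0, 0.1])
X = np.column_stack([rng.uniform(-2, 2, 12), rng.uniform(-2, 2, 12),
                     rng.uniform(4, 8, 12)])
x1 = X @ K.T
x2 = (X @ R.T + t) @ K.T
p1 = x1[:, :2] / x1[:, 2:3]
p2 = x2[:, :2] / x2[:, 2:3]
F = fundamental_eight_point(p1, p2)
F = F / np.linalg.norm(F)
h1 = np.column_stack([p1, np.ones(12)])
h2 = np.column_stack([p2, np.ones(12)])
residuals = np.einsum('ni,ij,nj->n', h2, F, h1)
```

With real, noisy matches the linear estimate is normally wrapped in a robust loop such as RANSAC and then refined non-linearly.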

Photo Tourism overview