Global Alignment and Structure from Motion
CSE576, Spring 2009
Sameer Agarwal
Overview
- Global refinement for image stitching
- Camera calibration
- Pose estimation and triangulation
- Structure from motion
Readings
- Chapter 3, Noah Snavely’s thesis
- Supplementary readings:
  - Hartley & Zisserman, Multiple View Geometry, Appendices 5 and 6.
  - Brown & Lowe, Recognizing Panoramas, ICCV 2003
Problem: Drift
Global optimization
- Minimize a global energy function:
  - What are the variables?
    - The translation $t_j = (x_j, y_j)$ for each image
  - What is the objective function?
    - We have a set of matched features $p_{i,j} = (u_{i,j}, v_{i,j})$
    - For each point match $(p_{i,j}, p_{i,j+1})$: $p_{i,j+1} - p_{i,j} = t_{j+1} - t_j$
Ambiguity in global location
- Each of these solutions has the same error
- Called the gauge ambiguity
- Solution: fix the position of one image (e.g., make the origin of the 1st image (0,0))
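The translation-only alignment above can be sketched as a linear least-squares problem. This is a minimal illustration, not the lecture's implementation: the function name `solve_translations` and the `(j, k, p_j, p_k)` match format are assumptions made for the example. Each match contributes one row of the constraint $p_k - p_j = t_k - t_j$ (the sign convention of the slide), and the gauge ambiguity is removed by pinning the first image at (0,0).

```python
import numpy as np

def solve_translations(num_images, matches):
    """Estimate one 2D translation per image from pairwise feature matches.

    matches: iterable of (j, k, p_j, p_k), where p_j and p_k are the (u, v)
    positions of the same feature in images j and k (hypothetical format).
    """
    rows, rhs = [], []
    for j, k, p_j, p_k in matches:
        row = np.zeros(num_images)
        row[k], row[j] = 1.0, -1.0          # encodes t_k - t_j
        rows.append(row)
        rhs.append(np.asarray(p_k, float) - np.asarray(p_j, float))
    # Fix the gauge: drop the column of image 0, i.e. force t_0 = (0, 0).
    A = np.array(rows)[:, 1:]
    b = np.array(rhs)
    t_rest, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.vstack([np.zeros((1, 2)), t_rest])
```

Without the dropped column the system is rank deficient: adding the same offset to every translation leaves all residuals unchanged.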
Solving for rotations
Parameterizing rotations
- How do we parameterize R and ΔR?
  - Euler angles: bad idea
  - quaternions: 4-vectors on unit sphere
  - Axis-angle representation (Rodrigues' formula)
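As an illustration of the axis-angle option, the sketch below converts a 3-vector whose direction is the rotation axis and whose length is the angle (in radians) into a rotation matrix using Rodrigues' formula. The function name `rodrigues` is chosen for this example only.

```python
import numpy as np

def rodrigues(omega):
    """Rotation matrix for the axis-angle vector omega (angle = |omega|)."""
    omega = np.asarray(omega, float)
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)                    # no rotation
    n = omega / theta                       # unit rotation axis
    # Cross-product (skew-symmetric) matrix of the axis.
    N = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(theta) * N + (1.0 - np.cos(theta)) * (N @ N)
```

A small update ΔR can be parameterized the same way, with a short vector `omega` as the three unknowns.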
Nonlinear Least Squares
Camera Calibration
Camera calibration
- Determine camera parameters from known 3D points or calibration object(s)
  - internal or intrinsic parameters such as focal length, optical center, aspect ratio: what kind of camera?
  - external or extrinsic (pose) parameters: where is the camera?
- How can we do this?
Camera calibration – approaches
- Possible approaches:
  - linear regression (least squares)
  - non-linear optimization
  - vanishing points
  - multiple planar patterns
  - panoramas (rotational motion)
Image formation equations
Calibration matrix
- Is this form of K good enough?
  - non-square pixels (digital video)
  - skew
  - radial distortion
Camera matrix
- Fold intrinsic calibration matrix K and extrinsic pose parameters (R, t) together into a camera matrix
- $M = K [R \mid t]$
- (put 1 in lower right-hand corner for 11 d.o.f.)
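A minimal sketch of how the camera matrix is assembled and used to project a 3D point. The helper names `camera_matrix` and `project` are assumptions for this example, and the values of K in the usage note are made up.

```python
import numpy as np

def camera_matrix(K, R, t):
    """Build the 3x4 matrix M = K [R | t]."""
    return K @ np.hstack([R, np.reshape(t, (3, 1))])

def project(M, X):
    """Project a 3D point X to pixel coordinates (u, v)."""
    x = M @ np.append(np.asarray(X, float), 1.0)   # homogeneous image point
    return x[:2] / x[2]                            # perspective division
```

For example, with focal length 500 and optical center (320, 240), a point on the optical axis projects to the optical center. Because the projection divides by the third coordinate, M is only defined up to scale, which is why one entry can be fixed to 1.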
Camera matrix calibration
- Directly estimate 11 unknowns in the M matrix using known 3D points $(X_i, Y_i, Z_i)$ and measured feature positions $(u_i, v_i)$
Camera matrix calibration
- Linear regression:
  - Bring denominator over, solve set of (over-determined) linear equations. How?
  - Least squares (pseudo-inverse)
  - Is this good enough?
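The linear regression step can be sketched as follows, assuming the lower right-hand entry of M is fixed to 1 so that 11 unknowns remain. Multiplying each projection equation by its denominator gives two linear equations per point. The function name `calibrate_camera_matrix` is hypothetical.

```python
import numpy as np

def calibrate_camera_matrix(points_3d, points_2d):
    """Least-squares estimate of the 3x4 camera matrix with M[2, 3] = 1.

    Needs at least 6 non-coplanar 3D points with their measured pixels.
    """
    A, b = [], []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        # u * (m20 X + m21 Y + m22 Z + 1) = m00 X + m01 Y + m02 Z + m03
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
        # v * (m20 X + m21 Y + m22 Z + 1) = m10 X + m11 Y + m12 Z + m13
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
        b.extend([u, v])
    m, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(m, 1.0).reshape(3, 4)
```

This minimizes an algebraic error rather than the image-plane error, which is one reason the result is usually refined with non-linear optimization.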
Camera matrix calibration
- Advantages:
  - very simple to formulate and solve
  - can recover K [R | t] from M using QR decomposition [Golub & Van Loan 96]
- Disadvantages:
  - doesn't compute internal parameters
  - can give garbage results
  - more unknowns than true degrees of freedom
  - need a separate camera matrix for each new view
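Recovering K, R and t from an estimated M can be sketched as below. The factorization needed is the RQ variant (upper-triangular factor first), which this example builds from NumPy's QR routine by reversing rows. The function name `decompose_camera_matrix` is an assumption, and the sketch assumes the left 3x3 block of M is invertible.

```python
import numpy as np

def decompose_camera_matrix(M):
    """Split M ~ K [R | t] into upper-triangular K, rotation R and translation t."""
    M = np.asarray(M, float)
    A = M[:, :3]
    P = np.fliplr(np.eye(3))                 # row/column reversal matrix
    Q_, R_ = np.linalg.qr((P @ A).T)         # QR of the flipped, transposed block
    K = P @ R_.T @ P                         # upper triangular
    R = P @ Q_.T                             # orthogonal, and A = K @ R
    # Make the diagonal of K positive (focal lengths > 0).
    D = np.diag(np.sign(np.diag(K)))
    K, R = K @ D, D @ R
    t = np.linalg.solve(K, M[:, 3])
    # M is only known up to scale, including sign: enforce det(R) = +1.
    if np.linalg.det(R) < 0:
        R, t = -R, -t
    return K / K[2, 2], R, t
```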
Multi-plane calibration
- Use several images of planar target held at unknown orientations [Zhang 99]
  - Compute plane homographies
  - Solve for $K^{-T} K^{-1}$ from the $H_k$'s
    - 1 plane if only f unknown
    - 2 planes if $(f, u_c, v_c)$ unknown
    - 3+ planes for full K
  - Code available from Zhang and OpenCV
Pose estimation and triangulation
Pose estimation
- Use inter-point distance constraints [Quan 99][Ameller 00]
- Solve set of polynomial equations in $x_i^{2p}$
- Recover R, t using Procrustes analysis.
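The final Procrustes step can be sketched as follows, assuming the distances along the viewing rays have already been recovered so that each point is known both in world coordinates and in camera coordinates. The function name `procrustes_rigid` is hypothetical; the SVD-based solution shown is the standard orthogonal Procrustes (Kabsch) construction, not necessarily the exact variant used in the cited papers.

```python
import numpy as np

def procrustes_rigid(X_world, X_cam):
    """Find R, t minimizing sum ||R X_world_i + t - X_cam_i||^2."""
    X_world = np.asarray(X_world, float)
    X_cam = np.asarray(X_cam, float)
    cw, cc = X_world.mean(axis=0), X_cam.mean(axis=0)   # centroids
    H = (X_world - cw).T @ (X_cam - cc)                 # 3x3 correlation matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cc - R @ cw
    return R, t
```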
Triangulation
- Problem: Given some points in correspondence across two or more images (taken from calibrated cameras), $\{(u_j, v_j)\}$, compute the 3D location X
Triangulation
- Method I: intersect viewing rays in 3D, minimize:
  - X is the unknown 3D point
  - $C_j$ is the optical center of camera j
  - $V_j$ is the viewing ray for pixel $(u_j, v_j)$
  - $s_j$ is unknown distance along $V_j$
  - Advantage: geometrically intuitive
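The objective for Method I did not survive on this slide. One reading consistent with the listed symbols is $\sum_j \| C_j + s_j V_j - X \|^2$, minimized over X and the $s_j$. Under that assumption, eliminating each $s_j$ leaves a 3x3 linear system in X, sketched below with the hypothetical name `triangulate_rays`.

```python
import numpy as np

def triangulate_rays(centers, directions):
    """Point closest (in least squares) to a set of 3D viewing rays.

    centers: optical centers C_j; directions: ray directions V_j.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for C, V in zip(centers, directions):
        V = np.asarray(V, float)
        V = V / np.linalg.norm(V)
        # Projector onto the plane perpendicular to the ray: it measures
        # the component of (X - C_j) that s_j cannot absorb.
        P = np.eye(3) - np.outer(V, V)
        A += P
        b += P @ np.asarray(C, float)
    return np.linalg.solve(A, b)
```

At least two non-parallel rays are needed; otherwise the 3x3 system is singular.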
Triangulation
- Method II: solve linear equations in X
  - advantage: very simple
- Method III: non-linear minimization
  - advantage: most accurate (image plane error)
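Method II can be sketched as a homogeneous linear system: each observation (u, v) in a camera with matrix M gives two equations that are linear in the homogeneous point X. The function name `triangulate_linear` is an assumption for this example, and the SVD is one common way to solve the system.

```python
import numpy as np

def triangulate_linear(cameras, pixels):
    """Linear triangulation from 3x4 camera matrices and matching pixels."""
    rows = []
    for M, (u, v) in zip(cameras, pixels):
        M = np.asarray(M, float)
        rows.append(u * M[2] - M[0])     # u * (m3 . X) - (m1 . X) = 0
        rows.append(v * M[2] - M[1])     # v * (m3 . X) - (m2 . X) = 0
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.array(rows))
    X = Vt[-1]
    return X[:3] / X[3]
```

The result is a natural starting point for Method III, which would refine X by minimizing the image-plane error directly.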
Structure from Motion
Structure from motion
- Given many points in correspondence across several images, $\{(u_{ij}, v_{ij})\}$, simultaneously compute the 3D location $x_i$ and camera (or motion) parameters $(K, R_j, t_j)$
- Two main variants: calibrated, and uncalibrated (sometimes associated with Euclidean and projective reconstructions)
Orthographic SFM
- [Tomasi & Kanade, IJCV 92]
Extensions
- Paraperspective
  - [Poelman & Kanade, PAMI 97]
- Sequential Factorization
  - [Morita & Kanade, PAMI 97]
- Factorization under perspective
  - [Christy & Horaud, PAMI 96]
  - [Sturm & Triggs, ECCV 96]
- Factorization with Uncertainty
  - [Anandan & Irani, IJCV 2002]
SfM objective function
- Given point x and rotation and translation R, t
- Minimize sum of squared reprojection errors:
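The formulas on this slide did not survive, so the sketch below states one plausible form of the objective: each point $x_i$ is mapped into camera j as $K (R_j x_i + t_j)$, divided by its third coordinate, and compared with the observed $(u_{ij}, v_{ij})$. The function name and the dictionary layout of `observations` are assumptions made for this example.

```python
import numpy as np

def reprojection_error(K, cameras, points, observations):
    """Sum of squared reprojection errors.

    cameras: list of (R_j, t_j); points: list of 3D points x_i;
    observations: dict mapping (i, j) -> observed pixel (u_ij, v_ij).
    """
    total = 0.0
    for (i, j), uv in observations.items():
        R, t = cameras[j]
        x = K @ (R @ np.asarray(points[i], float) + t)   # homogeneous pixel
        predicted = x[:2] / x[2]                         # perspective division
        total += float(np.sum((predicted - np.asarray(uv, float)) ** 2))
    return total
```

Only the (i, j) pairs that were actually observed contribute, which is what makes the problem sparse when many cameras see different subsets of points.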
Scene reconstruction
Feature detection
- Detect features using SIFT [Lowe, IJCV 2004]
Feature matching
- Match features between each pair of images
Reconstruction
1. Choose two/three views to seed the reconstruction.
2. Add 3D points via triangulation.
3. Add cameras using pose estimation.
4. Bundle adjustment.
5. Go to step 2.
Two-view structure from motion
- Simpler case: can consider motion independent of structure
- Let’s first consider the case where K is known
  - Each image point $(u_{i,j}, v_{i,j}, 1)$ can be multiplied by $K^{-1}$ to form a 3D ray
  - We call this the calibrated case
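A short sketch of the calibrated-case conversion from a pixel to a viewing ray. The function name `pixel_to_ray` is chosen for this example, and normalizing the ray to unit length is an optional convenience rather than something the slide requires.

```python
import numpy as np

def pixel_to_ray(K, u, v):
    """Direction of the 3D ray through pixel (u, v), in camera coordinates."""
    # Solve K * ray = (u, v, 1) instead of forming the inverse explicitly.
    ray = np.linalg.solve(np.asarray(K, float), np.array([u, v, 1.0]))
    return ray / np.linalg.norm(ray)
```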
Notes on two-view geometry
- How can we express the epipolar constraint?
- Answer: there is a 3x3 matrix E such that
  - ${p'}^{T} E p = 0$
- E is called the essential matrix
Properties of the essential matrix
- ${p'}^{T} E p = 0$
- Ep is the epipolar line associated with p
- e and e' are called epipoles: $E e = 0$ and $E^{T} e' = 0$
- E can be solved for with 5 point matches
  - see Nister, An efficient solution to the five-point relative pose problem. PAMI 2004.
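The properties above can be checked numerically with a small sketch. It assumes the second camera's coordinates are related to the first by $X' = R X + t$, in which case $E = [t]_\times R$ satisfies the epipolar constraint for calibrated rays p and p'. The helper names `skew` and `essential_from_pose` are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ a == np.cross(t, a)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_from_pose(R, t):
    """Essential matrix for the relative pose X' = R X + t."""
    return skew(np.asarray(t, float)) @ np.asarray(R, float)
```

Under this convention the epipoles are $e = R^{T} t$ and $e' = t$ (up to scale), since $[t]_\times t = 0$.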
The Fundamental matrix
- If K is not known, then we use a related matrix called the Fundamental matrix, F
  - Called the uncalibrated case
- F can be solved for linearly with eight points, or non-linearly with six or seven points
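The linear eight-point solution can be sketched as below. The coordinate normalization step follows common practice (Hartley's normalization) and is an addition for numerical stability, not something stated on the slide; the function names are hypothetical.

```python
import numpy as np

def _normalize(pts):
    """Shift to zero mean and scale so the mean distance to the origin is sqrt(2)."""
    pts = np.asarray(pts, float)
    mean = pts.mean(axis=0)
    scale = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def fundamental_eight_point(pts1, pts2):
    """Estimate F with p2^T F p1 = 0 from 8 or more pixel correspondences."""
    x1, T1 = _normalize(pts1)
    x2, T2 = _normalize(pts2)
    # One row per match: coefficients of the 9 entries of F (row-major).
    A = np.column_stack([x2[:, [0]] * x1, x2[:, [1]] * x1, x1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    F = T2.T @ F @ T1                       # undo the normalization
    return F / np.linalg.norm(F)
```

With noisy matches this linear estimate would normally be wrapped in a robust loop such as RANSAC and then refined non-linearly.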
Photo Tourism overview