Project 2: Panoramic Mosaic Stitching
- Assigned: Wednesday, April 17, 2013
- Due: Tuesday, April 30, 2013 (11:59pm)
In this project, you will use the feature detection and
matching from the first project to combine a series of
photographs into a 360° panorama. Your software will
automatically align the photographs (determine their overlap
and relative positions) and then blend the resulting photos
into a single seamless panorama. You will then be able to view
the resulting panorama inside an interactive Web viewer.
You can download the complete project package.
The package consists of the following:
- PanoramaSkel: The skeleton code for the project.
You can use Visual Studio (Windows) or the Makefile (Linux) to
compile it. (NOTE: In Visual Studio, please build the
project in Release mode.)
- PanoramaSample.exe: Sample executable for Windows
- PanoramaSample32: Sample executable for 32-bit Linux
- PanoramaSample64: Sample executable for 64-bit Linux
- lpjpano: Web-based viewer for your panoramas.
Please read the README file for instructions on using this.
- test_set: Some sample images to test the system on.
It also includes the intermediate files generated in the process
of stitching the images. Please see the next section for
steps to create a panorama.
Taking the Pictures
Besides working with the sample test_set
provided in the
package, you also need to go out and make your own panoramas!
Each student will be checking out a panorama kit (camera,
tripod, and Kaidan head). You can use the reservations page
to make your reservation. You can reserve the kit for only
one day at a time, except over weekends. Please
find more instructions on the reservations page.
Take a series of photos with a digital camera mounted on a
tripod. Please read this web page explaining how to use the equipment
before you go out to shoot. As shown on this page, the camera
MUST be right side up and should be zoomed out all the way.
The resolution should be set to capture 640x480 photos. You
can change that setting by following these steps:
- Turn the mode dial on the back of the camera to one of the
3 shooting modes--auto (camera icon), manual (camera icon +
M) or stitch assist (overlaid rectangles).
- Press MENU button.
- Press the left/right arrow to choose Resolution, then
- Press the left/right arrow and choose S (640x480).
- Press MENU again.
Since the camera is right side up, you will need to rotate
the 640x480 images later to make them upright. Hence the final
size of all your images will be 480x640. For this you can use
any image manipulation software. We recommend using Irfanview or Gimp.
For best results, overlap each image by 50% with the previous
one, and keep the camera level using the levelers on the
Kaidan head. You will also need to take a series of images
with the camera held in your hand instead of the tripod.
Again, try to overlap each image by 50% with the previous one.
The following focal lengths are valid only if the camera is
zoomed out all the way:
|Canon Powershot A10, tag CS30012716
|Canon Powershot A10, tag CS30012717
|Canon Powershot A10, tag CS30012718
|Canon Powershot A10, tag CS30012927
|Canon Powershot A10, tag CS30012928
|Canon Powershot A10, tag CS30012929
If you are using your own camera, you have to estimate the
focal length and distortion parameters. The simplest way to do
this is through the EXIF tags of the images, as
described by Noah Snavely (a previous TA).
Alternatively, you can use a camera
calibration toolkit to get more precise focal length and
radial distortion coefficients. Finally, Brett Allen describes
one creative way to measure rough focal length using just
a book and a box.
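The EXIF route usually comes down to one unit conversion, sketched below. This is the common pinhole-camera conversion, not necessarily the exact procedure on the linked page. The function name and parameters are hypothetical, and it assumes you know the physical sensor width of your camera, which normally comes from the manufacturer's specifications rather than from the EXIF data itself.

```cpp
#include <cassert>

// Convert a focal length in millimetres (e.g. from the EXIF FocalLength tag)
// to a focal length in pixels.
// sensorWidthMm and imageWidthPx must refer to the same image axis, and
// imageWidthPx must be the width of the image you actually stitch
// (i.e. after any resizing).
double focalLengthInPixels(double focalMm, double sensorWidthMm, int imageWidthPx)
{
    assert(focalMm > 0.0 && sensorWidthMm > 0.0 && imageWidthPx > 0);
    return focalMm / sensorWidthMm * imageWidthPx;
}
```

Keep in mind that this gives only a rough focal length and says nothing about the radial distortion coefficients.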
- [IMPORTANT] Your
images need to be in .TGA format and have a 4:3 (or 3:4)
aspect ratio in order to be compatible with the project
- [IMPORTANT] Your
output panoramas need to be in .JPG format in order to be
compatible with the Java-based panorama viewer (described
below).
- Your input images should be kept reasonably small, e.g.
480x640. The computation time for larger images may be
- You can convert or resize images using tools such as Gimp or IrfanView.
Running the code to create the panorama
You will need two executables: Panorama (from this
project) and Features (from Project 1). Open the console
(using the cmd command from start menu in Windows or the
standard console in Linux) and navigate to the folder with the
images that you want to stitch. The instructions in this section
assume that both the above executables are in the same folder as
the images. If that is not the case, just call the executable
from the appropriate location.
- Remove radial distortion and warp all images to a
spherical coordinate system: To remove the radial
distortion and warp an image input.tga into
spherical coordinates with focal length = 600 and radial
distortion coefficients k1=-0.21 and k2=0.25:
Panorama sphrWarp input.tga warp.tga 600 -0.21 0.25
warp.tga is the name of the output image. Generate
warped images for all the input images. The values of focal
length and the distortion parameters for the test_set images
and for the course cameras are provided in the table above.
- Compute features in the warped images: Use the Features
executable to do this as in the first project:
Features computeFeatures warp.tga warp.f [featuretype]
We encourage you to use your own features from Project 1 for
this step. However, you may also choose to use the
state-of-the-art SIFT features. Here is the package to
compute SIFT features (linked from David
Lowe's page) - <Link
to SIFT package>. The SIFT package has a README
file with clear instructions for generating SIFT
features for a given image and for visualizing them using
Linux/Windows/MATLAB. If you want to play more with SIFT
features, the README file describes more ways to visualize
how SIFT features can be used to match images.
Here is the gist of how to generate .key files (SIFT feature
files for images) for this project:
- Convert the image into .pgm format using a standard
tool like IrfanView.
- Run: sift <input.pgm >output.key
Here sift is the appropriate executable for Windows
(siftWin32.exe) or Linux (sift) provided in the package.
- Match features between every pair of adjacent images:
For example, to match the features in warp1.f and warp2.f:
Features matchFeatures warp1.f warp2.f 0.8
- Align every pair of adjacent images using RANSAC:
For example, to align images warp1.tga and warp2.tga:
Panorama alignPair warp1.f warp2.f match-01-02.txt 200 1
where the match file was produced in the last step. 200 and
1 are the parameters for RANSAC: the number of RANSAC
iterations and the RANSAC distance threshold, respectively. This
step will output two numbers on the screen corresponding to
the resulting translation for alignment.
Note that you can also use SIFT features to do the
alignment, which can be useful for testing this component.
To do so, add the word sift to the end of the command, as
follows:
Panorama alignPair warp1.key warp2.key match-01-02.txt 200 1 sift
Sample SIFT features and matches have been provided to you
in the test_set folder. Run the previous step for
all adjacent pairs of images and save the output into a
separate file pairlist.txt, which may look like this:
warp1.tga warp2.tga 213.49 -5.12
warp2.tga warp3.tga 208.19 2.82
...
warp9.tga warp1.tga 194.76 -3.88
The last two numbers in each line are the numbers from the
output of running Panorama alignPair on those
images.
- Blending all images: Finally stitch the images into
the final panorama pano.tga:
Panorama blendPairs pairlist.txt pano.tga blendWidth
These steps for the sample images in the test_set
are also provided in stitch2.txt (for stitching only two
images) and in stitch4.txt (for stitching four images).
Visualizing the panorama with a web-viewer
We provide a Java-based web viewer for your panoramas. It is
fairly straightforward to use, and the instructions can be
found in the README file of the lpjpano
folder in the project
package. You will be required to include this in your
deliverable for the project.
Note: The skeleton code includes the same image library that
you used in the first project.
Warp each image into spherical coordinates. (file: WarpSpherical.cpp,
routine: warpSphericalField)
[TODO] Compute the inverse map to warp the image
by filling in the skeleton code in the warpSphericalField
routine:
- Convert the given spherical image coordinate into the
corresponding planar image coordinate using the
coordinate transformation equation from the lecture.
- Apply radial distortion using the equation from the
lecture.
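The two steps of the inverse map can be sketched as follows. The lecture equations are not reproduced on this page, so this sketch assumes one commonly used convention: angles are measured from the image center, the ray is projected onto the plane z = 1, and distortion is modeled as 1 + k1 r^2 + k2 r^4. The struct and function names are hypothetical and are not part of the skeleton; check signs and the choice of center against the lecture notes before relying on it.

```cpp
#include <cassert>
#include <cmath>

struct PlanarCoord { double x, y; };  // hypothetical helper type

// Map pixel (xs, ys) of the spherical (output) image to the matching
// position in the original planar (input) image.
PlanarCoord sphericalToPlanar(double xs, double ys,
                              double width, double height,
                              double f, double k1, double k2)
{
    assert(f > 0.0);
    const double xc = 0.5 * width;   // assumed principal point: image center
    const double yc = 0.5 * height;

    // Spherical angles of this pixel (theta: longitude, phi: latitude).
    const double theta = (xs - xc) / f;
    const double phi   = (ys - yc) / f;

    // Corresponding point on the unit sphere.
    const double xh = std::sin(theta) * std::cos(phi);
    const double yh = std::sin(phi);
    const double zh = std::cos(theta) * std::cos(phi);

    // Project onto the plane z = 1 (normalized image coordinates).
    const double xn = xh / zh;
    const double yn = yh / zh;

    // Apply radial distortion in normalized coordinates.
    const double r2 = xn * xn + yn * yn;
    const double scale = 1.0 + k1 * r2 + k2 * r2 * r2;

    // Back to pixel coordinates of the planar image.
    return { f * xn * scale + xc, f * yn * scale + yc };
}
```

Because this is an inverse map, you would evaluate it once per output pixel and then sample the input image at the returned position.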
Compute the alignment of two images. (file: FeatureAlign.cpp,
routines: alignPair, countInliers, and leastSquaresFit)
To do this, you will have to implement a feature-based
translational motion estimation. The skeleton for this
code is provided in FeatureAlign.cpp. The main
routines that you will be implementing are:
int alignPair(const FeatureSet &f1, const FeatureSet &f2,
    const vector<FeatureMatch> &matches, MotionModel m, float f,
    int nRANSAC, double RANSACthresh, CTransform3x3& M);
int countInliers(const FeatureSet &f1, const FeatureSet &f2,
    const vector<FeatureMatch> &matches, MotionModel m, float f,
    CTransform3x3 M, double RANSACthresh, vector<int> &inliers);
int leastSquaresFit(const FeatureSet &f1, const FeatureSet &f2,
    const vector<FeatureMatch> &matches, MotionModel m, float f,
    const vector<int> &inliers, CTransform3x3& M);
alignPair takes two feature sets, f1 and f2,
the list of feature matches obtained from the feature
detection and matching (from the first project), and a motion
model (described below), and estimates an inter-image
transform matrix M. For this project, the enum MotionModel
only takes on the value eTranslate.
alignPair uses RANSAC (RANdom SAmple Consensus)
to pull out a minimal set of feature matches (one match
for this project), estimates the corresponding motion
(alignment), and then invokes countInliers to count
how many of the feature matches agree with the current
motion estimate. After repeated trials, the inliers of the
estimate with the largest number of inliers are used to
compute a least squares estimate for the motion, which is
then returned in M.
countInliers computes the number of matches that
have a distance below RANSACthresh. It
also returns a list of inlier match ids.
leastSquaresFit computes a least squares estimate
for the translation using all of the matches previously
estimated as inliers. It returns the resulting translation
estimate in the last column of M.
[TODO] You will have to fill in the missing code
in alignPair to:
- Randomly select a valid matching pair and compute the
translation between the two feature locations.
- Call countInliers to count how many matches
agree with this estimate.
- Repeat the above random selection nRANSAC
times and keep the estimate with the largest number of
inliers.
- Write the body of countInliers to count the
number of feature matches where the SSD distance after
applying the estimated transform (i.e. the distance from
the match to its correct position in the image) is below
the threshold. Don't forget to create the list of inlier
ids.
- Write the body of leastSquaresFit, which for
the simple translational case is just the average
displacement between the matching feature positions.
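The overall flow of these routines can be sketched in a self-contained form. This is only an illustration: it uses simplified, hypothetical types (Point, Match, Translation) instead of the skeleton's FeatureSet, FeatureMatch, and CTransform3x3, it assumes 0-based match indices, and it treats the threshold as a Euclidean distance in pixels. Your implementation has to follow the conventions of the skeleton code instead.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

struct Point { double x, y; };          // hypothetical feature position
struct Match { int id1, id2; };         // hypothetical 0-based indices into f1 and f2
struct Translation { double dx, dy; };  // maps f1 positions onto f2 positions

// Indices of the matches whose translated f1 point lies within thresh of its f2 point.
std::vector<int> findInliers(const std::vector<Point>& f1, const std::vector<Point>& f2,
                             const std::vector<Match>& matches,
                             Translation t, double thresh)
{
    std::vector<int> inliers;
    for (int i = 0; i < static_cast<int>(matches.size()); ++i) {
        const Point& p = f1[matches[i].id1];
        const Point& q = f2[matches[i].id2];
        const double ex = p.x + t.dx - q.x;
        const double ey = p.y + t.dy - q.y;
        if (ex * ex + ey * ey < thresh * thresh) inliers.push_back(i);
    }
    return inliers;
}

// Least squares fit for a pure translation: the mean displacement of the inliers.
Translation averageDisplacement(const std::vector<Point>& f1, const std::vector<Point>& f2,
                                const std::vector<Match>& matches,
                                const std::vector<int>& inliers)
{
    assert(!inliers.empty());
    Translation t{0.0, 0.0};
    for (int i : inliers) {
        t.dx += f2[matches[i].id2].x - f1[matches[i].id1].x;
        t.dy += f2[matches[i].id2].y - f1[matches[i].id1].y;
    }
    const double n = static_cast<double>(inliers.size());
    t.dx /= n;
    t.dy /= n;
    return t;
}

// RANSAC with a one-match minimal sample, followed by a refit on the best inlier set.
Translation ransacTranslation(const std::vector<Point>& f1, const std::vector<Point>& f2,
                              const std::vector<Match>& matches,
                              int nRANSAC, double thresh)
{
    assert(!matches.empty() && nRANSAC > 0 && thresh > 0.0);
    std::vector<int> best;
    for (int trial = 0; trial < nRANSAC; ++trial) {
        const Match& m = matches[std::rand() % matches.size()];
        const Translation t{ f2[m.id2].x - f1[m.id1].x, f2[m.id2].y - f1[m.id1].y };
        std::vector<int> inliers = findInliers(f1, f2, matches, t, thresh);
        if (inliers.size() > best.size()) best = inliers;
    }
    // The sampled match always agrees with its own estimate, so best is non-empty here.
    return averageDisplacement(f1, f2, matches, best);
}
```

Refitting on all inliers of the best trial, rather than returning the single-match estimate, is what gives the sub-pixel translation.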
Stitch and crop the resulting aligned images. (file: BlendImages.cpp,
routines: BlendImages, AccumulateBlend, NormalizeBlend)
[TODO] Given the warped images and their relative
displacements, figure out how large the final stitched
image will be and the absolute displacements of the images in the
panorama.
[TODO] Then, resample each image to its final
location and blend it with its neighbors (AccumulateBlend,
NormalizeBlend). Try a simple feathering function
as your weighting function (see the mosaics lecture slide on
"feathering"); this is a simple 1-D version of the
distance map described in Szeliski
& Shum '97. For extra credit, you can try other
blending functions or figure out some way to compensate
for exposure differences. In NormalizeBlend, remember to
set the alpha channel of the resultant panorama to opaque!
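The accumulate-then-normalize idea can be sketched on a single row of single-channel pixels. This is a minimal illustration, not the skeleton's AccumulateBlend or NormalizeBlend: the Accum struct and all function names are hypothetical, offsets are integers, and the alpha channel is not modeled. The weight used here ramps linearly from 1/blendWidth at the left and right borders up to 1 in the interior, which is one simple choice of 1-D feathering function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Feathering weight for column x of an image that is width pixels wide.
double featherWeight(int x, int width, int blendWidth)
{
    assert(width > 0 && x >= 0 && x < width);
    if (blendWidth <= 0) return 1.0;
    const int distToEdge = std::min(x, width - 1 - x);  // 0 at either border
    return std::min(1.0, (distToEdge + 1) / static_cast<double>(blendWidth));
}

struct Accum { double value = 0.0; double weight = 0.0; };  // hypothetical accumulator pixel

// Add one image row into the accumulator, shifted by offsetX pixels.
void accumulateRow(const std::vector<double>& row, int offsetX, int blendWidth,
                   std::vector<Accum>& acc)
{
    const int width = static_cast<int>(row.size());
    for (int x = 0; x < width; ++x) {
        const int ax = x + offsetX;
        if (ax < 0 || ax >= static_cast<int>(acc.size())) continue;
        const double w = featherWeight(x, width, blendWidth);
        acc[ax].value  += w * row[x];
        acc[ax].weight += w;
    }
}

// Divide each accumulated value by its total weight; uncovered pixels stay 0.
std::vector<double> normalizeRow(const std::vector<Accum>& acc)
{
    std::vector<double> out(acc.size(), 0.0);
    for (std::size_t i = 0; i < acc.size(); ++i)
        if (acc[i].weight > 0.0) out[i] = acc[i].value / acc[i].weight;
    return out;
}
```

Where two images overlap, the result is a weighted average that favors whichever image has the pixel further from its border, so the seam fades gradually instead of showing a hard edge.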
[TODO] Crop the resulting image so that the left
and right edges join seamlessly (BlendImages). The
horizontal extent can be computed in the previous blending
routine since the first image occurs at both the left and
right end of the stitched sequence (draw the "cut" line
halfway through this image). Apply a linear warp to the
mosaic to remove any vertical "drift" between the first
and last image. This warp, of the form y' = y + ax,
should transform the y coordinates
of the mosaic such that the first image has the same y-coordinate
on both the left and right end. Calculate the value of a
needed to perform this transformation.
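One way to reason about a: if the first image sits at (xFirst, yFirst) at the left end of the mosaic and reappears at (xLast, yLast) at the right end, then requiring yFirst + a*xFirst = yLast + a*xLast gives a = (yFirst - yLast) / (xLast - xFirst). The sketch below encodes just that; the function and parameter names are hypothetical, and it assumes x and y are measured in mosaic pixel coordinates.

```cpp
#include <cassert>

// Shear coefficient a for the warp y' = y + a*x that gives both copies of
// the first image the same y' coordinate.
double driftCoefficient(double xFirst, double yFirst, double xLast, double yLast)
{
    assert(xLast != xFirst);
    return (yFirst - yLast) / (xLast - xFirst);
}
```

For instance, if the second copy ends up 20 pixels lower after 2000 pixels of horizontal travel, a is -0.01, so the warp pulls the mosaic up by one pixel for every 100 pixels of x.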
You can use the test results included in the test_set
folder to check whether your program is running correctly.
Comparing your output to that of the sample solution is also a
good way of debugging your program.
What to turn-in
Please organize your submission in the following folder
[This is the top-level
[Place the source code in this
<Your_Name> => Executable
<Your_Name> => Artifact => index.html
[Writeup about the
project; see below for details of what to put here.]
<Your_Name> => Artifact => images/
[Place all your images used in the webpage here.]
<Your_Name> => Artifact => voting.jpg
[One of your panorama images that you want to
submit for class voting.]
If you are unfamiliar with HTML, you can use a
simple webpage editor like NVU or KompoZer to
make your web page. Here are some tips.
In the artifact webpage, please include:
- A short description of what worked well and
what didn't. If you tried several variants or
did something non-standard, please describe
this as well.
- Describe any extra credit with supporting
- Include at least three panoramas:
  - The test_set sequence
  - Captured using the Kaidan head
  - Captured by holding the camera in hand
- Each panorama should be shown as:
  - A low-res inlined image on the web page.
  - A link that you can click on to show
  the full-resolution .jpg file.
  - Embedded in a web viewer as described above.
How to Turn In
Create a zip archive of your submission folder,
<Your_Name>.zip, and place it in the Catalyst
submission dropbox before
April 30, 11:59pm.
Here is a list of suggestions for extending the
program for extra credit. You are encouraged to
come up with your own extensions. We're always
interested in seeing new, unanticipated ways to
use this program!
- Although the feature-based aligner gives
sub-pixel motion estimation (because of least
squares), the motion vectors are rounded to
integers when blending the images into the
mosaic in BlendImages.cpp. Try to blend images
with sub-pixel localization.
- Sometimes there are exposure differences
between images, which result in brightness
fluctuations in the final mosaic. Try to get
rid of this artifact.
- Try shooting a sequence with some objects
moving. What did you do to remove "ghosted"
versions of the objects?
- Try a sequence in which the same person
appears multiple times, as in this example.
- Implement a better blending technique, e.g.,
imaging blending, or graph