Computer Vision

CSE 576, Spring 2013

Project 2:  Panoramic Mosaic Stitching




In this project, you will use the feature detection and matching from the first project to combine a series of photographs into a 360° panorama. Your software will automatically align the photographs (determine their overlap and relative positions) and then blend the resulting photos into a single seamless panorama. You will then be able to view the resulting panorama inside an interactive Web viewer.

Project package

You can download the complete project package here. The package consists of the following components -

  1. PanoramaSkel: The skeleton code for the project. You can use Visual Studio (Windows) or a Makefile (Linux) to compile it. (NOTE: In Visual Studio, please build the project in Release mode.)
  2. PanoramaSample.exe: Sample executable for Windows platform.
  3. PanoramaSample32: Sample executable for 32 bit Linux platform.
  4. PanoramaSample64: Sample executable for 64 bit Linux platform.
  5. lpjpano: Web-based viewer for your panoramas. Please read the README file for instructions on using this.
  6. test_set: Some sample images to test the system on. It also includes the intermediate files generated in the process of stitching the images. Please see the next section for the steps to create a panorama.

Taking the Pictures

Besides working with the sample test_set provided in the package, you will also need to go out and make your own panoramas! Each student will be checking out a panorama kit (camera, tripod, and Kaidan head). You can use this webpage to make your reservation. You can only reserve the kit for one day at a time, except on weekends. Please find more instructions on the reservations page.

Taking photos

Take a series of photos with a digital camera mounted on a tripod.

[IMPORTANT]  Please read this web page explaining how to use the equipment before you go out to shoot. As shown on this page, the camera MUST be right side up and should be zoomed out all the way. The resolution should be set to capture 640x480 photos. You can change that setting by following these steps -

  1. Turn the mode dial on the back of the camera to one of the 3 shooting modes: auto (camera icon), manual (camera icon + M), or stitch assist (overlaid rectangles).
  2. Press MENU button.
  3. Press the left/right arrow to choose Resolution, then press SET.
  4. Press the left/right arrow and choose S (640x480).
  5. Press MENU again.

Because of how the camera is mounted, you will need to rotate the 640x480 images by 90° afterwards to make them upright, so the final size of all your images will be 480x640. You can use any image manipulation software for this; we recommend IrfanView or GIMP.
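If you prefer a command-line tool, ImageMagick (not included in the package) can also do this rotation, e.g.:

    convert input.jpg -rotate 90 output.jpg

(The correct direction, -rotate 90 or -rotate 270, depends on which way the camera was mounted; check one image first.)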

For best results, overlap each image by 50% with the previous one, and keep the camera level using the levelers on the Kaidan head. You will also need to take a series of images with the camera held in your hand instead of the tripod. Again, try to overlap each image by 50% with the previous one.

Camera Parameters

The following focal lengths are valid only if the camera is zoomed out all the way:

Camera                                Resolution   Focal length       k1         k2
test_set images                       384x512      595.00000 pixels   -0.15000   0.00000
Canon Powershot A10, tag CS30012716   480x640      678.21239 pixels   -0.21001   0.26169
Canon Powershot A10, tag CS30012717   480x640      677.50487 pixels   -0.20406   0.23276
Canon Powershot A10, tag CS30012718   480x640      676.48417 pixels   -0.20845   0.25624
Canon Powershot A10, tag CS30012927   480x640      671.16649 pixels   -0.19270   0.30098
Canon Powershot A10, tag CS30012928   480x640      674.82258 pixels   -0.21528   0.30098
Canon Powershot A10, tag CS30012929   480x640      674.79106 pixels   -0.21483   0.32286

If you are using your own camera, you have to estimate the focal length and distortion parameters. The simplest way to do this is through the EXIF tags of the images, as described by Noah Snavely (a previous TA). Alternatively, you can use a camera calibration toolkit to get more precise focal length and radial distortion coefficients. Finally, Brett Allen describes one creative way to measure rough focal length using just a book and a box.
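As a rough rule of thumb for the EXIF route (a back-of-the-envelope conversion, not from the handout): EXIF reports the focal length in millimeters, so convert it to pixels by scaling with the ratio of image width to sensor width:

    focal length (pixels) = focal length (mm) x image width (pixels) / sensor width (mm)

For example, with hypothetical values of a 5.4 mm focal length, a 5.3 mm wide sensor, and 640-pixel-wide images, this gives 5.4 x 640 / 5.3 ≈ 652 pixels.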


Running the code to create the panorama

You will need two executables - Panorama (from this project) and Features (from project 1). Open the console (using the cmd command from the Start menu in Windows, or the standard console in Linux) and navigate to the folder with the images that you want to stitch. The instructions in this section assume that both executables are in the same folder as the images. If that is not the case, just call the executables from their appropriate locations.

  1. Remove radial distortion and warp all images to spherical coordinates: To remove the radial distortion and warp an image input.tga into spherical coordinates with focal length = 600 and radial distortion coefficients k1 = -0.21 and k2 = 0.25:

    Panorama sphrWarp input.tga warp.tga 600 -0.21 0.25

    warp.tga is the name of the output image. Generate warped images for all the input images. The values of focal length and the distortion parameters for the test_set images and for the course cameras are provided in the table above.
  2. Compute features in the warped images: Use the Features executable to do this as in the first project.

    Features computeFeatures warp.tga warp.f [featuretype]

    We encourage you to use your own features from Project 1 for this step. However, you may also choose to use the state-of-the-art SIFT features. Here is the package to compute SIFT features (linked from David Lowe's page) - <Link to SIFT package>. The SIFT package has a README file with very clear instructions on generating SIFT features for a given image and visualizing them using Linux/Windows/MATLAB. If you want to play more with SIFT features, the README file describes more ways to visualize how SIFT features can be used to match images.

    Here is the gist of how to generate .key files (SIFT feature files for images) for this project:
    • Convert the image into .pgm format using a standard tool like IrfanView.
    • Run: sift <input.pgm >output.key
      where sift is the appropriate executable for Windows (siftWin32.exe) or Linux (sift) provided in the package.


  3. Match features between every pair of adjacent images: For example, to match features of images - warp1.f and warp2.f:

    Features matchFeatures warp1.f warp2.f 0.8 match-01-02.txt 2

  4. Align every pair of adjacent images using RANSAC: For example, to align images - warp1.tga and warp2.tga:

    Panorama alignPair warp1.f warp2.f match-01-02.txt 200 1

    where the match file was produced in the previous step. 200 and 1 are the RANSAC parameters - the number of RANSAC iterations and the RANSAC distance threshold, respectively. This step will output two numbers on the screen corresponding to the resulting translation for the alignment.

    Note that you can also use SIFT features to do the alignment, which can be useful for testing this component. To do so, add the word sift to the end of the command, as in:

    Panorama alignPair warp1.key warp2.key match-01-02.txt 200 1 sift

    Sample SIFT features and matches have been provided to you in the test_set folder. Run the previous step for all adjacent pairs of images and save the output into a separate file pairlist.txt which may look like this:


    warp1.tga warp2.tga 213.49 -5.12
    warp2.tga warp3.tga 208.19 2.82
    ......
    warp9.tga warp1.tga 194.76 -3.88

    The last two numbers in each line are the numbers from the output of running Panorama alignPair on those images.

  5. Blending all images: Finally, stitch the images into the final panorama pano.tga:

    Panorama blendPairs pairlist.txt pano.tga blendWidth

    where blendWidth is the width, in pixels, of the region over which adjacent images are feathered together.

These steps for the sample images in the test_set folder are also provided in stitch2.txt (for stitching only two images) and in stitch4.txt (for stitching four images).

Visualizing the panorama with a web-viewer

We provide a Java-based web viewer for your panoramas. It is fairly straightforward to use, and instructions can be found in the README file of the lpjpano folder in the project package. You will be required to include this in your deliverable for the project.

To Do

Note: The skeleton code includes the same image library that you used in the first project.

  1. Warp each image into spherical coordinates. (file: WarpSpherical.cpp, routine: warpSphericalField)

    [TODO] Compute the inverse map to warp the image by filling in the skeleton code in the warpSphericalField routine to do the following (a sketch appears after this list):

    • Convert the given spherical image coordinate into the corresponding planar image coordinate using the coordinate transformation equation from the lecture notes
    • Apply radial distortion using the equation from the lecture notes
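    For reference, here is a minimal sketch of that inverse map in C++, following the equations from the lecture notes. It assumes the image center (xc, yc) is used as the projection center; the function name and signature are illustrative, not the skeleton's:

      #include <cmath>

      // Maps a destination (spherical) pixel (xt, yt) back to its source
      // (planar) pixel (xp, yp). f is the focal length in pixels, k1/k2 the
      // radial distortion coefficients, (xc, yc) the image center.
      void sphericalToPlanar(double xt, double yt,
                             double f, double k1, double k2,
                             double xc, double yc,
                             double &xp, double &yp)
      {
          // Angles on the sphere corresponding to the destination pixel.
          double theta = (xt - xc) / f;
          double phi   = (yt - yc) / f;

          // Corresponding point on the unit sphere.
          double xhat = sin(theta) * cos(phi);
          double yhat = sin(phi);
          double zhat = cos(theta) * cos(phi);

          // Project onto the z = 1 plane (undistorted normalized coordinates).
          double xn = xhat / zhat;
          double yn = yhat / zhat;

          // Apply radial distortion.
          double r2 = xn * xn + yn * yn;
          double d  = 1.0 + k1 * r2 + k2 * r2 * r2;

          // Convert back to pixel coordinates in the input image.
          xp = f * xn * d + xc;
          yp = f * yn * d + yc;
      }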

  2. Compute the alignment of two images. (file: FeatureAlign.cpp, routines: alignPair, countInliers, and leastSquaresFit)

    To do this, you will have to implement a feature-based translational motion estimation. The skeleton for this code is provided in FeatureAlign.cpp. The main routines that you will be implementing are:

    int alignPair(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, int nRANSAC, double RANSACthresh, CTransform3x3& M);

    int countInliers(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, CTransform3x3 M, double RANSACthresh, vector<int> &inliers);

    int leastSquaresFit(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, const vector<int> &inliers, CTransform3x3& M);

    AlignPair takes two feature sets, f1 and f2, the list of feature matches obtained from the feature detection and matching (from the first project), and a motion model (described below), and estimates an inter-image transform matrix M. For this project, the enum MotionModel only takes on the value eTranslate.

    AlignPair uses RANSAC (RANdom SAmple Consensus) to pull out a minimal set of feature matches (one match for this project), estimates the corresponding motion (alignment), and then invokes countInliers to count how many of the feature matches agree with the current motion estimate. After repeated trials, the motion estimate with the largest number of inliers is used to compute a least squares estimate for the motion, which is then returned in the motion estimate M.

    CountInliers counts the number of matches whose distance, after applying the estimated transform, is below RANSACthresh. It also returns a list of inlier match ids.

    LeastSquaresFit computes a least squares estimate for the translation using all of the matches previously estimated as inliers. It returns the resulting translation estimate in the last column of M.

    [TODO] You will have to fill in the missing code in alignPair to do the following (an outline of the loop appears after this list):

    • Randomly select a valid matching pair and compute the translation between the two feature locations.
    • Call countInliers to count how many matches agree with this estimate.
    • Repeat the above random selection nRANSAC times and keep the estimate with the largest number of inliers.
    • Write the body of countInliers to count the number of feature matches where the SSD distance after applying the estimated transform (i.e. the distance from the match to its correct position in the image) is below the threshold. Don't forget to create the list of inlier ids.
    • Write the body of leastSquaresFit, which for the simple translational case is just the average displacement between the matching feature positions.
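    Here is an illustrative outline of the overall loop. The types come from the skeleton, but several details are assumptions: that a FeatureMatch stores indices id1/id2 into the feature sets (adjust if the ids are 1-based), that features expose x/y, and that CTransform3x3 has a Translation helper (if not, set the translation entries of the matrix directly):

      #include <cstdlib>
      #include <vector>
      using std::vector;

      // Outline only -- field and helper names are assumptions, not the
      // skeleton's actual API.
      int alignPairSketch(const FeatureSet &f1, const FeatureSet &f2,
                          const vector<FeatureMatch> &matches, MotionModel m,
                          float f, int nRANSAC, double RANSACthresh,
                          CTransform3x3 &M)
      {
          int bestCount = 0;
          CTransform3x3 bestM;

          for (int iter = 0; iter < nRANSAC; iter++) {
              // One randomly chosen match fully determines a translation.
              const FeatureMatch &match = matches[rand() % matches.size()];
              double tx = f2[match.id2].x - f1[match.id1].x;
              double ty = f2[match.id2].y - f1[match.id1].y;
              CTransform3x3 T = CTransform3x3::Translation(tx, ty);

              // Count how many matches agree with this estimate.
              vector<int> inliers;
              int count = countInliers(f1, f2, matches, m, f, T,
                                       RANSACthresh, inliers);
              if (count > bestCount) {
                  bestCount = count;
                  bestM = T;
              }
          }

          // Refit on the inliers of the best estimate; for a translation,
          // leastSquaresFit is just the average inlier displacement.
          vector<int> inliers;
          countInliers(f1, f2, matches, m, f, bestM, RANSACthresh, inliers);
          leastSquaresFit(f1, f2, matches, m, f, inliers, M);
          return 0;
      }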

  3. Stitch and crop the resulting aligned images. (file: BlendImages.cpp, routines: BlendImages, AccumulateBlend, NormalizeBlend)

    [TODO] Given the warped images and their relative displacements, figure out how large the final stitched image will be and compute each image's absolute displacement in the panorama (BlendImages). A sketch of this pass follows.
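    Concretely, the bounding box can be found by accumulating the pairwise shifts into absolute positions and tracking the extreme corners. A sketch, assuming all images share the same size w x h (the function name and the container holding the shifts are illustrative):

      #include <algorithm>
      #include <cmath>
      #include <utility>
      #include <vector>

      // 'shifts' holds the pairwise translations from pairlist.txt.
      void computePanoramaSize(const std::vector<std::pair<double, double> > &shifts,
                               int w, int h, int &panoWidth, int &panoHeight)
      {
          double x = 0, y = 0;                      // absolute position of image i
          double minX = 0, minY = 0, maxX = w, maxY = h;
          for (size_t i = 0; i < shifts.size(); i++) {
              x += shifts[i].first;
              y += shifts[i].second;
              minX = std::min(minX, x);  maxX = std::max(maxX, x + w);
              minY = std::min(minY, y);  maxY = std::max(maxY, y + h);
          }
          // Shifting every image by (-minX, -minY) places it on this canvas.
          panoWidth  = (int)std::ceil(maxX - minX);
          panoHeight = (int)std::ceil(maxY - minY);
      }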

    [TODO] Then, resample each image to its final location and blend it with its neighbors (AccumulateBlend, NormalizeBlend). Try a simple feathering function as your weighting function (see the mosaics lecture slide on "feathering"); this is a simple 1-D version of the distance map described in Szeliski & Shum '97. A sketch of such a weight follows. For extra credit, you can try other blending functions or figure out some way to compensate for exposure differences. In NormalizeBlend, remember to set the alpha channel of the resultant panorama to opaque!
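    For instance, a minimal 1-D feathering weight might look like this (the function name is illustrative; blendWidth is the feather width passed on the command line):

      #include <algorithm>

      // Weight for column x of a warped image of width 'width': small at the
      // left/right edges, ramping up to 1 once blendWidth pixels inside.
      double featherWeight(int x, int width, double blendWidth)
      {
          double dist = std::min(x + 0.5, width - (x + 0.5));
          return std::max(0.0, std::min(1.0, dist / blendWidth));
      }

    During accumulation, each pixel contributes weight * color (and the weight itself into the alpha channel); NormalizeBlend then divides the accumulated color by the accumulated weight.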

    [TODO] Crop the resulting image so that the left and right edges seam perfectly (BlendImages). The horizontal extent can be computed in the previous blending routine, since the first image occurs at both the left and right end of the stitched sequence (draw the "cut" line halfway through this image). Apply a linear warp to the mosaic to remove any vertical "drift" between the first and last image. This warp, of the form y' = y + ax, should transform the y coordinates of the mosaic such that the first image has the same y-coordinate on both the left and right end. Calculate the value of a needed to perform this transformation; a concrete reading of this follows.
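    With hypothetical variable names: if the copy of the first image at the left end of the mosaic sits at row yStart, and its copy at the right end (at horizontal offset xEnd) sits at row yEnd, then

      // Shear y' = y + a * x; this choice of a maps both copies of the first
      // image to the same row, cancelling the vertical drift.
      double a = -(yEnd - yStart) / xEnd;

    When resampling, each destination pixel (x, y') reads from source row y = y' - a * x.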

Debugging Guidelines

You can use the test results included in the test_set folder to check whether your program is running correctly. Comparing your output to that of the sample solution is also a good way of debugging your program.

What to turn-in

Please organize your submission in the following folder structure.

<Your_Name>                                                    [This is the top-level folder]
<Your_Name>  => Source                                  [Place the source code in this subfolder]
<Your_Name>  => Executable                            [Windows/Linux executable]
<Your_Name>  => Artifact
<Your_Name>  => Artifact => index.html            [Writeup about the project, see below for details of what to put here.]
<Your_Name>  => Artifact => images/               [Place all your images used in the webpage here.]
<Your_Name>  => Artifact => voting.jpg            [One of your panorama images that you want to submit for class-voting.]

In the artifact webpage, please include:
  • A short description of what worked well and what didn't. If you tried several variants or did something non-standard, please describe this as well.
  • Describe any extra credit with supporting examples.
  • Include at least three panoramas.

    1. The test_set sequence
    2. Captured using the Kaidan head
    3. Captured by holding the camera in hand

    Each panorama should be shown as -

    1. A low-res inlined image on the web page.
    2. A link that you can click on to show the full-resolution .jpg file.
    3. Embedded in a web viewer as described above.
If you are unfamiliar with HTML, you can use a simple webpage editor like NVU or KompoZer to make your webpage. Here are some tips.

How to Turn In

Create a zip archive of your submission folder - <Your_Name>.zip - and place it in the Catalyst submission dropbox before April 30, 11:59pm.

Extra Credit

Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions. We're always interested in seeing new, unanticipated ways to use this program!

  • Although the feature-based aligner gives sub-pixel motion estimates (because of least squares), the motion vectors are rounded to integers when blending the images into the mosaic in BlendImages.cpp. Try blending the images with sub-pixel localization (see the sampling sketch below).
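    One ingredient for this is bilinear resampling. A sketch, assuming ImageLib's CFloatImage with a Pixel(x, y, band) accessor and ignoring boundary handling:

      #include <cmath>

      // Bilinearly sample one band of img at the sub-pixel location (x, y).
      double sampleBilinear(const CFloatImage &img, double x, double y, int band)
      {
          int x0 = (int)floor(x), y0 = (int)floor(y);
          double fx = x - x0, fy = y - y0;

          // Weighted average of the four neighboring pixels.
          return (1 - fx) * (1 - fy) * img.Pixel(x0,     y0,     band)
               +      fx  * (1 - fy) * img.Pixel(x0 + 1, y0,     band)
               + (1 - fx) *      fy  * img.Pixel(x0,     y0 + 1, band)
               +      fx  *      fy  * img.Pixel(x0 + 1, y0 + 1, band);
      }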
  • Sometimes there are exposure differences between images, which cause brightness fluctuations in the final mosaic. Try to get rid of this artifact.

  • Try shooting a sequence with some objects moving. What did you do to remove "ghosted" versions of the objects?

  • Try a sequence in which the same person appears multiple times, as in this example.

  • Implement a better blending technique, e.g., pyramid blending, Poisson image blending, or graph cuts.