Project 2 Report: Autostitch
CSE576, Spring 2008
Xiaoyu Chen
April 29, 2008
In this project, I developed a system that automatically stitches a sequence of images into a panorama. First, applying the feature detector developed in the first project, the system detects discriminative features in each image and identifies the matching features between a pair of images. Second, the system aligns each pair of images by computing their relative displacement from the best matching features. Finally, the system stitches the aligned images together and blends them into a seamless panorama.
Methods
Warp images
Given an image, its planar coordinates need to be converted into spherical coordinates. The coordinate-mapping equations were given in Dr. Szeliski’s lecture notes “image stitching” (p. 46). Moreover, the radial distortion in the planar coordinates should be removed when mapping to spherical coordinates. The equations modeling radial distortion were given in Dr. Seitz’s lecture notes “projection” (p. 28). For this step, the camera parameters, including the focal length and the radial distortion coefficients (i.e., K1 and K2), are provided.
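As a sketch of this inverse warp (the function and parameter names are mine, not from the assignment code): for each pixel of the spherical output image, we compute the direction on the unit sphere, project it back onto the image plane, and apply the radial distortion model with coefficients k1 and k2 so we sample the raw planar image at the right place.

```python
import numpy as np

def spherical_to_planar(xs, ys, f, k1, k2, xc, yc):
    """Map a pixel (xs, ys) of the spherical image back to the planar
    (distorted) source image. f is the focal length in pixels, (k1, k2)
    the radial-distortion coefficients, (xc, yc) the image center."""
    # Angles on the sphere for this output pixel.
    theta = (xs - xc) / f
    phi = (ys - yc) / f
    # Unit-sphere direction, then perspective projection back to the plane.
    xhat = np.sin(theta) * np.cos(phi)
    yhat = np.sin(phi)
    zhat = np.cos(theta) * np.cos(phi)
    xn, yn = xhat / zhat, yhat / zhat        # ideal (undistorted) coords
    # Apply the radial-distortion model before sampling the raw image.
    r2 = xn * xn + yn * yn
    d = 1.0 + k1 * r2 + k2 * r2 * r2
    xd, yd = xn * d, yn * d
    return f * xd + xc, f * yd + yc          # back to pixel coordinates

# Sanity check: the image center (theta = phi = 0) maps to itself.
print(spherical_to_planar(320.0, 240.0, f=595.0, k1=-0.15, k2=0.0,
                          xc=320.0, yc=240.0))
```

The focal length and distortion values here are placeholders; in the project they come from the provided camera parameters.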
Align pair-wise images
The matching features of a pair of images are identified using the feature matcher from the first project. Based on those matching features, we can estimate the motion model between the two images. In this project, we assume the motion model is translation only. Specifically, RANSAC is used to select a set of feature matches from which to estimate the translation:
- Select a random pair of matching features. Compute the coordinate translation (dx, dy) of the two features.
- Compute the number of feature matches that are consistent with the estimated displacement (dx, dy). Given a pair of matching features, F1 from image #1 and F2 from image #2, applying (dx, dy) to F1 gives an estimated match F1’ in image #2. We then measure the Euclidean distance between F1’ and F2. If the distance is smaller than a given threshold, we call this pair of matching features an inlier. In other words, we count the number of inliers for the estimated displacement.
- Repeat the above two steps many times (specified by a parameter). Record the hypothesis with the largest number of inliers, and return its set of inliers.
- Based on that set of inliers, compute the final translation using a least-squares fit, which for pure translation is simply the average translation over all inlier matches.
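The steps above can be sketched as follows (a minimal Python illustration; the function name, parameter defaults, and the synthetic matches are mine):

```python
import random

def ransac_translation(matches, n_iters=500, inlier_thresh=4.0):
    """matches: list of ((x1, y1), (x2, y2)) feature pairs.
    Returns the translation (dx, dy) averaged over the best inlier set."""
    best_inliers = []
    for _ in range(n_iters):
        # 1. Hypothesize a translation from one random match.
        (x1, y1), (x2, y2) = random.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # 2. Collect matches consistent with (dx, dy): apply the shift to
        #    the first feature and check the Euclidean distance to the second.
        inliers = [m for m in matches
                   if ((m[0][0] + dx - m[1][0]) ** 2 +
                       (m[0][1] + dy - m[1][1]) ** 2) ** 0.5 < inlier_thresh]
        # 3. Keep the hypothesis with the most inliers.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # 4. Least-squares fit over the inliers: the average translation.
    n = len(best_inliers)
    dx = sum(q[0] - p[0] for p, q in best_inliers) / n
    dy = sum(q[1] - p[1] for p, q in best_inliers) / n
    return dx, dy

# Synthetic example: a true translation of (40, -8) plus two gross outliers.
random.seed(0)
good = [((x, y), (x + 40.0, y - 8.0))
        for x, y in [(10, 20), (50, 5), (80, 60), (30, 90)]]
bad = [((0, 0), (200, 200)), ((5, 5), (-100, 30))]
print(ransac_translation(good + bad))   # close to (40.0, -8.0)
```

Note how the outliers cannot attract inliers: any hypothesis drawn from them is far from every other match, so a hypothesis from a correct match always wins.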
Stitch the aligned images
Given the warped, aligned images with the pair-wise translation, the final step is to stitch the images into a seamless panorama.
- Measure the width and height of the final stitched image.
- Place each image at its final position, blending it with its adjacent images where they overlap. If two images overlap, the final pixel values are a weighted sum of the two images. A 1-D distance map is used as the feathering function: a pixel in a given image is weighted in proportion to its 1-D distance from the edge of that image that overlaps the other image.
- Because the images cover a 360-degree panorama, we place the first image at both ends of the sequence. The horizontal extent of the panorama runs from the midline of that first image at one end to its midline at the other end.
- Moreover, aligning the first image to the last image of the sequence lets us measure the vertical drift between them. We apply a global correction for this drift, y’ = y + ax, where a is the average vertical drift per column (the total drift divided by the width of the panorama, with the sign chosen to cancel the drift). After this correction, the vertical coordinates of the first and the last image will be the same.
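The feathered blend and the drift correction can be sketched in 1-D (the helper names and the toy strip values are mine; the real system applies the same idea per row of the warped images):

```python
import numpy as np

def feather_blend_rows(left, right, overlap):
    """Blend two 1-D image rows whose last/first `overlap` pixels coincide.
    Weights ramp linearly with distance to the overlapping edge, so the
    seam fades smoothly from `left` into `right`."""
    t = np.linspace(0.0, 1.0, overlap)   # 0 at left's interior, 1 at right's
    blended = (1.0 - t) * left[-overlap:] + t * right[:overlap]
    return np.concatenate([left[:-overlap], blended, right[overlap:]])

def correct_drift(y, x, total_drift, width):
    """Global shear y' = y + a*x with a = -total_drift / width, so the
    first and last images end up at the same vertical coordinate."""
    return y - total_drift * x / width

row_a = np.full(6, 100.0)   # a bright strip
row_b = np.full(6, 20.0)    # a darker strip, overlapping row_a by 3 pixels
print(feather_blend_rows(row_a, row_b, overlap=3))
```

In the overlap the values ramp from 100 down to 20 instead of jumping at a hard seam, which is exactly what hides the stitch boundary.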
Results
The test sequence
This is the provided test sequence.
A sequence taken with a Kaidan head
Sequence I was taken in the CSE Atrium using a Kaidan head.
Another sequence taken with a Kaidan head
Sequence II was taken on campus using a Kaidan head.
A hand-held sequence
Sequence III was taken on campus without using a Kaidan head.
Discussion
The system has succeeded in automatically stitching four different image sequences.
In the panorama of Sequence I, the upper right corner of the LED wall is slightly blurred. This is because the light display was changing while the pictures were taken, which could introduce errors when matching features along the wall.
In the panorama of Sequence II, the wall near the right end is not aligned very well. This shows the importance of the feature detector in the first place: due to the color of the wall, the feature detector may have had difficulty finding correct feature matches there.
In the panorama of Sequence III, there is slight ghosting at a couple of places along the baluster. Because this is a hand-held sequence, the camera may have rotated slightly between shots, so our simple translation-only motion model may not capture the motion completely.