Project 2:  Panoramic Mosaic Stitching
          
          
          
            
Dates

- Assigned: Wednesday, April 17, 2013
- Due: Tuesday, April 30, 2013 (11:59pm)
        In this project, you will use the feature detection and
          matching from the first project to combine a series of
          photographs into a 360° panorama. Your software will
          automatically align the photographs (determine their overlap
          and relative positions) and then blend the resulting photos
          into a single seamless panorama. You will then be able to view
          the resulting panorama inside an interactive Web viewer.
        Project package
You can download the complete project package here. The package consists of the following components:
- PanoramaSkel: The skeleton code for the project. You can compile it with Visual Studio (Windows) or the Makefile (Linux). (NOTE: In Visual Studio, please build the project in Release mode.)
- PanoramaSample.exe: Sample executable for the Windows platform.
- PanoramaSample32: Sample executable for 32-bit Linux.
- PanoramaSample64: Sample executable for 64-bit Linux.
- lpjpano: Web-based viewer for your panoramas. Please read the README file for instructions on using it.
- test_set: Some sample images to test the system on. It also includes the intermediate files generated in the process of stitching the images. Please see the next section for the steps to create a panorama.
Taking the Pictures
Besides working with the sample test_set provided in the package, you will also go out and shoot your own panoramas! Each student will check out a panorama kit (camera, tripod, and Kaidan head). You can use this webpage to make your reservation. You can reserve the kit for at most one day at a time, except over weekends. You will find more instructions on the reservations page.
        
Taking photos
Take a series of photos with a digital camera mounted on a tripod.

[IMPORTANT] Please read this web page explaining how to use the equipment before you go out to shoot. As shown on that page, the camera MUST be right side up and should be zoomed out all the way. The resolution should be set to capture 640x480 photos. You can change that setting with these steps:
        
        
- Turn the mode dial on the back of the camera to one of the three shooting modes: auto (camera icon), manual (camera icon + M), or stitch assist (overlaid rectangles).
- Press MENU button.
- Press the left/right arrow to choose Resolution, then
            press SET.
- Press the left/right arrow and choose S (640x480).
- Press MENU again.
Since the camera is mounted this way, you will need to rotate the 640x480 images by 90° later to make them upright; hence the final size of all your images will be 480x640. You can use any image manipulation software for this. We recommend IrfanView or GIMP.
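For example, if you happen to have ImageMagick installed, the rotation can also be scripted in one line (filenames are illustrative, and whether you need 90 or 270 degrees depends on which way the camera was turned):

  convert img_0001.jpg -rotate 90 upright_0001.jpg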
        
        For best results, overlap each image by 50% with the previous
          one, and keep the camera level using the levelers on the
          Kaidan head. You will also need to take a series of images
          with the camera held in your hand instead of the tripod.
          Again, try to overlap each image by 50% with the previous one.
          
        
        
        
Camera Parameters
        The following focal lengths are valid only if the camera is
          zoomed out all the way:
        
               
            
| Camera                              | Resolution | Focal length     | k1       | k2      |
| test_set images                     | 384x512    | 595.00000 pixels | -0.15000 | 0.00000 |
| Canon Powershot A10, tag CS30012716 | 480x640    | 678.21239 pixels | -0.21001 | 0.26169 |
| Canon Powershot A10, tag CS30012717 | 480x640    | 677.50487 pixels | -0.20406 | 0.23276 |
| Canon Powershot A10, tag CS30012718 | 480x640    | 676.48417 pixels | -0.20845 | 0.25624 |
| Canon Powershot A10, tag CS30012927 | 480x640    | 671.16649 pixels | -0.19270 | 0.30098 |
| Canon Powershot A10, tag CS30012928 | 480x640    | 674.82258 pixels | -0.21528 | 0.30098 |
| Canon Powershot A10, tag CS30012929 | 480x640    | 674.79106 pixels | -0.21483 | 0.32286 |
          
        
        If you are using your own camera, you have to estimate the
          focal length and distortion parameters. The simplest way to do
          this is through the EXIF tags of the images, as
            described by Noah Snavely (a previous TA).
          Alternatively, you can use a camera
            calibration toolkit to get more precise focal length and
          radial distortion coefficients. Finally, Brett Allen describes
          one creative way to measure rough focal length using just
            a book and a box.
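For reference, the EXIF route boils down to a single proportion: the focal length in pixels equals the image width in pixels times the ratio of the EXIF focal length (mm) to the physical sensor width (mm). A minimal sketch, with purely illustrative numbers (look up your camera's actual sensor width):

// Convert an EXIF focal length (mm) into pixels.
double focalLengthPixels(double focalMM,   // EXIF FocalLength tag, in mm
                         double sensorMM,  // physical sensor width, in mm
                         int imageWidthPx) // image width, in pixels
{
    return imageWidthPx * (focalMM / sensorMM);
}
// Example (illustrative numbers only):
//   focalLengthPixels(5.4, 5.27, 640)  ->  about 656 pixels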
        Image formatting
        
          - [IMPORTANT] Your
            images need to be in .TGA format and have a 4:3 (or 3:4)
            aspect ratio in order to be compatible with the project
            skeleton.
- [IMPORTANT] Your
            output panoramas need to be in .JPG format in order to be
            compatible with the java-based panorama viewer (described
            later).
- Your input images should be kept reasonably small, e.g.
            480x640. The computation time for larger images may be
            significant.
- You can convert or resize images using tools such as GIMP or IrfanView.
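ImageMagick is another option; it can convert and resize in one step from the command line (filenames illustrative):

  convert input.jpg -resize 480x640 input.tga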
Running the code to create the panorama
You will need two executables: Panorama (from this project) and Features (from project 1). Open a console (the cmd command from the Start menu in Windows, or the standard terminal in Linux) and navigate to the folder containing the images that you want to stitch. The instructions in this section assume that both executables are in the same folder as the images; if that is not the case, just call the executables from the appropriate location.
        
 
        
- Remove radial distortion and warp all images to the spherical coordinate system: To remove the radial distortion in an image input.tga and warp it into spherical coordinates with focal length = 600 and radial distortion coefficients k1 = -0.21 and k2 = 0.25:

  Panorama sphrWarp input.tga warp.tga 600 -0.21 0.25

  warp.tga is the name of the output image. Generate warped images for all the input images. The focal length and distortion parameters for the test_set images and for the course cameras are given in the table above.
- Compute features in the warped images: Use the Features executable, as in the first project:

  Features computeFeatures warp.tga warp.f [featuretype]

  We encourage you to use your own features from Project 1 for this step. However, you may also choose to use the state-of-the-art SIFT features. Here is the package to compute SIFT features (linked from David Lowe's page): <Link to SIFT package>. The SIFT package has a README file with clear instructions for generating SIFT features for a given image and visualizing them under Linux/Windows/MATLAB. If you want to experiment further with SIFT features, the README also describes more ways to visualize how SIFT features can be used to match images.

  Here is the gist of how to generate .key files (SIFT feature files for images) for this project:

  - Convert the image into .pgm format using a standard tool like IrfanView.
  - Run: sift <input.pgm >output.key
    where sift is the appropriate executable for Windows (siftWin32.exe) or Linux (sift) provided in the package.
- Match features between every pair of adjacent images: For example, to match the features in warp1.f and warp2.f:

  Features matchFeatures warp1.f warp2.f 0.8 match-01-02.txt 2
 
- Align every pair of adjacent images using RANSAC: For example, to align images warp1.tga and warp2.tga:

  Panorama alignPair warp1.f warp2.f match-01-02.txt 200 1

  where the match file was produced in the previous step. 200 and 1 are the RANSAC parameters: the number of RANSAC iterations and the RANSAC distance threshold, respectively. This step outputs two numbers corresponding to the resulting translation for the alignment.

  Note that you can also use SIFT features to do the alignment, which can be useful for testing this component. To do so, add the word sift to the end of the command, as in:

  Panorama alignPair warp1.key warp2.key match-01-02.txt 200 1 sift

  Sample SIFT features and matches are provided in the test_set folder. Run this step for all adjacent pairs of images and save the output into a separate file pairlist.txt, which may look like this:

  warp1.tga warp2.tga 213.49 -5.12
  warp2.tga warp3.tga 208.19 2.82
  ......
  warp9.tga warp1.tga 194.76 -3.88

  The last two numbers in each line come from the output of running Panorama alignPair on those images.
 
- Blend all images: Finally, stitch the images into the final panorama pano.tga:

  Panorama blendPairs pairlist.txt pano.tga blendWidth
 
These steps, applied to the sample images in the test_set folder, are also spelled out in stitch2.txt (for stitching only two images) and in stitch4.txt (for stitching four images).
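For orientation, a complete two-image run strings the steps together like this (parameter values are taken from the test_set row of the table above; the blend width of 10 is just an illustrative choice, and stitch2.txt remains the authoritative sequence):

  Panorama sphrWarp input1.tga warp1.tga 595 -0.15 0
  Panorama sphrWarp input2.tga warp2.tga 595 -0.15 0
  Features computeFeatures warp1.tga warp1.f
  Features computeFeatures warp2.tga warp2.f
  Features matchFeatures warp1.f warp2.f 0.8 match-01-02.txt 2
  Panorama alignPair warp1.f warp2.f match-01-02.txt 200 1
  (record the two output numbers in pairlist.txt)
  Panorama blendPairs pairlist.txt pano.tga 10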
        
Visualizing the panorama with a web-viewer
We provide a Java-based web viewer for your panoramas. It is fairly straightforward to use, and instructions can be found in the README file of the lpjpano folder in the project package. You will be required to include this in your deliverable for the project.
        
To Do
        Note: The skeleton code includes the same image library that
          you used in the first project.
        
- Warp each image into spherical coordinates. (file: WarpSpherical.cpp, routine: warpSphericalField)

  [TODO] Compute the inverse map to warp the image by filling in the skeleton code in the warpSphericalField routine to:

  - Convert the given spherical image coordinate into the corresponding planar image coordinate using the coordinate transformation equation from the lecture notes.
  - Apply radial distortion using the equation from the lecture notes.

  (A sketch of this inverse map is given after this list.)
- Compute the alignment of two images. (file: FeatureAlign.cpp, routines: alignPair, countInliers, and leastSquaresFit)

  To do this, you will have to implement feature-based translational motion estimation. The skeleton for this code is provided in FeatureAlign.cpp. The main routines that you will be implementing are:

  int alignPair(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, int nRANSAC, double RANSACthresh, CTransform3x3 &M);

  int countInliers(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, CTransform3x3 M, double RANSACthresh, vector<int> &inliers);

  int leastSquaresFit(const FeatureSet &f1, const FeatureSet &f2, const vector<FeatureMatch> &matches, MotionModel m, float f, const vector<int> &inliers, CTransform3x3 &M);

  AlignPair takes two feature sets, f1 and f2, the list of feature matches obtained from the feature detection and matching (from the first project), and a motion model (described below), and estimates an inter-image transform matrix M. For this project, the enum MotionModel only takes on the value eTranslate. AlignPair uses RANSAC (RANdom SAmple Consensus) to pull out a minimal set of feature matches (one match for this project), estimates the corresponding motion (alignment), and then invokes countInliers to count how many of the feature matches agree with the current motion estimate. After repeated trials, the motion estimate with the largest number of inliers is used to compute a least squares estimate for the motion, which is then returned in the motion estimate M.

  CountInliers computes the number of matches whose distance is below RANSACthresh. It also returns a list of inlier match ids.

  LeastSquaresFit computes a least squares estimate for the translation using all of the matches previously classified as inliers. It returns the resulting translation estimate in the last column of M.

  [TODO] You will have to fill in the missing code in alignPair to:

  - Randomly select a valid matching pair and compute the translation between the two feature locations.
  - Call countInliers to count how many matches agree with this estimate.
  - Repeat the above random selection nRANSAC times and keep the estimate with the largest number of inliers.
  - Write the body of countInliers to count the number of feature matches where the SSD distance after applying the estimated transform (i.e. the distance from the match to its correct position in the image) is below the threshold. Don't forget to create the list of inlier ids.
  - Write the body of leastSquaresFit, which for the simple translational case is just the average displacement between the matching feature positions.

  (A sketch of the RANSAC loop is given after this list.)
- Stitch and crop the resulting aligned images. (file: BlendImages.cpp, routines: BlendImages, AccumulateBlend, NormalizeBlend)

  [TODO] Given the warped images and their relative displacements, figure out how large the final stitched image will be and what the absolute displacements of the images in the panorama are (BlendImages).

  [TODO] Then, resample each image to its final location and blend it with its neighbors (AccumulateBlend, NormalizeBlend). Try a simple feathering function as your weighting function (see the mosaics lecture slide on "feathering"); this is a simple 1-D version of the distance map described in Szeliski & Shum '97. For extra credit, you can try other blending functions or figure out some way to compensate for exposure differences. In NormalizeBlend, remember to set the alpha channel of the resulting panorama to opaque!

  [TODO] Crop the resulting image to make the left and right edges seam perfectly (BlendImages). The horizontal extent can be computed in the previous blending routine, since the first image occurs at both the left and right ends of the stitched sequence (draw the "cut" line halfway through this image). Apply a linear warp to the mosaic to remove any vertical "drift" between the first and last images. This warp, of the form y' = y + ax, should transform the y coordinates of the mosaic such that the first image has the same y-coordinate on both the left and right ends. Calculate the value of a needed to perform this transformation.

  (A sketch of a feathering weight function is given after this list.)
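To make the first TODO concrete, here is a minimal sketch of the inverse map computed in warpSphericalField. It is a standalone illustration rather than the skeleton's actual interface: the function name, parameter list, and the image-center convention are assumptions; only the two transformation steps (spherical-to-planar conversion, then radial distortion) follow the lecture notes.

#include <cmath>

// For every pixel (x, y) of the *output* spherical image, compute the
// source pixel (xSrc, ySrc) in the original planar image.
void sphericalToPlanar(double x, double y,    // output (spherical) pixel
                       double f,              // focal length in pixels
                       double k1, double k2,  // radial distortion coeffs
                       int width, int height, // image dimensions
                       double &xSrc, double &ySrc)
{
    // 1. Pixel -> spherical angles (theta, phi), measured from the center.
    double theta = (x - 0.5 * width)  / f;
    double phi   = (y - 0.5 * height) / f;

    // 2. Angles -> 3D ray on the unit sphere.
    double xh = std::sin(theta) * std::cos(phi);
    double yh = std::sin(phi);
    double zh = std::cos(theta) * std::cos(phi);

    // 3. Project the ray onto the z = 1 plane (normalized image coords).
    double xn = xh / zh;
    double yn = yh / zh;

    // 4. Apply radial distortion: x' = x * (1 + k1*r^2 + k2*r^4).
    double r2 = xn * xn + yn * yn;
    double d  = 1.0 + k1 * r2 + k2 * r2 * r2;

    // 5. Back to pixel coordinates in the planar source image.
    xSrc = f * xn * d + 0.5 * width;
    ySrc = f * yn * d + 0.5 * height;
}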
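The RANSAC loop in alignPair can be organized as below. This is a sketch under stated assumptions: the header name, the FeatureSet/FeatureMatch member names (id1, id2, x, y, with 1-based feature ids), and a CTransform3x3 that default-constructs to the identity and supports M[row][col] indexing are all guesses about the skeleton; countInliers and leastSquaresFit are the routines you write yourself.

#include <cstdlib>
#include <vector>
#include "FeatureAlign.h"   // assumed skeleton header
using std::vector;

int alignPairSketch(const FeatureSet &f1, const FeatureSet &f2,
                    const vector<FeatureMatch> &matches,
                    MotionModel m, float f,
                    int nRANSAC, double RANSACthresh, CTransform3x3 &M)
{
    int bestCount = -1;
    vector<int> bestInliers;

    for (int iter = 0; iter < nRANSAC; iter++) {
        // A single match fully determines a translation hypothesis.
        const FeatureMatch &mt = matches[rand() % matches.size()];
        double tx = f2[mt.id2 - 1].x - f1[mt.id1 - 1].x;
        double ty = f2[mt.id2 - 1].y - f1[mt.id1 - 1].y;

        CTransform3x3 T;    // identity (assumed default)
        T[0][2] = tx;       // translation lives in the last column
        T[1][2] = ty;

        // Count how many matches agree with this hypothesis.
        vector<int> inliers;
        int count = countInliers(f1, f2, matches, m, f, T,
                                 RANSACthresh, inliers);
        if (count > bestCount) {
            bestCount   = count;
            bestInliers = inliers;
        }
    }

    // Refine using all inliers; for pure translation this is just the
    // average displacement (leastSquaresFit), returned in M.
    return leastSquaresFit(f1, f2, matches, m, f, bestInliers, M);
}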
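For the blending TODO, the feathering weight itself is tiny; the sketch below shows one way to write it, with the surrounding accumulate/normalize logic and the drift correction summarized as comments. The accumulator layout and the formula for a are working assumptions, not the skeleton's definitions, so check them against your own derivation.

#include <algorithm>

// Feathering weight for AccumulateBlend: ramps linearly from 0 at the
// left/right image borders up to 1 once a pixel is at least blendWidth
// pixels inside the image (a 1-D version of the Szeliski & Shum '97
// distance map).
double featherWeight(int x, int width, int blendWidth)
{
    if (blendWidth <= 0) return 1.0;               // degenerate case
    int d = std::min(x, width - 1 - x);            // distance to nearer edge
    return std::min(1.0, double(d) / blendWidth);  // clamp at 1
}

// Putting it together (accumulator layout is an assumption):
//   AccumulateBlend: acc(xd, yd).rgb += w * src(x, y).rgb;  acc(xd, yd).w += w;
//   NormalizeBlend:  pano.rgb = acc.rgb / acc.w wherever acc.w > 0,
//                    then set the alpha channel to opaque.
// Drift correction: if the first image ends up d pixels lower at the right
// end of a mosaic of width W than at the left end, the shear
// y' = y + a*x with a = -d / W cancels the drift.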
Debugging Guidelines
        You can use the test results included in the test_set
          folder to check whether your program is running correctly.
          Comparing your output to that of the sample solution is also a
          good way of debugging your program.
What to Turn In
        Please organize your submission in the following folder
        structure. 
        
        
<Your_Name>                              [This is the top-level folder]
<Your_Name> => Source                    [Place the source code in this subfolder]
<Your_Name> => Executable                [Windows/Linux executable]
<Your_Name> => Artifact
<Your_Name> => Artifact => index.html    [Writeup about the project; see below for details of what to put here.]
<Your_Name> => Artifact => images/       [Place all the images used in the webpage here.]
<Your_Name> => Artifact => voting.jpg    [One of your panorama images that you want to submit for class voting.]
                      
In the artifact webpage, please put:

- A short description of what worked well and what didn't. If you tried several variants or did something non-standard, please describe this as well.
- A description of any extra credit, with supporting examples.
- At least three panoramas:
  - The test_set sequence
  - One captured using the Kaidan head
  - One captured by holding the camera in hand

Each panorama should be shown as:

- A low-res inlined image on the web page.
- A link that you can click on to show the full-resolution .jpg file.
- An embedded version in the web viewer described above.

If you are unfamiliar with HTML, you can use a simple webpage editor like NVU or KompoZer to make your web page. Here are some tips.
        How to Turn In
Create a zip archive of your submission folder, <Your_Name>.zip, and place it in the Catalyst submission dropbox before April 30, 11:59pm.
        
Extra Credit
Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions; we're always interested in seeing new, unanticipated ways to use this program!
                      
- Although the feature-based aligner gives sub-pixel motion estimates (because of least squares), the motion vectors are rounded to integers when blending the images into the mosaic in BlendImages.cpp. Try blending the images with sub-pixel localization.
- Sometimes there are exposure differences between images, which cause brightness fluctuations in the final mosaic. Try to get rid of this artifact.
- Try shooting a sequence with some objects moving. What did you do to remove "ghosted" versions of the objects?
- Try a sequence in which the same person appears multiple times, as in this example.
- Implement a better blending technique, e.g., pyramid blending, Poisson image blending, or graph cuts.