Assignment 3: Stereo
Before you start:
Download the assignment's files here.
The project grade is split evenly: half for the written assignment and half for the coding assignment.
Written Assignment:
Once you have finished the coding assignment (described next), you'll need to finish the written assignment. The written assignment is in the file "Written Assignment 3.docx".
Coding Assignment:
For this assignment you'll be implementing a stereo algorithm for computing disparities between images. This requires the implementation of several image match scores, methods for smoothing or removing the noise from the match scores, a segmentation algorithm, and a method for finding the best disparity for each pixel. Feel free to reuse any code from previous assignments.
In the project, you should only have to update one file, "Project3.cpp". The buttons in the UI call the corresponding functions in "Project3.cpp". For the stereo datasets with ground truth disparities, the percentage of wrong pixels is displayed above the image. Select different tabs to see different images. Slices of the match costs may be viewed by left-clicking on the image (they are displayed below the image).
What to turn in:
To receive credit for the project, you need to turn in the completed file "Project3.cpp", any additional files needed for compiling, the requested images for each task, and the written assignment.
Tasks:
Step 1: Implement SSD, SAD and NCC match cost functions. For SSD and SAD, compute the squared distance and absolute distance between the pixels in color space (these are simply SD and AD in our case, since we only compare single pixels). For NCC (normalized cross correlation), compute the score over a window with the supplied radius. Store the computed match cost in "matchCost". Each entry in matchCost is the distance in color space between (c, r) in image 1 and (c - d, r) in image 2. Store these values at "matchCost[(d - minDisparity)*w*h + r*w + c]". Hint: For SSD you may use the square root of the squared response. The NCC score is high for good matches and low for bad matches, so store one minus the NCC score in matchCost instead of the NCC score itself.
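The indexing scheme above can be sketched as follows for the per-pixel SAD and SSD costs. This is only an illustration under stated assumptions, not the starter code's interface: the names ComputePixelMatchCost and CostIndex, the interleaved RGB std::vector<double> image layout, the inclusive maxDisparity, and the large cost assigned when (c - d) falls outside image 2 are all hypothetical. Check the actual types and parameters in "Project3.cpp" before adapting it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical image layout: interleaved RGB, row-major, size w*h*3.
using Image = std::vector<double>;

// Index into the cost volume, following the formula given in the assignment:
// matchCost[(d - minDisparity)*w*h + r*w + c].
inline std::size_t CostIndex(int d, int minDisparity, int w, int h, int r, int c) {
    return static_cast<std::size_t>(d - minDisparity) * w * h
         + static_cast<std::size_t>(r) * w + c;
}

// Per-pixel SAD (useSquared = false) or SSD (useSquared = true).
// Assumption: maxDisparity is inclusive, and pixels whose match (c - d)
// falls outside image 2 receive outOfBoundsCost.
std::vector<double> ComputePixelMatchCost(const Image& image1, const Image& image2,
                                          int w, int h,
                                          int minDisparity, int maxDisparity,
                                          bool useSquared,
                                          double outOfBoundsCost = 1e9) {
    assert(image1.size() == static_cast<std::size_t>(w) * h * 3);
    assert(image2.size() == image1.size());
    const int numDisparities = maxDisparity - minDisparity + 1;
    std::vector<double> matchCost(
        static_cast<std::size_t>(numDisparities) * w * h, outOfBoundsCost);

    for (int d = minDisparity; d <= maxDisparity; ++d) {
        for (int r = 0; r < h; ++r) {
            for (int c = 0; c < w; ++c) {
                const int c2 = c - d;  // matching column in image 2
                if (c2 < 0 || c2 >= w) continue;
                double cost = 0.0;
                for (int ch = 0; ch < 3; ++ch) {
                    const double diff = image1[(r * w + c) * 3 + ch]
                                      - image2[(r * w + c2) * 3 + ch];
                    cost += useSquared ? diff * diff : std::fabs(diff);
                }
                matchCost[CostIndex(d, minDisparity, w, h, r, c)] = cost;
            }
        }
    }
    return matchCost;
}
```

The sketch stores the raw sum of squared differences for SSD; taking the square root, as the hint allows, would be a one-line change where the cost is written.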
Implement "FindBestDisparity" which finds the disparity with minimum cost for each pixel and stores it in the variable "disparities".
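A minimal winner-take-all sketch of this step is shown below. The function name, the std::vector types, and the inclusive maxDisparity are assumptions made for illustration; the real "FindBestDisparity" in "Project3.cpp" may take different arguments and store "disparities" in a different type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical signature: returns one disparity per pixel (row-major, size w*h).
// matchCost is laid out as matchCost[(d - minDisparity)*w*h + r*w + c].
std::vector<int> FindBestDisparitySketch(const std::vector<double>& matchCost,
                                         int w, int h,
                                         int minDisparity, int maxDisparity) {
    const std::size_t pixels = static_cast<std::size_t>(w) * h;
    const int numDisparities = maxDisparity - minDisparity + 1;
    assert(numDisparities > 0);
    assert(matchCost.size() == pixels * numDisparities);

    std::vector<int> disparities(pixels, minDisparity);
    for (std::size_t p = 0; p < pixels; ++p) {
        double best = matchCost[p];  // cost at the first disparity slice
        for (int k = 1; k < numDisparities; ++k) {
            const double cost = matchCost[k * pixels + p];
            if (cost < best) {  // strict '<' keeps the smallest disparity on ties
                best = cost;
                disparities[p] = minDisparity + k;
            }
        }
    }
    return disparities;
}
```

Breaking ties toward the smallest disparity is an arbitrary choice in this sketch; the assignment does not specify one.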
Required: Open "cones.txt". Press SSD, and Find Best Disparity. Save the "Disparity" image as "1a.png". Press SAD, and Find Best Disparity. Save the "Disparity" image as "1b.png". Open "conesGainOffset.txt". Press SSD, and Find Best Disparity. Save the "Disparity" image as "1c.png". Press NCC with Radius = 2, and Find Best Disparity. Save the "Disparity" image as "1d.png".
Step 2: Smooth and denoise the match cost values. Implement GaussianBlurMatchScore to apply a Gaussian blur to the match costs of each disparity independently. You may use SeparableGaussianBlurImage as a helper function. Implement BilateralBlurMatchScore to blur the match costs at each disparity with a bilateral filter. The weights (kernel) used in the bilateral filter should be computed from the color image. Using the computed kernels, blur the match costs at each disparity. Hint: When computing the bilateral filter, compute the kernel for a pixel once and apply it to all disparities. If you re-compute the kernels at each disparity level, your code will be very slow.
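The hint about reusing the kernel can be sketched as below: the kernel for a pixel is built once from the color image, then applied to every disparity slice. The function name, the interleaved RGB guide layout, the 3-sigma kernel radius, and the renormalization over in-image neighbors at the borders are all assumptions of this sketch and are not taken from the starter code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical layout: guide is interleaved RGB (size w*h*3); matchCost holds
// numDisparities slices of w*h values, one slice per disparity.
void BilateralBlurMatchCostSketch(const std::vector<double>& guide,
                                  std::vector<double>& matchCost,
                                  int w, int h, int numDisparities,
                                  double sigmaS, double sigmaI) {
    const std::size_t pixels = static_cast<std::size_t>(w) * h;
    assert(sigmaS > 0.0 && sigmaI > 0.0);
    assert(guide.size() == pixels * 3);
    assert(matchCost.size() == pixels * numDisparities);

    const int radius = static_cast<int>(std::ceil(3.0 * sigmaS));  // assumed cutoff
    const int size = 2 * radius + 1;
    const std::vector<double> input = matchCost;  // read from a copy, write in place
    std::vector<double> kernel(static_cast<std::size_t>(size) * size);

    for (int r = 0; r < h; ++r) {
        for (int c = 0; c < w; ++c) {
            // 1) Build this pixel's kernel once, from the color image only.
            double weightSum = 0.0;
            for (int dr = -radius; dr <= radius; ++dr) {
                for (int dc = -radius; dc <= radius; ++dc) {
                    const int rr = r + dr, cc = c + dc;
                    double weight = 0.0;  // out-of-image neighbors get zero weight
                    if (rr >= 0 && rr < h && cc >= 0 && cc < w) {
                        double colorDist2 = 0.0;
                        for (int ch = 0; ch < 3; ++ch) {
                            const double diff = guide[(rr * w + cc) * 3 + ch]
                                              - guide[(r * w + c) * 3 + ch];
                            colorDist2 += diff * diff;
                        }
                        const double spatialDist2 = dr * dr + dc * dc;
                        weight = std::exp(-spatialDist2 / (2.0 * sigmaS * sigmaS)
                                          - colorDist2 / (2.0 * sigmaI * sigmaI));
                    }
                    kernel[(dr + radius) * size + (dc + radius)] = weight;
                    weightSum += weight;
                }
            }
            // 2) Reuse the same kernel for every disparity slice.
            for (int k = 0; k < numDisparities; ++k) {
                double sum = 0.0;
                for (int dr = -radius; dr <= radius; ++dr) {
                    for (int dc = -radius; dc <= radius; ++dc) {
                        const double weight = kernel[(dr + radius) * size + (dc + radius)];
                        if (weight == 0.0) continue;  // also skips out-of-image neighbors
                        sum += weight * input[k * pixels + (r + dr) * w + (c + dc)];
                    }
                }
                matchCost[k * pixels + r * w + c] = sum / weightSum;
            }
        }
    }
}
```

The center pixel always has weight 1, so weightSum is never zero. As Sigma I grows very large the color term vanishes and the weights approach a plain spatial Gaussian, which is why the Gaussian and bilateral versions tend to behave similarly in uniformly colored regions.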
Required: Open "cones.txt". Press SSD, press Gaussian, and Find Best Disparity. Repeat this process, changing the value of Sigma S, until the error is below 14.0. Save the corresponding "Disparity" image as "2a.png". Press SSD, press Bilateral, and Find Best Disparity. Repeat this process, changing the values of Sigma S and Sigma I, until the error is below 13.0. Save the corresponding "Disparity" image as "2b.png".
Step 3: Implement k-means clustering. The function Segment has already been implemented, along with GridSegmentation that initializes the segmentation. Please examine Segment to see how the k-means clustering algorithm works by iteratively updating the segment means and assigning pixels to segments. You'll need to implement three helper functions:
1. ComputeSegmentMeans - Compute the mean position and color for each segment.
2. AssignPixelsToSegments - Assign each pixel to its closest segment. The distance between a pixel and a segment is measured in a 5-dimensional space: position (2) and color (3). Use the Mahalanobis distance (see Wikipedia or Rick's book) when finding the closest segment, with standard deviations of spatialSigma and colorSigma for position and color respectively.
3. SegmentAverageMatchCost - Given the computed segmentation, average the match costs within each segment. That is, each pixel in the segment should have the same match cost for each disparity.
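Two of the helpers above can be sketched as follows. With a diagonal covariance built from spatialSigma and colorSigma, the squared Mahalanobis distance reduces to the squared position difference divided by spatialSigma squared plus the squared color difference divided by colorSigma squared. The struct SegmentMean, both function names, and the per-pixel "segment" label array are hypothetical and chosen for illustration; the starter code's data structures may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one segment's mean position and color.
struct SegmentMean { double x, y, r, g, b; };

// Squared Mahalanobis distance in (x, y, r, g, b), assuming a diagonal
// covariance with standard deviation spatialSigma for position and
// colorSigma for color. Comparing squared distances picks the same segment.
double SquaredSegmentDistance(double x, double y, double r, double g, double b,
                              const SegmentMean& m,
                              double spatialSigma, double colorSigma) {
    assert(spatialSigma > 0.0 && colorSigma > 0.0);
    const double dx = x - m.x, dy = y - m.y;
    const double dr = r - m.r, dg = g - m.g, db = b - m.b;
    return (dx * dx + dy * dy) / (spatialSigma * spatialSigma)
         + (dr * dr + dg * dg + db * db) / (colorSigma * colorSigma);
}

// Replace each pixel's cost with the mean cost of its segment, per disparity.
// segment[p] is the segment label of pixel p, in [0, numSegments).
void SegmentAverageMatchCostSketch(const std::vector<int>& segment, int numSegments,
                                   std::vector<double>& matchCost,
                                   int w, int h, int numDisparities) {
    const std::size_t pixels = static_cast<std::size_t>(w) * h;
    assert(segment.size() == pixels);
    assert(matchCost.size() == pixels * numDisparities);

    // Pixel counts per segment do not depend on disparity, so compute them once.
    std::vector<int> count(numSegments, 0);
    for (std::size_t p = 0; p < pixels; ++p) {
        assert(segment[p] >= 0 && segment[p] < numSegments);
        ++count[segment[p]];
    }

    std::vector<double> sum(numSegments, 0.0);
    for (int k = 0; k < numDisparities; ++k) {
        sum.assign(numSegments, 0.0);
        for (std::size_t p = 0; p < pixels; ++p)
            sum[segment[p]] += matchCost[k * pixels + p];
        for (std::size_t p = 0; p < pixels; ++p)
            matchCost[k * pixels + p] = sum[segment[p]] / count[segment[p]];
    }
}
```

Every pixel that appears in "segment" contributes to its own segment's count, so the division is never by zero even if some segment labels end up unused.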
Required: Open "cones.txt". Press SSD, press Segment, and Find Best Disparity. Repeat this process, changing the values of Sigma Spatial, Sigma Color, and Grid Size, until the error is below 17.5. Save the corresponding "Disparity" image as "3a.png" and the "Segments" image as "3b.png".
Bells and Whistles (extra credit)
(Whistle = 1 point, Bell = 2 points)
Compute one of the 20 best disparities in the class on "cones.txt" as measured by the error score. Please briefly describe your approach, record your error score, and save the disparity and error image. Use any approach you please (in the literature or something you thought up), and implement using the "Magic Stereo" button.
Compute one of the 10 best disparities in the class on "cones.txt" as measured by the error score. Please briefly describe your approach, record your error score, and save the disparity and error image. Use any approach you please (in the literature or something you thought up), and implement using the "Magic Stereo" button.
Compute one of the 5 best disparities in the class on "cones.txt" as measured by the error score. Please briefly describe your approach, record your error score, and save the disparity and error image. Use any approach you please (in the literature or something you thought up), and implement using the "Magic Stereo" button.
Improve the "Render" function to use both image 1 and image 2 when rendering. That is, use image 2 to fill in the holes in the rendering of image 1. To do this you'll need to compute disparities for image 2 as well as image 1.