CSE 455 Project 3a: Feature Detection and Matching

Assigned: Friday, February 12
Due: Friday, February 19 (1:30 PM)

In this assignment, you'll write code to detect discriminating features in an image and find the best matching features in other images.  In the second part of the project, you'll use the matched features to stitch images together into a panorama. The assignment has three main parts: feature detection, descriptor building, and matching.

 

Code to Write

1.      (15%) harrisResponse = computeHarrisResponse(I) 

 

The function takes as input a 3 channel image and computes the response of the Harris corner detector at each pixel. Recall that the Harris matrix at a pixel is defined as

H = \sum_{p} w(p) \begin{bmatrix} I_x^2(p) & I_x(p)\,I_y(p) \\ I_x(p)\,I_y(p) & I_y^2(p) \end{bmatrix}

where the summation is over all pixels p in a window centered around that pixel. The weights w(p) should be chosen to be circularly symmetric for rotation invariance. A common choice is a 5x5 Gaussian mask. Note that H is a 2x2 matrix. The Harris response is simply the result of the following operator (the “Harris operator”) at each pixel location:

f = \frac{\det(H)}{\operatorname{trace}(H)} = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2}

where \lambda_1 and \lambda_2 are the eigenvalues of H.

You may want to test on this checkerboard image.
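For concreteness, here is one way computeHarrisResponse could be structured in MATLAB. This is only a sketch: the sigma of the Gaussian, the 'replicate' boundary handling, and the det(H)/trace(H) form of the response are choices made in this sketch, and fspecial/imfilter come from the Image Processing Toolbox.

    function harrisResponse = computeHarrisResponse(I)
    % Sketch only: Gaussian sigma, boundary handling, and the det/trace
    % response are assumptions, not requirements of the assignment.
        gray = double(rgb2gray(I));              % work on intensity
        [Ix, Iy] = gradient(gray);               % image gradients
        w = fspecial('gaussian', 5, 1);          % 5x5 circularly symmetric weights
        Sxx = imfilter(Ix.*Ix, w, 'replicate');  % weighted sums of gradient
        Sxy = imfilter(Ix.*Iy, w, 'replicate');  % products: the entries of H
        Syy = imfilter(Iy.*Iy, w, 'replicate');  % at every pixel
        harrisResponse = (Sxx.*Syy - Sxy.^2) ./ (Sxx + Syy + eps);  % det(H)/trace(H)
    end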

 

2.      (10%) [rows,cols] = getFeatureLocs(harrisResponse, threshold, neighborhoodSize)

 

Given the Harris response image computed above, the function finds all local maxima, i.e., locations where the value is greater than all other values within a neighborhood of size (2*neighborhoodSize+1)x(2*neighborhoodSize+1) and is also above threshold. It returns the row and column indices of all such maxima.
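One possible sketch uses ordfilt2 (Image Processing Toolbox) as an n x n maximum filter; how you handle ties on flat plateaus and features near the image border is left to you:

    function [rows, cols] = getFeatureLocs(harrisResponse, threshold, neighborhoodSize)
    % Sketch only: ordfilt2 is one of several ways to find local maxima.
        n = 2*neighborhoodSize + 1;
        localMax = ordfilt2(harrisResponse, n*n, true(n));  % neighborhood maximum
        % A pixel is a feature if it equals its own neighborhood maximum
        % (i.e., it is the largest value in the window) and exceeds threshold.
        isPeak = (harrisResponse == localMax) & (harrisResponse > threshold);
        [rows, cols] = find(isPeak);
    end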

 

3.      (5%) orientations = getOrientation(I, rows, cols)

 

For each of the locations computed above (specified in rows and cols), the function computes the direction of the gradient (in radians). This is necessary in order to compute a rotation invariant descriptor.
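A minimal sketch; whether to smooth the image before taking gradients (often a good idea) is up to you:

    function orientations = getOrientation(I, rows, cols)
    % Sketch only: gradients of the unsmoothed grayscale image are an assumption.
        gray = double(rgb2gray(I));
        [Ix, Iy] = gradient(gray);
        idx = sub2ind(size(gray), rows, cols);   % linear indices of the features
        orientations = atan2(Iy(idx), Ix(idx));  % gradient direction in radians
    end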

 

4.      (15%) patch = getRotatedPatch(im,row,col,orientation,halfWidth)

 

The function takes in the original 3 channel image, a single pixel location specified by (row,col), and the orientation at that pixel, and returns a 3 channel patch of size (2*halfWidth+1) x (2*halfWidth+1) centered around that pixel location and rotated by -orientation, i.e., the function compensates for rotation and will help you build a rotation invariant descriptor in the next step. Use the MATLAB function interp2 to make your task easier.
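Here is a sketch of the interp2 approach. The sampling grid is rotated by +orientation in image coordinates, so the resulting patch is the image content rotated by -orientation; bilinear interpolation and zero-filling outside the image are choices made in this sketch:

    function patch = getRotatedPatch(im, row, col, orientation, halfWidth)
    % Sketch only: interpolation method and out-of-image fill value are assumptions.
        [dx, dy] = meshgrid(-halfWidth:halfWidth);   % patch coordinates
        c = cos(orientation); s = sin(orientation);
        X = col + c*dx - s*dy;                       % rotate the sampling grid by
        Y = row + s*dx + c*dy;                       % +orientation about (row, col)
        patch = zeros(2*halfWidth+1, 2*halfWidth+1, 3);
        for ch = 1:3                                 % sample each color channel
            patch(:,:,ch) = interp2(double(im(:,:,ch)), X, Y, 'linear', 0);
        end
    end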

 

5.      (20%) desc = getDescriptors(I,rows,cols,orientations)

 

The function takes in the original 3 channel image I and the locations and orientations of the feature points, and returns a matrix in which each column is the descriptor computed at the corresponding feature point location. Use the previous function to get a rotated patch centered on each feature point location. Come up with your own descriptor and decide on its invariance properties. It’s up to you to decide the length of the feature vector, whether you use color or grayscale information, whether you smooth the patch for robustness, etc. A very simple (and bad) choice of descriptor is to simply list the grayscale pixel values in the rotated patch, as in the sketch below.
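Here is that baseline descriptor as a sketch, with bias/gain normalization added for some illumination invariance; the halfWidth of 8 and the normalization are assumptions of this sketch, not part of the assignment:

    function desc = getDescriptors(I, rows, cols, orientations)
    % Sketch only: the simple grayscale-patch baseline mentioned above.
        halfWidth = 8;                               % assumed patch radius
        desc = zeros((2*halfWidth+1)^2, numel(rows));
        for k = 1:numel(rows)
            patch = getRotatedPatch(I, rows(k), cols(k), orientations(k), halfWidth);
            v = mean(patch, 3);                      % simple grayscale conversion
            v = v(:);
            desc(:,k) = (v - mean(v)) / (std(v) + eps);  % bias/gain normalization
        end
    end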

 

6.      (15%) [match_idx, match_dist] = getMatches(desc1, desc2, methodType)

 

The function takes in two sets of feature descriptors and matches them. If methodType == 1, it uses the simple Euclidean distance between the feature vectors; otherwise it uses the ratio test, i.e., the ratio of the distance to the nearest neighbor to the distance to the second-nearest neighbor.

match_idx has the same length as the number of features in desc1 (the number of columns of desc1) and stores, for each feature vector in desc1, the index of the nearest feature vector in desc2. match_dist is of the same size as match_idx and contains the corresponding nearest distances (or ratios) found.
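A brute-force sketch, assuming desc2 has at least two columns so the ratio test is defined:

    function [match_idx, match_dist] = getMatches(desc1, desc2, methodType)
    % Sketch only: brute-force nearest neighbor; a k-d tree would be faster.
        n1 = size(desc1, 2);
        match_idx = zeros(1, n1);
        match_dist = zeros(1, n1);
        for k = 1:n1
            % Euclidean distance from feature k to every feature in desc2
            d = sqrt(sum((desc2 - repmat(desc1(:,k), 1, size(desc2,2))).^2, 1));
            [dSorted, order] = sort(d);
            match_idx(k) = order(1);
            if methodType == 1
                match_dist(k) = dSorted(1);          % nearest-neighbor distance
            else
                match_dist(k) = dSorted(1) / (dSorted(2) + eps);  % ratio test
            end
        end
    end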

 

Testing

 

1.      Download the example images here:  graf, yosemite, leuven, bikes, wall. The datasets also contain homographies between pairs of images, i.e., given a feature point location in one image, you can apply the homography to the pixel location to get the corresponding location in the other image (see the sketch after this list). This allows you to verify the feature matches between a pair of images. You can use this simple function to draw and display the matches. The correct matches are drawn in green while the incorrect ones are drawn in red. Here are the matches on the first two images from the graf dataset using a simple patch based descriptor. You should be able to get similar results, if not better. And here are the matches on the same pair of images using David Lowe’s SIFT implementation (yes, all those matches are correct; SIFT is that good!). You can download David Lowe’s SIFT implementation along with some MATLAB wrappers here. It only works on Windows/Linux and requires your getMatches function.

2.      Feature descriptors are often evaluated using ROC curves. Consider the match_dist array returned by the getMatches function. Typically, one chooses a threshold and considers all matches below that threshold as valid matches. True Positives (TP) refers to the number of correct matches and False Positives (FP) to the number of incorrect matches within the set of valid matches. As the threshold varies, the numbers of TPs and FPs vary with it. A desirable characteristic of a good descriptor and matcher is to have more TPs and fewer FPs. One can hence evaluate a descriptor by plotting how TP varies with FP. For plotting, one typically uses the TP rate and FP rate, obtained by dividing the absolute numbers by the total number of correct matches and the total number of incorrect matches, respectively. The higher the TP vs. FP curve, the better the descriptor + matcher. ROC curves are also helpful in choosing the right value for the threshold (see the sketch below).
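Applying a homography to a pixel location, as mentioned in item 1, works in homogeneous coordinates. A sketch, where H3x3 stands for a hypothetical 3x3 homography matrix loaded from the dataset and x is assumed to be the column coordinate:

    % Sketch only: map pixel (col, row) in image 1 to image 2 using H3x3,
    % a hypothetical name for the 3x3 homography from the dataset.
    p = H3x3 * [col; row; 1];        % lift to homogeneous coordinates
    col2 = p(1) / p(3);              % divide by the third coordinate
    row2 = p(2) / p(3);              % to get back to pixel coordinates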

 

You can use this function to plot ROC curves for your descriptor (It returns FP and TP rates for different thresholds). Here are some ROC curves for the first two images in the graf dataset.
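In case it helps your understanding, here is a sketch of how those rates could be computed from match_dist, given a (hypothetical) logical vector isCorrect, obtained from the ground-truth homographies, that marks which matches are correct; the provided plotting function may differ in its details:

    % Sketch only: isCorrect is a hypothetical logical vector marking
    % which entries of match_dist are correct matches (from the homographies).
    thresholds = sort(match_dist);
    nPos = sum(isCorrect);                    % total number of correct matches
    nNeg = sum(~isCorrect);                   % total number of incorrect matches
    tpRate = zeros(size(thresholds));
    fpRate = zeros(size(thresholds));
    for k = 1:numel(thresholds)
        valid = match_dist <= thresholds(k);  % matches accepted at this threshold
        tpRate(k) = sum(valid & isCorrect) / nPos;
        fpRate(k) = sum(valid & ~isCorrect) / nNeg;
    end
    plot(fpRate, tpRate); xlabel('FP rate'); ylabel('TP rate');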

 

WriteUp (20%)

1.      Who are the members in your group and how did you divide the work?

2.      Describe your choice of descriptor and its invariance properties.

3.      Show the Harris response images for the checkerboard image and the first image in the graf dataset.

4.      Show matching results for at least three pairs of images. Include a result where your descriptor does not work well and explain why.

5.      Plot ROC curves for the first two images in the graf dataset using the two different matching methods (Euclidean distance and the ratio test). On the same graph, also plot the ROC curves corresponding to the SIFT descriptor. Do the same for the two images from the Yosemite dataset as well.

 

Extra Credit

·        (*) Implement sub-pixel refinement of feature point locations (MOPS paper).

·        (*) Implement adaptive non-maximum suppression (MOPS paper).

·        (**) Make your feature detector and descriptor scale invariant.

·        (Variable) Use a fast algorithm to speed up the matching process. You can use code from the web or write your own (with extra credit proportional to effort). It is possible to call C/C++ code from MATLAB. For more information, see here. Some possibilities in rough order of increasing difficulty: k-d trees, wavelet indexing (MOPS), locality-sensitive hashing.

 

Submission

To turn in your project, submit a single zip file to the dropbox at:

https://catalysttools.washington.edu/collectit/dropbox/iansimon/8903

The zip file should be named project3a.zip and contain all of your M-files and your assignment writeup as an HTML, PDF, or Word document. Make sure you include all the images you used for testing (other than the provided datasets), your matching results, and your ROC plots. Make sure any images that need it are normalized so they can be seen clearly. Finally, make sure to fully document any extra credit in your writeup.