CSE 576: Computer Vision

Project 1: Feature Detection and Matching

Dates

Assigned: Wednesday, April 3, 2013
Due: Thursday, April 18, 2013 (11:59pm)

Quick Links

Comparative class results (AUC rankings) for the project

Synopsis

In this project, you will write code to detect discriminating features in an image and find the best matching features in other images. Your features should be reasonably invariant to translation, rotation, illumination, and scale, and you'll evaluate their performance on a suite of benchmark images. We'll rank the performance of features that students in the class come up with, and compare them with the current state-of-the-art.

In project 2, you will apply your features to automatically stitch images into a panorama.

To help you visualize the results and debug your program, we provide a working user interface that displays detected features and best matches in other images. We also provide sample feature files that were generated using SIFT, the current best of breed technique in the vision community, for comparison.

Description

The project has three parts: feature detection, description, and matching.

Feature detection

In this step, you will identify points of interest in the image using the Harris corner detection method. The steps are as follows (see the lecture slides/readings for more details). For each point in the image, consider a window of pixels around that point. Compute the Harris matrix H for that point, defined as

$$H = \sum_{p} w_p \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

where the summation is over all pixels p in the window, I_x and I_y are the image derivatives in x and y at p, and w_p is the weight given to pixel p. The weights should be chosen to be circularly symmetric (for rotation invariance). A common choice is to use a 3x3 or 5x5 Gaussian mask. Note that these weights were not discussed in the lecture slides, but you should use them for your computation.

Note that H is a 2x2 matrix. To find interest points, first compute the corner strength function

Once you've computed c for every point in the image, choose points where c is above a threshold. You also want c to be a local maximum in at least a 3x3 neighborhood.
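The detection steps above can be sketched roughly as follows. This is only an illustration under several assumptions: the image is a plain row-major array of floats (the hypothetical GrayImage struct, not the skeleton's image class), derivatives are central differences, the weights are a 3x3 Gaussian mask, and the corner strength is taken to be c = det(H) / trace(H). That formula is an assumption on our part; use whatever definition of c the lecture material gives. The names harrisStrength and isCorner are made up for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical grayscale image: row-major floats (not the skeleton's image class).
struct GrayImage {
    int width = 0, height = 0;
    std::vector<float> data;
    float at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }
};

// Corner strength for every pixel, assuming c = det(H) / trace(H).
// Pixels within 2 of the border are left at 0 to keep the sketch short.
std::vector<float> harrisStrength(const GrayImage& img) {
    // 3x3 Gaussian mask (circularly symmetric weights, sums to 1).
    static const float w[3][3] = {{1 / 16.f, 2 / 16.f, 1 / 16.f},
                                  {2 / 16.f, 4 / 16.f, 2 / 16.f},
                                  {1 / 16.f, 2 / 16.f, 1 / 16.f}};
    std::vector<float> c(img.data.size(), 0.f);
    for (int y = 2; y < img.height - 2; ++y) {
        for (int x = 2; x < img.width - 2; ++x) {
            float a = 0.f, b = 0.f, d = 0.f;  // H = [a b; b d]
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    int px = x + dx, py = y + dy;
                    // Central-difference derivatives at the window pixel p.
                    float ix = 0.5f * (img.at(px + 1, py) - img.at(px - 1, py));
                    float iy = 0.5f * (img.at(px, py + 1) - img.at(px, py - 1));
                    float wt = w[dy + 1][dx + 1];
                    a += wt * ix * ix;
                    b += wt * ix * iy;
                    d += wt * iy * iy;
                }
            }
            float trace = a + d;
            if (trace > 1e-6f) c[y * img.width + x] = (a * d - b * b) / trace;
        }
    }
    return c;
}

// True if c at (x, y) is above the threshold and is a strict maximum
// of its 3x3 neighborhood.
bool isCorner(const std::vector<float>& c, int width, int height,
              int x, int y, float threshold) {
    if (x < 1 || y < 1 || x >= width - 1 || y >= height - 1) return false;
    float v = c[y * width + x];
    if (v <= threshold) return false;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            if ((dx != 0 || dy != 0) && c[(y + dy) * width + (x + dx)] >= v) return false;
    return true;
}
```

On a flat region or along a straight edge this strength is zero, since H is then rank deficient; it is positive only where the gradient varies in two directions.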

Feature description

Now that you've identified points of interest, the next step is to come up with a descriptor for the feature centered at each interest point. This descriptor will be the representation you'll use to compare features in different images to see if they match.

For starters, try using a small square window (say 5x5) as the feature descriptor. This should be very easy to implement and should work well when the images you're comparing are related by a translation.
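A minimal sketch of such a window descriptor is shown here. It assumes the image is available as a row-major array of grayscale floats; the function name windowDescriptor and its parameters are hypothetical and do not match the skeleton code's interfaces.

```cpp
#include <vector>

// Copies the 5x5 patch of intensities centered at (cx, cy) into a
// 25-element descriptor, row by row. Returns an empty vector if the
// window would extend past the image border.
std::vector<float> windowDescriptor(const std::vector<float>& gray,
                                    int width, int height, int cx, int cy) {
    const int r = 2;  // half-width of a 5x5 window
    std::vector<float> desc;
    if (cx < r || cy < r || cx >= width - r || cy >= height - r) return desc;
    desc.reserve(25);
    for (int dy = -r; dy <= r; ++dy)
        for (int dx = -r; dx <= r; ++dx)
            desc.push_back(gray[(cy + dy) * width + (cx + dx)]);
    return desc;
}
```

Because the patch is copied as is, this descriptor changes under rotation, scaling, and brightness changes, which is why it is only expected to work for translated images.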

Next, try implementing a better feature descriptor. You can define it however you want, but you should design it to be robust to changes in position, orientation, and illumination. You are welcome to use techniques described in lecture (e.g., detecting dominant orientations, using image pyramids), or come up with your own ideas.
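As one illustration of robustness to illumination, a descriptor can be normalized to zero mean and unit variance, which cancels any brightness change of the form I' = a*I + b with a > 0. The sketch below shows just this one technique, not a complete descriptor design; the function name normalizeDescriptor is hypothetical.

```cpp
#include <cmath>
#include <vector>

// Normalizes a descriptor in place to zero mean and unit standard deviation.
// A constant (zero-variance) descriptor is set to all zeros.
void normalizeDescriptor(std::vector<float>& desc) {
    if (desc.empty()) return;
    const float n = static_cast<float>(desc.size());
    float mean = 0.f;
    for (float v : desc) mean += v;
    mean /= n;
    float var = 0.f;
    for (float v : desc) var += (v - mean) * (v - mean);
    const float stddev = std::sqrt(var / n);
    for (float& v : desc) v = (stddev > 1e-6f) ? (v - mean) / stddev : 0.f;
}
```

After normalization, a patch and a brighter, higher-contrast copy of the same patch produce (up to rounding) the same descriptor.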

Feature matching

Now that you've detected and described your features, the next step is to write code to match them, i.e., given a feature in one image, find the best matching feature in one or more other images. This part of the feature detection and matching component is mainly designed to help you test out your feature descriptor. You will implement a more sophisticated feature matching mechanism in the second component when you do the actual image alignment for the panorama.

The simplest approach is the following: write a procedure that compares two features and outputs a distance between them. For example, you could simply sum the absolute value of differences between the descriptor elements. You could then use this distance to compute the best match between a feature in one image and the set of features in another image by finding the one with the smallest distance. Two possible distances are:

1. Use a threshold on the match score. This is called the SSD distance, and is implemented for you as match type 1, in the function ssdMatchFeatures.
2. Compute (score of the best feature match)/(score of the second best feature match). This is called the "ratio test"; you must implement this distance.
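The two distances can be sketched as follows. This is an illustration only: the Descriptor alias, the RatioMatch struct, and the function names are hypothetical and are not the signatures used by ssdMatchFeatures or ratioMatchFeatures in the skeleton code.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;

// Sum of squared differences between two descriptors of equal length.
float ssdDistance(const Descriptor& a, const Descriptor& b) {
    float sum = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

struct RatioMatch {
    int bestIndex = -1;  // index of the closest candidate, -1 if there is none
    float ratio = 1.f;   // best SSD / second-best SSD (smaller = more distinctive)
};

// Finds the closest candidate to the query and scores the match with the
// ratio test. With fewer than two candidates the ratio stays at 1.
RatioMatch ratioTestMatch(const Descriptor& query,
                          const std::vector<Descriptor>& candidates) {
    const float inf = std::numeric_limits<float>::infinity();
    RatioMatch m;
    float best = inf, second = inf;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        float d = ssdDistance(query, candidates[i]);
        if (d < best) {
            second = best;
            best = d;
            m.bestIndex = static_cast<int>(i);
        } else if (d < second) {
            second = d;
        }
    }
    if (m.bestIndex >= 0 && second < inf && second > 0.f) m.ratio = best / second;
    return m;
}
```

A ratio near 1 means the best and second-best candidates are about equally close, so the match is ambiguous; a small ratio means the best match stands out. Thresholding the ratio rather than the raw SSD is what distinguishes the second distance from the first.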

Testing

Now you're ready to go! Using the UI and skeleton code that we provide, you can load in a set of images, view the detected features, and visualize the feature matches that your algorithm computes.

We are providing a set of benchmark images to be used to test the performance of your algorithm as a function of different types of controlled variation (i.e., rotation, scale, illumination, perspective, blurring). For each of these images, we know the correct transformation and can therefore measure the accuracy of each of your feature matches. This is done using a routine that we supply in the skeleton code.
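To make the idea concrete, the accuracy of a single match can be measured by mapping the feature position in the first image through the known transformation and comparing it with the matched position in the second image. The sketch below assumes the transformation is a 3x3 homography stored in row-major order; the function name matchPixelError is hypothetical, and the routine supplied in the skeleton code may differ in its details.

```cpp
#include <array>
#include <cmath>

// h: 3x3 homography (row-major) mapping image-1 pixels to image-2 pixels.
// Returns the distance in pixels between the projection of (x1, y1)
// and the matched feature position (x2, y2).
double matchPixelError(const std::array<double, 9>& h,
                       double x1, double y1, double x2, double y2) {
    double w = h[6] * x1 + h[7] * y1 + h[8];          // homogeneous scale
    double px = (h[0] * x1 + h[1] * y1 + h[2]) / w;   // projected x
    double py = (h[3] * x1 + h[4] * y1 + h[5]) / w;   // projected y
    return std::hypot(px - x2, py - y2);
}
```

For a pure translation by (5, -3), the point (10, 10) projects to (15, 7), so a match at (18, 11) has an error of 5 pixels.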

You should also go out and take some photos of your own to see how well your approach works on more interesting data sets. For example, you could take images of a few different objects (e.g., books, offices, buildings, etc.) and see if it can "recognize" new images.

Project package

Download the complete project package here.

The package includes the following components:

FeaturesSkel: The skeleton code that you will work on. We support Windows (load using Visual Studio) and Linux platforms (use the provided Makefile). If you are interested in a Mac patch, you can download Rich MacDonald's patch from a previous course offering.
ImageSets/ROC: A couple of image sets - graf and yosemite. Included with these images are some SIFT feature files and image database files.
ImageSets/Benchmark: Four image sets for benchmark testing - graf, leuven, bikes and wall. Included with these images are some SIFT feature files and image database files.
FeaturesSampleSoln.exe: A sample solution executable for 64-bit Windows.
FeaturesSampleSoln32: A sample solution executable for 32-bit Linux.
FeaturesSampleSoln64: A sample solution executable for 64-bit Linux.
GNUscripts: A couple of scripts - plot.roc.txt and plot.threshold.txt - that you will use to plot ROC curves. Usage is described later on this page.

After compiling and linking the skeleton code, you will have an executable named Features. This can be run in several ways:

Features
Running with no command line options starts the GUI.

Inside the GUI, you can load a query image and its corresponding feature file, as well as an image database file, and search the database for the image which best matches the query features.
SIFT feature files are named as *.key and the corresponding database files are named as *.kdb. These can be found in the respective image set folders.
Your feature files should be named as *.f and the corresponding database files should be named as *.db. The format of the database files is the same as that of SIFT database files. You can use a text editor to view and edit these files.
You can use the mouse buttons to select a subset of the features to use in the query.
Until you write your feature matching routine, the features are matched by minimizing the SSD distance between feature vectors.

Features computeFeatures imagefile featurefile [featuretype]
Uses your feature detection routine to compute the features for imagefile, and writes them to featurefile.

featurefile should be named as *.f.
featuretype specifies which of your types of features (if you choose to implement another feature for extra credit) to compute. Default value is 1 (corresponds to the dummy feature descriptor given to you in the code which does not do anything meaningful). So in all your tests, this value should ideally be greater than 1 and should correspond to the function calls in the function computeFeatures in features.cpp.

Features matchFeatures featurefile1 featurefile2 threshold matchfile [matchtype]
takes in two sets of features, featurefile1 and featurefile2 and matches them using your matching routine.

The matching routine to use is selected by [matchtype]; 1 (SSD) is the default.
The threshold for determining which matches to keep is given by threshold. The results are written to a file, matchfile, which can later be read by the Panorama program.

Features matchSIFTFeatures featurefile1 featurefile2 threshold matchfile [matchtype]
same as above, but uses SIFT feature files (*.key).
Features roc featurefile1 featurefile2 homographyfile [matchtype] rocfile aucfile
creates the points necessary for the ROC curve for the feature you implement and writes it to rocfile. You will use these values to create plots for your ROC curves. It also computes the area under the ROC curve (AUC) and writes it to aucfile.

featurefile1 and featurefile2 correspond to your feature (named as *.f).
The matching routine to use is selected by [matchtype]; 1 (SSD) is the default.
homographyfile is found in the respective image set folders. Note that the order of featurefiles is important. If you use the homography file - H1to3p - then, featurefile1 should be for image1 and featurefile2 should be for image3 in the image set.

Features rocSIFT featurefile1 featurefile2 homographyfile [matchtype] rocfile aucfile
is the same as above, but uses the SIFT features (*.key).
Features benchmark imagedir [featuretype matchtype]
tests your feature finding and matching for all of the images in one of the four above sets. This command will return the average pixel error and average AUC when matching the first image in the set with each of the other five images.

imagedir is the directory containing the image (and homography) files.
The default value of both featuretype and matchtype is 1.
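For reference, the area under an ROC curve such as the one reported by the roc and benchmark commands can be approximated from the curve's points with the trapezoid rule. The sketch below is a generic illustration operating on in-memory points, assuming they are sorted by increasing false positive rate; it does not read rocfile and is not necessarily how the skeleton code computes the AUC. The name areaUnderCurve is hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// points: (false positive rate, true positive rate) pairs, sorted by
// increasing false positive rate. Returns the trapezoid-rule area.
double areaUnderCurve(const std::vector<std::pair<double, double>>& points) {
    double area = 0.0;
    for (std::size_t i = 1; i < points.size(); ++i) {
        double dx = points[i].first - points[i - 1].first;
        area += 0.5 * dx * (points[i].second + points[i - 1].second);
    }
    return area;
}
```

A matcher that is no better than chance follows the diagonal from (0, 0) to (1, 1) and scores 0.5, while a perfect matcher scores 1.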

To Do

We have given you a number of classes and methods to help get you started. The only code you need to write is for your feature detection methods and your feature matching methods, all in features.cpp. You should modify computeFeatures and matchFeatures in the file features.cpp to call the methods you have written. We have provided a function dummyComputeFeatures that shows how to create the code to detect and describe features, as well as integrate it into the system. The function ssdMatchFeatures implements a feature matcher which uses the SSD distance, and demonstrates how a matching function should be implemented. The function ComputeHarrisFeatures is the main function you will complete, along with the helper functions computeHarrisValues and computeLocalMaxima. You will also implement the function ratioMatchFeatures for matching features using the ratio test.

You will also need to generate plots of the ROC curves and report the areas under the ROC curves (AUC) for your feature detecting and matching code (using the 'roc' option of Features.exe), and for SIFT. For both the Yosemite test images (Yosemite1.jpg and Yosemite2.jpg) and the graf test images (img1.ppm and img2.ppm), create a plot with six curves: two using the simple window descriptor and your own feature descriptor with the SSD distance; two using the same two descriptors with the ratio test distance; and two using SIFT (with both the SSD and ratio test distances; these curves are provided to you in the Project Package, under ImageSets/ROC/).

We have provided scripts for creating these plots using the 'gnuplot' tool. Gnuplot is installed on the lab machines, at 'C:\Program Files\gnuplot\binaries\wgnuplot.exe' ('gnuplot' is also available on 'attu', the CSE instructional machine), or you can download a copy of Gnuplot to your own machine. To generate a plot with gnuplot (using a gnuplot script 'script.txt'), simply run 'C:\Program Files\gnuplot\binaries\wgnuplot.exe script.txt', and gnuplot will output an image containing the plot. The two scripts we provide are:

plot.roc.txt: plots the ROC curves for the SSD distance and the ratio test distance. These assume the two roc datafiles are called 'roc1.txt' (for the SSD distance), and 'roc2.txt' (for the ratio test distance). You will need to edit this script if your files are named differently. This script also assumes 'roc1.sift.txt' and 'roc2.sift.txt' are in the current directory (these files are provided in the zip files above). This script generates an image named 'plot.roc.png'. Again, to generate a plot with this script, simply enter 'C:\Program Files\gnuplot\binaries\wgnuplot.exe plot.roc.txt'.

plot.threshold.txt: plots the threshold on the x-axis and 'TP rate - FP rate' on the y-axis. The maximum of this function represents a point where the true positive rate is large relative to the false positive rate, and could be a good threshold to pick for the computeMatches step. This script generates an image named 'plot.threshold.png'.
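The 'TP rate - FP rate' criterion can also be evaluated directly instead of being read off the plot. The sketch below assumes you already have the threshold and the two rates for each ROC point in memory; the RocPoint struct and the bestThreshold function are hypothetical and say nothing about the layout of the ROC data files.

```cpp
#include <vector>

// One point on the ROC curve (hypothetical in-memory layout).
struct RocPoint {
    double threshold;
    double fpRate;
    double tpRate;
};

// Returns the threshold that maximizes (TP rate - FP rate).
// Returns 0 if there are no points.
double bestThreshold(const std::vector<RocPoint>& points) {
    double best = 0.0;
    double bestGap = -2.0;  // smaller than any possible TP - FP value
    for (const RocPoint& p : points) {
        double gap = p.tpRate - p.fpRate;
        if (gap > bestGap) {
            bestGap = gap;
            best = p.threshold;
        }
    }
    return best;
}
```

If several thresholds tie for the largest gap, this sketch keeps the first one it encounters.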

Finally, you will need to report the average AUC for your feature detecting and matching code (using the 'benchmark' option of Features.exe) on the four benchmark sets: graf, leuven, bikes and wall.

What to Turn In

Please organize your submission in the following folder structure.

<Your_Name>                                                 [This is the top-level folder]
<Your_Name> => Source                              [Place the source code in this subfolder]
<Your_Name> => Executable                          [Windows/Linux executable]
<Your_Name> => Artifact
<Your_Name> => Artifact => index.html          [Writeup about the project, see below for details of what to write here.]
<Your_Name> => Artifact => images/             [Place all your images used in the webpage here.]

In the artifact webpage,

Describe your feature descriptor in enough detail that someone could implement it from your write-up.
Explain why you made the major design choices that you did.
Report the performance (i.e., the ROC curve and AUC) on the provided benchmark image sets.
- You will need to compute two sets of 6 ROC curves and post them on your web page as described in the To Do section above. You can learn more about ROC curves from the class slides and on the web here and here.
- For one image each in both the Yosemite and graf test pairs, please include an image of the Harris operator on your webpage. This image is produced by the Features.exe executable every time it is run in computeFeatures mode (it is saved to the file harris.tga). If these images seem too dark, you can multiply the Harris score by a constant, such as 2, to make the scores more visible.
- Report the average AUC for your feature detecting (both simple 5x5 window descriptor and your own new descriptor) and matching code (both SSD and ratio tests) on four benchmark sets -- graf, leuven, bikes and wall.
Describe strengths and weaknesses of the algorithms that you implemented. Please elaborate how these vary with the different image datasets.
Take some images yourself and show the performance (include some pictures on your web page!)
Describe any extra credit items that you did (if applicable) with their sample results.

We'll tabulate the best performing features and present them to the class.

If you are unfamiliar with HTML you can use a simple webpage editor like NVU to make your web-page. Here are some tips.

How to Turn In

Create a zip archive of your submission folder - <Your_Name>.zip - and place it in the Catalyst submission dropbox before the due date listed above (Thursday, April 18, 11:59pm).

Extra Credit

Here is a list of suggestions for extending the program for extra credit. You are encouraged to come up with your own extensions as well!

IMPORTANT: After implementing the whistles, the code's executable should still be usable in the above-mentioned usage modes. Please provide information about any additional usage modes in a separate README file in your submission folder.

Implement sub-pixel refinement of pixel positions. (MOPS paper)
Implement adaptive non-maximum suppression. (MOPS paper)
Make your feature more contrast invariant. This was discussed in lecture.
Make your feature detector scale invariant.
Implement a method that outperforms the above ratio test for deciding if a feature is a valid match.
Use a fast search algorithm to speed up the matching process. You can use code from the web or write your own (with extra credit proportional to effort). Some possibilities in rough order of difficulty: k-d trees (code available here), wavelet indexing (approach from lecture), locality-sensitive hashing.
Try implementing a better feature descriptor. You can define it however you want, but you should design it to be robust to changes in position, orientation, and illumination. You are welcome to use techniques described in lecture (e.g., detecting dominant orientations, using image pyramids), or come up with your own ideas. For this extra credit you'll need to compare it with the other features using the following function:

Features benchmark imagedir [featuretype matchtype]
tests your feature finding and matching for all of the images in one of the four benchmark sets. The param imagedir is the directory containing the image (and homography) files. This command will return the average pixel error when matching the first image in the set with each of the other five images. This will be used for the extra credit if you choose to do that.
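As a starting point for the adaptive non-maximum suppression item in the list above, here is a rough sketch of the idea from the MOPS paper: each corner gets a suppression radius equal to its distance to the nearest corner that is significantly stronger, and the corners with the largest radii are kept. The Corner struct, the function name, and the default robustness factor of 0.9 are choices made for this sketch; check the paper for the details, and note that this brute-force version takes time quadratic in the number of corners.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical corner record: position and Harris strength.
struct Corner {
    float x, y, strength;
};

// Keeps up to `keep` corners, preferring those that are far from any
// significantly stronger corner (strength_i < robust * strength_j).
std::vector<Corner> adaptiveNonMaxSuppression(const std::vector<Corner>& corners,
                                              std::size_t keep,
                                              float robust = 0.9f) {
    const float inf = std::numeric_limits<float>::infinity();
    std::vector<std::pair<float, std::size_t>> radius;  // (squared radius, index)
    radius.reserve(corners.size());
    for (std::size_t i = 0; i < corners.size(); ++i) {
        float r2 = inf;  // the globally strongest corners keep an infinite radius
        for (std::size_t j = 0; j < corners.size(); ++j) {
            if (corners[i].strength < robust * corners[j].strength) {
                float dx = corners[i].x - corners[j].x;
                float dy = corners[i].y - corners[j].y;
                r2 = std::min(r2, dx * dx + dy * dy);
            }
        }
        radius.emplace_back(r2, i);
    }
    // Largest suppression radius first.
    std::sort(radius.begin(), radius.end(),
              [](const std::pair<float, std::size_t>& a,
                 const std::pair<float, std::size_t>& b) { return a.first > b.first; });
    std::vector<Corner> kept;
    for (std::size_t k = 0; k < std::min(keep, radius.size()); ++k)
        kept.push_back(corners[radius[k].second]);
    return kept;
}
```

Compared with simply keeping the strongest corners, this spreads the selected features more evenly over the image, since a strong corner sitting right next to an even stronger one gets a small radius.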

Computer Vision

CSE 576, Spring 2013