CSE 455, Autumn 2010

Homework 5:  Feature Detection and Matching

Date released:  Monday, Nov 8th, 2010

Date due: Wednesday, Nov 24th, 2010, 11:59pm

(Late policy: 5% off per day late, until Nov 28)

Synopsis

In this project, you will write code to detect discriminating features in an image and find the best matching features in other images.  You will use the Harris corner detector to find features and will try a color descriptor for matching.

To help you visualize the results and debug your program, we provide a working user interface that displays detected features and best matches in other images.  We also provide sample feature files that were generated using SIFT, the current best of breed technique in the vision community, for comparison. You are NOT expected to implement SIFT.

Extra Documentation

Since this assignment is somewhat complex, we are providing extra documentation and a working Windows executable from Alfred.

Description

The project has three parts. First, you will write the interest point detection function. Next you will form descriptor vectors for each of your interest points. Finally, using the skeleton code provided, you will use your feature detector/descriptor to find matches in an image database, and evaluate the results.

The Harris operator works with gray-scale images, so the input images will need to be converted. The skeleton code has the function ConvertToGrey in features.cpp to convert the images to grayscale. The following instructions are guidelines; you should feel free to experiment, too.
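For intuition, a grayscale conversion of the kind ConvertToGrey performs can be sketched as below. This is only an illustrative sketch: the Rec. 601 luminance weights and the function name rgbToGrey are assumptions here, and the skeleton's ConvertToGrey may differ in detail.

```cpp
#include <cassert>
#include <cmath>

// Common luminance-weighted grey conversion (Rec. 601 weights).
// The skeleton's ConvertToGrey may use different weights; this is
// only an illustrative sketch.
float rgbToGrey(float r, float g, float b) {
    return 0.299f * r + 0.587f * g + 0.114f * b;
}
```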

Interest point detection

In this step, you will identify points of interest in the image using the Harris corner detection method. 

  1. Convert the RGB image into a grey-level image.
  2. Apply the following 9 x 9 Gaussian filter to the whole image

    0.0008 0.0018 0.0034 0.0050 0.0056 0.0050 0.0034 0.0018 0.0008
    0.0018 0.0044 0.0082 0.0119 0.0135 0.0119 0.0082 0.0044 0.0018
    0.0034 0.0082 0.0153 0.0223 0.0253 0.0223 0.0153 0.0082 0.0034
    0.0050 0.0119 0.0223 0.0325 0.0368 0.0325 0.0223 0.0119 0.0050
    0.0056 0.0135 0.0253 0.0368 0.0417 0.0368 0.0253 0.0135 0.0056
    0.0050 0.0119 0.0223 0.0325 0.0368 0.0325 0.0223 0.0119 0.0050
    0.0034 0.0082 0.0153 0.0223 0.0253 0.0223 0.0153 0.0082 0.0034
    0.0018 0.0044 0.0082 0.0119 0.0135 0.0119 0.0082 0.0044 0.0018
    0.0008 0.0018 0.0034 0.0050 0.0056 0.0050 0.0034 0.0018 0.0008

    Calling the function ConvolveGaussian(greyImage, img_blurred, 2.0f) blurs greyImage with this mask, storing the result in img_blurred.

  3. For each point in the image, consider a 5 x 5 window W of pixels around that point. Compute the Harris matrix M for that point, defined as M =

    Σ I_x^2     Σ I_xI_y
    Σ I_xI_y    Σ I_y^2

    where the summations are over all pixels (x,y) in the window W. (See slide 18 of the Harris lecture.) To find interest points, compute the corner response R.

    R=det M - k(trace M)^2
    (Try k=0.05)

    Once you've computed R for every point in the image, choose points where R is above a threshold and is a local maximum in a 3x3 neighborhood. (Start with a threshold of 0.001, but this is a sensitive threshold, so you need to experiment.)
    You can test that your features are firing in sensible locations by loading the feature file generated by cse455 computeFeatures into the GUI (see the Testing section).
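The per-pixel computation in steps 2-3 can be sketched as follows. This is a sketch, not the skeleton's API: the function name harrisResponse, the row-major std::vector<float> image layout, and the central-difference gradients are all assumptions; your implementation should use the skeleton's own image classes and ConvolveGaussian.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sketch of the Harris response at one pixel. Assumptions (not from
// the skeleton code): the blurred grey image is stored row-major in a
// std::vector<float>, and gradients are central differences.
float harrisResponse(const std::vector<float>& img, int w, int h,
                     int px, int py, float k = 0.05f) {
    float sumIx2 = 0.0f, sumIy2 = 0.0f, sumIxIy = 0.0f;
    // Accumulate gradient products over the 5 x 5 window W around (px, py).
    for (int dy = -2; dy <= 2; ++dy) {
        for (int dx = -2; dx <= 2; ++dx) {
            int x = px + dx, y = py + dy;
            if (x < 1 || x >= w - 1 || y < 1 || y >= h - 1) continue;
            float Ix = 0.5f * (img[y * w + (x + 1)] - img[y * w + (x - 1)]);
            float Iy = 0.5f * (img[(y + 1) * w + x] - img[(y - 1) * w + x]);
            sumIx2  += Ix * Ix;
            sumIy2  += Iy * Iy;
            sumIxIy += Ix * Iy;
        }
    }
    // R = det M - k (trace M)^2
    float det   = sumIx2 * sumIy2 - sumIxIy * sumIxIy;
    float trace = sumIx2 + sumIy2;
    return det - k * trace * trace;
}
```

After computing R at every pixel this way, keep only the points where R exceeds the threshold and is a local maximum in its 3 x 3 neighborhood.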

Feature description

    Now that you've identified points of interest, the next step is to implement a descriptor for the feature centered at each interest point.  This descriptor will be the representation you'll use to compare features in different images to see if they match.

    1. For the feature description, please go back to RGB color space.
    2. Take a 45 x 45 window centered at a feature point.
    3. Divide the 45 x 45 window evenly into 9 x 9 = 81 squares of size 5 x 5. For every 5 x 5 square, apply the following 5 x 5 mask

      0.0369 0.0392 0.0400 0.0392 0.0369
      0.0392 0.0416 0.0424 0.0416 0.0392
      0.0400 0.0424 0.0433 0.0424 0.0400
      0.0392 0.0416 0.0424 0.0416 0.0392
      0.0369 0.0392 0.0400 0.0392 0.0369

      to each color channel. Summing all entries in this 5 x 5 square separately for each color channel yields a 3-dimensional vector. Concatenating the 81 such 3-dimensional vectors from all individual 5 x 5 squares in the whole 45 x 45 window gives a 9 x 9 x 3 = 243-dimensional vector. Now take that 243-dimensional vector and produce a 243-dimensional normalized feature vector, which is the original feature vector divided by the L2 norm of the original vector. The L2 norm of a vector x = (x1,...,xn) is the square root of the sum of the squares of its components, i.e. L2-norm = sqrt(Σ xi^2, i=1,...,n).

    4. Get feature vectors for all other feature points in the same way and write them to a data file. (Note: the skeleton code does the I/O for you.)
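The descriptor steps above can be sketched as follows, assuming the 45 x 45 RGB window has already been extracted into a flat row-major array with 3 channels per pixel. The function name buildDescriptor and the array layout are illustrative assumptions, not part of the skeleton code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sketch of descriptor construction. Assumptions: window holds the
// 45x45 RGB patch as window[(y*45 + x)*3 + c], and mask5 is the 5x5
// Gaussian mask from the handout stored row-major.
std::vector<float> buildDescriptor(const std::vector<float>& window,
                                   const float mask5[25]) {
    const int W = 45;
    std::vector<float> desc;
    desc.reserve(9 * 9 * 3);
    // Visit the 81 non-overlapping 5x5 squares.
    for (int sy = 0; sy < 9; ++sy) {
        for (int sx = 0; sx < 9; ++sx) {
            float sum[3] = {0.0f, 0.0f, 0.0f};
            for (int dy = 0; dy < 5; ++dy) {
                for (int dx = 0; dx < 5; ++dx) {
                    int x = sx * 5 + dx, y = sy * 5 + dy;
                    float wgt = mask5[dy * 5 + dx];
                    for (int c = 0; c < 3; ++c)
                        sum[c] += wgt * window[(y * W + x) * 3 + c];
                }
            }
            // One 3-dimensional vector per square, concatenated.
            for (int c = 0; c < 3; ++c) desc.push_back(sum[c]);
        }
    }
    // Divide by the L2 norm to get the normalized 243-dim vector.
    float norm2 = 0.0f;
    for (float v : desc) norm2 += v * v;
    float norm = std::sqrt(norm2);
    if (norm > 0.0f)
        for (float& v : desc) v /= norm;
    return desc;
}
```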

Feature matching

    Now that you've detected and described your features, the next step is to match them, i.e., given a feature in one image, find the best matching feature in one or more other images.

    Skeleton code has been provided that finds the SSD between all feature descriptors in a pair of images. The code declares a match between each feature and its best match (nearest neighbour) in the second image. We have also provided ground truth homographies between the images, so that you can test how well your matching is working using cse455 testMatch (see Testing section).
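SSD nearest-neighbour matching of the kind the skeleton performs can be sketched as below; the function name matchSSD and the vector-of-vectors layout are assumptions for illustration, not the skeleton's actual interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of SSD nearest-neighbour matching: for each descriptor in
// setA, find the index of the descriptor in setB with the smallest
// sum of squared differences. Assumes all descriptors have equal length.
std::vector<int> matchSSD(const std::vector<std::vector<float>>& setA,
                          const std::vector<std::vector<float>>& setB) {
    std::vector<int> matches(setA.size(), -1);
    for (std::size_t i = 0; i < setA.size(); ++i) {
        float bestDist = 1e30f;
        for (std::size_t j = 0; j < setB.size(); ++j) {
            float d = 0.0f;  // sum of squared differences
            for (std::size_t k = 0; k < setA[i].size(); ++k) {
                float diff = setA[i][k] - setB[j][k];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; matches[i] = (int)j; }
        }
    }
    return matches;
}
```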

Testing

    Now you're ready to go!  Using the UI and C++ skeleton code that we provide, you can load in a set of images, view the detected features, and visualize the feature matches that your algorithm computes.

    We are providing a set of benchmark images to be used to test the performance of your algorithm as a function of different types of controlled variation (i.e., rotation, scale, illumination, perspective, blurring).  For each of these images, we know the correct transformation and can therefore measure the accuracy of each of your feature matches.  This is done using a routine that we supply in the skeleton code.

    1. Download some image sets: leuven, bikes
      Included with these images are

      • SIFT feature files, with extension .key
      • database files used by the skeleton code, .kdb
      • homography files, containing the correct transformation between each pair of images, named HXtoYp, where X and Y are the indices of the images. For example, the transformation between img2.ppm and img3.ppm is found in H2to3p.

    2. Get access to FLTK. (Here are local copies for windows and linux, but see below about using it on linux.) **Disclaimer:** the skeleton code was designed for FLTK-1 and has not been tested with FLTK-2. The most reliable way to get FLTK working is to install the local copies rather than the version downloaded from the FLTK website. Make sure all include and library files are linked into your project. If you're using Linux on one of the CSE department machines, you don't need to download FLTK, since you can use the libraries already installed at the location specified in the Makefile. If you already have FLTK or are installing it in /usr/local, here are fltk installation instructions and an alternate Makefile for the skeleton code.

    3. Download the C++ skeleton code:
      For Visual Studio 2008
      For Linux

    After compiling and linking the skeleton code, you will have an executable cse455. This can be run in several ways:

    • cse455
      with no command line options starts the GUI. Inside the GUI, you can load a query image and its corresponding feature file, as well as an image database file, and search the database for the image that best matches the query features. You can use the mouse buttons to select a subset of the features to use in the query; to select a rectangular region, click and drag the right mouse button over some of the feature locations. If you want to test SIFT features, start by loading a query image ("File->Load Query Image", e.g. img1.ppm) and query features ("File->Load Query Features", e.g. img1.key). Next, load a feature database ("File->Load Image Database", e.g. img2.kdb). Finally, select a set of features and use "Image->Perform Query" to show the matches. Note that a database file is really just a link to an image and its corresponding features, i.e. img1.kdb contains something like "img1.ppm img1.key". If you want to test your own features, load feature1.f and img2simple.kdb instead of img1.key and img2.kdb. A graphical illustration of the UI can be found in the Extra PowerPoint Documentation.

    • cse455 computeFeatures imagefile featurefile [featuretype]
      uses your feature detection routine to compute the features for imagefile, and writes them to featurefile. featuretype can be used to specify different types of features to compute (optional). We suggest naming your feature files feature1.f, feature2.f, ..., feature6.f for the convenience of database queries; otherwise you will need to change the link in the corresponding database file.

    • cse455 testMatch featurefile1 featurefile2 homographyfile [matchtype]
      uses your feature matching routine to match the features in featurefile1 with the features in featurefile2. homographyfile contains the correct transformation between the points in the two images, specified by a 3-by-3 matrix. matchtype can be used to specify different matching algorithms to use (optional). This function returns the percentage of correct matches (number of inliers / number of feature matches).

    • cse455 testSIFTMatch featurefile1 featurefile2 homographyfile [matchtype]
      is the same as above, but uses the SIFT file format.

    • cse455 benchmark imagedir [featuretype matchtype]
      tests your feature finding and matching for all of the images in one of the four above sets. imagedir is the directory containing the image (and homography) files. This command will return the average percentage correct matches when matching the first image in the set with each of the other five images.
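To see how a match can be checked against a ground-truth homography, as testMatch does when counting inliers, here is a sketch: project the point from the first image through the 3-by-3 matrix H, divide by the homogeneous coordinate, and compare with the matched location in the second image. The row-major float H[9] layout and the function names are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>

// Apply a 3x3 homography (row-major H) to (x, y), producing (xp, yp)
// after the homogeneous divide.
void applyHomography(const float H[9], float x, float y,
                     float& xp, float& yp) {
    float w = H[6] * x + H[7] * y + H[8];
    xp = (H[0] * x + H[1] * y + H[2]) / w;
    yp = (H[3] * x + H[4] * y + H[5]) / w;
}

// A match (x1,y1) -> (x2,y2) counts as an inlier when the projected
// point lands within maxDist pixels of the matched location.
bool isInlier(const float H[9], float x1, float y1,
              float x2, float y2, float maxDist) {
    float xp, yp;
    applyHomography(H, x1, y1, xp, yp);
    float dx = xp - x2, dy = yp - y2;
    return std::sqrt(dx * dx + dy * dy) <= maxDist;
}
```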

To Do

    The only code you need to write is for your feature detection and description methods. You only need to modify the functions ComputeHarris() and ExtractDescriptor() at the end of the file "features.cpp". These functions are called from "dummyComputeFeatures()" in the same file.

Extra Credit

    • Try to develop a better descriptor that might select the best scale at an interest point, use multiple scales at once, or be invariant to rotations. To demonstrate this work, please also try the other two image sets: graf and wall.
    • Try to develop a better evaluation of the matching between feature points. You may declare no match found for a feature point if (score of the best feature match)/(score of the second-best feature match) is larger than a threshold. However, please make sure at least 20 matches are found between images for comparison. Also, even if you implement another evaluator, the evaluation function in the skeleton code should remain enabled in your program, and the matching results from the original evaluation function should be included in the report.
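The ratio test described in the second extra-credit item can be sketched as follows; the threshold value of 0.8 is an assumption to be tuned, and the function name acceptMatch is illustrative.

```cpp
#include <cassert>

// Ratio test: reject a match when the best SSD score is not clearly
// better than the second-best, i.e. accept only when
// best/secondBest <= ratioThreshold. The 0.8 default is an assumption.
bool acceptMatch(float bestScore, float secondBestScore,
                 float ratioThreshold = 0.8f) {
    if (secondBestScore <= 0.0f) return true;  // degenerate case
    return (bestScore / secondBestScore) <= ratioThreshold;
}
```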

What to Turn In

    Download the .doc template for the report here. You are free to use other text processing tools such as LaTeX; however, make sure your report has the same sections.
    The grading guidelines document for Homework 5 is here.

    In addition to your source code and executable, turn in a report describing your approach and results.  In particular:

    • Describe your feature descriptor in enough detail that someone could implement it from your write-up. If your descriptor is exactly the one given in the assignment, you may say so.
    • Explain why you made the design choices that you did.
    • Report the performance on the provided benchmark image sets. Show the data for each of the pairs. Result quality is expected to decline for higher image numbers; consider graphing this decline.
    • Compare the performance of the simple window descriptor and SIFT features (provided)
    • Describe strengths and weaknesses
    • Describe any extra credit items that you did (if applicable)

    This report can be a Word document or pdf document.

    Please keep your homework package at a reasonable size

    Your working folder for this project can be very large, so please don't send everything to me. Make sure that the Debug folders (including the one in the ImageLib folder), images, descriptor files, and cse455.ncb are not included. It is highly recommended that your code folder be smaller than 1 MB.

    Email your writeup and source code to Alfred (alfredg@cs.washington.edu) by Wednesday, Nov 24, 2010, 11:59pm. Zip up your report, source code and images into a file named "cse455_<your last name>.zip".