Project 1: Feature Detection and Matching

Susumu Harada

Feature Descriptor

I implemented several descriptors to handle translation, rotation, and changes in illumination, but in the end only the translation descriptor proved reliable (the illumination descriptor is built into it). In the discussion that follows, I work with a grayscale version of the original image. The grayscale conversion uses the predefined method in Convert.cpp, although I had to change two places in cse576.cpp where the image object for the original photo is initialized, so that it uses 4 bands instead of 3 and ConvertToGray will accept it.

Design Choices

For the Harris corner detector, I used the corner response measure outlined in the handout (det(M)/Trace(M)) rather than the one in the paper (det(M)-k*Trace(M)^2), and this made quite a significant difference in the number of insignificant features that showed up. For thresholding the corner response measure, I settled on 0.003 after manually inspecting the values of the features I was seeing in various parts of the image (I changed this threshold quite a bit along the way as I modified my Gaussian and derivative filters). From an implementation perspective, I chose to use the provided image libraries, which allowed for cleaner code, although I lost time debugging the errors that were present.

I also applied a local maximum suppression step to keep the number of immediately adjacent features low. After collecting the features that pass the corner response threshold, I sort them in descending order of response and, starting from the top, accept a feature only if it is farther than 10 pixels from every feature already accepted.
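The suppression step described above can be sketched roughly as follows. The Feature struct and the function name are hypothetical, and the quadratic comparison against all accepted features is the simplest possible choice rather than necessarily what the project code does.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical feature record: pixel position plus corner response.
struct Feature {
    int x;
    int y;
    double response;
};

// Greedy suppression: visit candidates from strongest to weakest and keep
// one only if it is farther than minDist from every feature kept so far.
std::vector<Feature> suppressNearby(std::vector<Feature> candidates,
                                    double minDist) {
    std::sort(candidates.begin(), candidates.end(),
              [](const Feature& l, const Feature& r) {
                  return l.response > r.response;  // descending response
              });

    const double minDistSq = minDist * minDist;
    std::vector<Feature> accepted;
    for (const Feature& f : candidates) {
        bool farEnough = true;
        for (const Feature& g : accepted) {
            const double dx = static_cast<double>(f.x - g.x);
            const double dy = static_cast<double>(f.y - g.y);
            if (dx * dx + dy * dy <= minDistSq) {  // not farther than minDist
                farEnough = false;
                break;
            }
        }
        if (farEnough) {
            accepted.push_back(f);
        }
    }
    return accepted;
}
```

Calling suppressNearby(candidates, 10.0) corresponds to the 10 pixel separation used here. Because stronger features are visited first, a weak feature can never displace a stronger neighbor.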

It was interesting that binning the relative gradient orientation around a feature point led to considerably degraded performance. I expected it to perform equally well for handling translation, but the features were matching at quite a dispersed set of locations. I have not figured out whether this is simply due to an incorrect implementation, or whether something about the detected features makes those dispersed locations genuinely likely matches.
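For reference, one common way to bin gradient orientations relative to a reference direction is sketched below. This is a generic illustration under stated assumptions, not a reconstruction of the implementation discussed above: the Gradient struct, the function name, the magnitude weighting, and the choice of reference angle are all hypothetical.

```cpp
#include <cmath>
#include <vector>

// Hypothetical gradient sample taken from the window around a feature.
struct Gradient {
    double dx;
    double dy;
};

// Builds a histogram of gradient orientations measured relative to
// referenceAngle (radians). Each sample votes with its magnitude.
std::vector<double> orientationHistogram(const std::vector<Gradient>& window,
                                         double referenceAngle,
                                         int numBins) {
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<double> bins(numBins, 0.0);
    for (const Gradient& g : window) {
        const double magnitude = std::hypot(g.dx, g.dy);
        if (magnitude == 0.0) {
            continue;  // orientation is undefined for a zero gradient
        }
        // Shift by the reference direction, then wrap into [0, 2*pi).
        double angle = std::fmod(std::atan2(g.dy, g.dx) - referenceAngle, twoPi);
        if (angle < 0.0) {
            angle += twoPi;
        }
        int bin = static_cast<int>(angle / twoPi * numBins);
        if (bin >= numBins) {
            bin = numBins - 1;  // guard against rounding at exactly 2*pi
        }
        bins[bin] += magnitude;
    }
    return bins;
}
```

A histogram like this discards where in the window each gradient occurred, so visually different patches can produce similar histograms. That may be one reason matches end up dispersed, although it does not rule out an implementation bug.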

Performance

Below is a summary of the performance of the simple window descriptor, my own descriptor, and SIFT features. Surprisingly, the scores came out much worse than I had expected for SIFT, and also for my own descriptor, given that when interacting with the UI, features seem to get matched up much more accurately than with the simple window descriptor. (I could not compare against the graf data set since the benchmark returned infinity for both of my descriptors.) However, I am fairly certain that these numbers are not "correct" and that I have a bug somewhere in my code, as I have heard that SIFT should be on the order of 10 to 20 pixels of error.

Average Error (pixels)   Simple Window Descriptor   My Own   SIFT
graf                     293.9                      314.5    255.4
leuven                   376.7                      204.9    129.81
wall                     375.7                      232.8    268.8

Table 1: Benchmark results for the absolute thresholding feature matching routine.

Average Error (pixels)   Simple Window Descriptor   My Own   SIFT
graf                     308.7                      319.4    286.5
leuven                   417.9                      377.1    363.6
wall                     385.2                      336.2    348.9

Table 2: Benchmark results for the ratio thresholding feature matching routine.

Strengths and Weaknesses

My corner detector seems to work quite well, hitting almost all the corners and not detecting spurious features in flat regions or along straight edges. I started out using a simple [-1 0 1] filter for my derivatives, but found that it was picking up many diagonal edges. After I switched to the Sobel filter, these spurious edges went away.
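The difference between the two derivative filters can be sketched as below. The GrayImage struct and function names are hypothetical stand-ins for the provided image library, and the caller is assumed to keep x and y at least one pixel away from the image border.

```cpp
#include <vector>

// Hypothetical minimal grayscale image: row-major float pixels.
struct GrayImage {
    int width;
    int height;
    std::vector<float> pixels;  // size is width * height

    float at(int x, int y) const { return pixels[y * width + x]; }
};

// Simple [-1 0 1] horizontal derivative: uses only the current row.
float centralDiffX(const GrayImage& img, int x, int y) {
    return img.at(x + 1, y) - img.at(x - 1, y);
}

// Sobel horizontal derivative: the same difference taken over three rows,
// weighted [1 2 1] vertically, which smooths across the derivative direction.
float sobelX(const GrayImage& img, int x, int y) {
    return (img.at(x + 1, y - 1) - img.at(x - 1, y - 1))
         + 2.0f * (img.at(x + 1, y) - img.at(x - 1, y))
         + (img.at(x + 1, y + 1) - img.at(x - 1, y + 1));
}

// Sobel vertical derivative: the transpose of the kernel above.
float sobelY(const GrayImage& img, int x, int y) {
    return (img.at(x - 1, y + 1) - img.at(x - 1, y - 1))
         + 2.0f * (img.at(x, y + 1) - img.at(x, y - 1))
         + (img.at(x + 1, y + 1) - img.at(x + 1, y - 1));
}
```

The built-in smoothing of the Sobel kernel is one plausible reason it behaved better here than the bare [-1 0 1] filter, though the report does not isolate the cause.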

I have not tweaked the threshold for the feature matching based on the ratio of the best two matches.
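For context, the two matching routines benchmarked in Tables 1 and 2 can be sketched roughly as follows. The names, the use of squared Euclidean distance, and the rejection rules are assumptions made for illustration; maxDistance and maxRatio are the thresholds one would tune. All descriptors are assumed to have the same length.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;

// Sum of squared differences between two equal-length descriptors.
double squaredDistance(const Descriptor& a, const Descriptor& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return sum;
}

// Absolute thresholding: accept the nearest candidate if its distance is
// below maxDistance. Returns the candidate index, or -1 if rejected.
int matchByThreshold(const Descriptor& query,
                     const std::vector<Descriptor>& candidates,
                     double maxDistance) {
    double best = std::numeric_limits<double>::infinity();
    int bestIndex = -1;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const double d = squaredDistance(query, candidates[i]);
        if (d < best) {
            best = d;
            bestIndex = static_cast<int>(i);
        }
    }
    return best < maxDistance ? bestIndex : -1;
}

// Ratio thresholding: accept the nearest candidate only if it is clearly
// better than the runner-up, i.e. best < maxRatio * secondBest.
// Returns -1 when rejected or when fewer than two candidates exist.
int matchByRatio(const Descriptor& query,
                 const std::vector<Descriptor>& candidates,
                 double maxRatio) {
    if (candidates.size() < 2) {
        return -1;  // the ratio is undefined without a runner-up
    }
    double best = std::numeric_limits<double>::infinity();
    double second = std::numeric_limits<double>::infinity();
    int bestIndex = -1;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const double d = squaredDistance(query, candidates[i]);
        if (d < best) {
            second = best;
            best = d;
            bestIndex = static_cast<int>(i);
        } else if (d < second) {
            second = d;
        }
    }
    return best < maxRatio * second ? bestIndex : -1;
}
```

Lowering maxRatio rejects more ambiguous matches at the cost of fewer matches overall, which is the trade-off that tuning this threshold would explore.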

My Own Images

I was surprised that my features were able to match the rotated stadium fairly well, even though I do not have a robust method for handling rotation invariance. As expected, the area of tree leaves did not match very well. I had, however, expected the tree tops on the horizon to match better than they did.