Jeffrey Bigham

Project 1

Write-Up

 

 

 

Approach

 

Feature Detector –

The feature detector I used is the standard Harris corner detector described on the project webpage.
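
For concreteness, a minimal sketch of the Harris response computation is below (Python/NumPy rather than the actual project code; the window sigma and the constant k are common textbook defaults, not necessarily the values I used):

    import numpy as np
    from scipy import ndimage

    def harris_response(img, sigma=1.0, k=0.04):
        # img: grayscale image as a float array
        Ix = ndimage.sobel(img, axis=1)   # horizontal gradient
        Iy = ndimage.sobel(img, axis=0)   # vertical gradient
        # Entries of the structure tensor, smoothed over the window
        Sxx = ndimage.gaussian_filter(Ix * Ix, sigma)
        Syy = ndimage.gaussian_filter(Iy * Iy, sigma)
        Sxy = ndimage.gaussian_filter(Ix * Iy, sigma)
        # Harris corner strength: det(H) - k * trace(H)^2
        return Sxx * Syy - Sxy * Sxy - k * (Sxx + Syy) ** 2

Feature points are then taken as local maxima of this response above a threshold.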

 

Feature Descriptor –

I chose to base my feature descriptor closely on the Lowe feature descriptor described in “Distinctive Image Features from Scale-Invariant Keypoints” by David Lowe, with a few added features of my own choosing.  All of the implementation details for the Lowe-based part, minus the scaling strategy, are listed in his paper; the scaling that I used is discussed in the Extra Credit section below.  The extra features that I added simply measure the average strength of each color channel, normalized by the total.
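
A sketch of what that color feature amounts to (the 3-channel patch layout is an assumption for illustration; the window size is whatever the descriptor window is):

    import numpy as np

    def color_ratio_descriptor(patch):
        # patch: an H x W x 3 float window around the feature point.
        # Average strength of each color channel, normalized by the total.
        means = patch.reshape(-1, 3).mean(axis=0)
        return means / means.sum()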

 

In summary, the Lowe feature descriptor summarizes the histogrammed orientations of the gradients in a 16x16 window around the feature point.  To make the features rotation invariant, an overall orientation is assigned; multiple feature points are created if multiple bins in the orientation histogram are within 80% of the peak value.  The descriptor itself consists of a 4x4 grid of histogrammed local orientation information split into 8 bins, with each sample contributing both to the bin containing its gradient orientation and a scaled amount to the neighboring bins.  This summarizes the information while minimizing the bin-edge effects that would otherwise occur.
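
The core binning step looks roughly like the sketch below (illustrative Python; it omits the rotation to the assigned orientation, interpolation between spatial cells, and the Gaussian spatial weighting discussed in the next section; mag and ori are assumed to be precomputed gradient magnitude and orientation images):

    import numpy as np

    def descriptor_histogram(mag, ori, y, x):
        hist = np.zeros((4, 4, 8))          # 4x4 grid of cells, 8 orientation bins
        for dy in range(-8, 8):
            for dx in range(-8, 8):
                cy, cx = (dy + 8) // 4, (dx + 8) // 4    # which spatial cell
                theta = ori[y + dy, x + dx] % (2 * np.pi)
                b = theta / (2 * np.pi) * 8              # fractional orientation bin
                b0, frac = int(b) % 8, b - int(b)
                m = mag[y + dy, x + dx]
                # Each sample contributes to its own bin and, scaled by
                # frac, to the neighboring bin, smoothing bin-edge effects.
                hist[cy, cx, b0] += m * (1 - frac)
                hist[cy, cx, (b0 + 1) % 8] += m * frac
        vec = hist.ravel()                  # 128-dimensional descriptor
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec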

 

Major Design Decisions –

 

Other than the choice of feature descriptor described above, the major design decision I faced was how to tune the parameters.  This seemed like a black art to me: it was obvious that my program was not going to work if I didn’t tune it correctly, and, more troubling, the program didn’t work well with the parameter values suggested directly in the Lowe paper.  For instance, for the second round of binning used in constructing the feature descriptor, the Lowe paper suggests weighting each sample’s magnitude by a Gaussian with a standard deviation of half the overall window size, which in this case would be 8, whereas I found much better performance with a standard deviation of 1.5.
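
Concretely, the weight in question is a Gaussian falloff applied to each sample’s magnitude before binning; a sketch (the function name is mine):

    import numpy as np

    def spatial_weight(dy, dx, sigma=1.5):
        # Lowe suggests sigma = 8 (half the 16-pixel window);
        # sigma = 1.5 gave much better matches in my tests.
        return np.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))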

 

 

Results

          Benchmark Tests –

           

 

Average error in pixels on each benchmark (lower is better):

Benchmark     Score Threshold     Score Ratio Threshold
Bikes              174.06                411.68
Graf               339.44                334.08
Leuven              79.94                375.20
Wall               342.36                389.09

 

I assumed that the ratio-based threshold was supposed to perform better, but I didn’t find that here: it did worse on every benchmark except Graf, where the two were nearly even.

 

Feature Descriptor Comparison –

 

I compared the feature descriptors on the Leuven benchmark because it was the one on which my implementation performed best.

 

Simple Window:   352.98
Mine:             79.94
SIFT:                 ?

 

Bunny Picture

 

This is a picture of a bunny that I took recently at Green Lake.  My feature detector found all sorts of interesting points…unfortunately, only a few were actually on the bunny.

 

 

Conclusions

 

          Strengths and Weaknesses –

 

It’s hard for me to get a good sense of what level of accuracy counts as good performance.  The method seems to work fairly well, but the numbers on the benchmarks are troubling; then again, the benchmarks are most likely designed to be rather difficult.  In general, the method I implemented should be fairly tolerant of scale, rotation, and illumination changes, but it doesn’t seem to be here.  Possible reasons include that I wasn’t able to tune the parameters well enough, or that there are bugs in my program.  I used test images with known derivatives and known rotations, and my methods seemed to work there, but that performance didn’t extend to these more complicated examples.

 

 

Extra Credit

 

Scale Invariance –

I implemented scale invariance, accessible via featuretype number 3.  This runs my algorithm on the image Gaussian-blurred at each of four levels, calculates all features and descriptors with respect to that scale, and then naively pools all of the features obtained in this way.  This improved performance on the bikes benchmark from an average error of around 350 pixels to the current level of 174.
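
In outline (the specific blur levels and function names here are illustrative, not my exact code):

    from scipy import ndimage

    def multiscale_features(img, detect_and_describe, sigmas=(1, 2, 4, 8)):
        # Run the single-scale detector/descriptor at four blur levels
        # and naively pool everything that comes back.
        features = []
        for s in sigmas:
            blurred = ndimage.gaussian_filter(img, s)
            features.extend(detect_and_describe(blurred, scale=s))
        return features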

 

 

Fast Search Algorithm –

Mostly because I was getting tired of waiting for my program to finish while tuning my parameters, I implemented a faster search using the k-d tree code from the project webpage.  Versions of my normal and ratio matches using this faster search are implemented as matches

 

The speedup over the exhaustive brute-force Euclidean distance search was modest but measurable.  The library allows you to specify an error term, and I wonder whether increasing it would have produced more speedup.
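
The idea, sketched here with SciPy’s k-d tree rather than the library from the project webpage (eps plays the role of that error term, trading exactness for speed):

    import numpy as np
    from scipy.spatial import cKDTree

    def kdtree_ratio_match(desc1, desc2, ratio=0.8, eps=0.0):
        # Index the second image's descriptors once, then query the two
        # nearest neighbors of each descriptor from the first image.
        tree = cKDTree(desc2)
        dists, idx = tree.query(desc1, k=2, eps=eps)
        matches = []
        for i in range(len(desc1)):
            if dists[i, 0] < ratio * dists[i, 1]:   # ratio test
                matches.append((i, idx[i, 0]))
        return matches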

 

Valid Match Test –

I used two different feature descriptors in combination and computed my final match score as a simple weighted average of their costs, giving my color ratio descriptor one sixteenth the weight of the orientation/bin approach.  The way I chose the weights was pretty ad hoc, but I could imagine building a more sophisticated system that, given some training data, would adjust the weights pseudo-autonomously and outperform my hand-tuning.
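
The combination itself is trivial (hypothetical function name; normalizing by the weight sum is my reading of “weighted average”):

    def combined_cost(orient_cost, color_cost, w_color=1.0 / 16.0):
        # The color ratio descriptor gets one sixteenth the weight of
        # the orientation/bin descriptor's cost.
        return (orient_cost + w_color * color_cost) / (1.0 + w_color)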