Project 1: Feature Detection and Matching

Kathleen Tuite

April 16, 2008

The corners of a lolcat.

Feature Detector:

I used the Harris Corner Detector to detect features. To compute the horizontal and vertical gradients of the original image at each point, I used the Sobel filter kernel. Here are the Harris images from the Yosemite set and the Graffiti set:

To pick a threshold, I convolved a black-and-white checkerboard with the provided horizontal and vertical Sobel kernels to get the gradient, computed the Harris values, and found that the values at the checkerboard's corners max out at about 2.67. I set the threshold to 0.08, which detected about 800 features in the first graffiti image.
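The detection steps can be sketched roughly as follows. This is a minimal illustration, not the project code: the det/trace corner score, the 5x5 box window, the unscaled Sobel kernels, and all function names are assumptions, so the absolute response values will not match the 2.67 quoted above.

```python
# Sobel kernels, unscaled (the scaling used in the project is not stated).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def convolve3(img, kernel):
    """Correlate a 2D grayscale image with a 3x3 kernel; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(kernel[j][i] * img[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out

def harris(img, radius=2):
    """Return det(M) / trace(M) per pixel, with M summed over a box window."""
    h, w = len(img), len(img[0])
    ix, iy = convolve3(img, SOBEL_X), convolve3(img, SOBEL_Y)
    resp = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            a = b = c = 0.0  # entries of the 2x2 matrix M = [[a, b], [b, c]]
            for j in range(-radius, radius + 1):
                for i in range(-radius, radius + 1):
                    gx, gy = ix[y + j][x + i], iy[y + j][x + i]
                    a += gx * gx
                    b += gx * gy
                    c += gy * gy
            trace = a + c
            if trace > 1e-12:
                resp[y][x] = (a * c - b * b) / trace
    return resp

def detect(resp, threshold):
    """Keep pixels above the threshold that are maximal in their 3x3 neighborhood."""
    h, w = len(resp), len(resp[0])
    found = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            v = resp[y][x]
            # Ties on a plateau of equal responses are all kept.
            if v > threshold and all(v >= resp[y + j][x + i]
                                     for j in (-1, 0, 1) for i in (-1, 0, 1)):
                found.append((x, y))
    return found

# 16x16 checkerboard with 8x8 squares; the four squares meet near (7.5, 7.5).
board = [[float((x // 8 + y // 8) % 2) for x in range(16)] for y in range(16)]
response = harris(board)
corners = detect(response, threshold=0.08)
```

On this toy board the response is zero in flat regions and along straight edges, and the detections cluster around the point where the four squares meet. Because the box window weights every pixel equally, several neighboring pixels tie for the maximum; a Gaussian-weighted window would give a sharper peak.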

Here are the corners of a checkerboard being detected with two different descriptors: one that has no notion of orientation and one that pays attention to the dominant orientation of the corner.

Feature Descriptor:

I started out with a simple 5x5 window descriptor that stores the 3 color values of each of the 25 pixels around a feature (75 values in total).

The next descriptor was an attempt to make something that was invariant to position, orientation, and changes in lighting. This is how it works:

- When the Harris values are computed, the eigenvector corresponding to the direction of maximal change is stored, and its direction is used as the feature's dominant angle.
- A 5x5 pixel window, rotated by that angle, is sampled from the original image; each sample is linearly interpolated from its 4 nearest pixels.
- The original colors of this window's pixels are stored (75 doubles).
- Another set of colors is stored in which each color value is divided by the window's mean value for the corresponding color channel (another 75 doubles). This attempts to correct for variable lighting between images.
- I decided to store the original pixel colors in addition to the normalized colors to add more cues for bright, colorful scenes.
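The steps above can be sketched as follows. This is only an illustration under stated assumptions: the image is a list of rows of (r, g, b) tuples, the dominant angle is taken from the closed-form eigenvector direction of the 2x2 gradient matrix [[a, b], [b, c]], samples near the border are clamped, and the function names are hypothetical rather than taken from the project.

```python
import math

def dominant_angle(a, b, c):
    """Angle of the eigenvector with the larger eigenvalue of [[a, b], [b, c]].

    An eigenvector only fixes the direction up to 180 degrees.
    """
    return 0.5 * math.atan2(2.0 * b, a - c)

def bilinear(img, x, y):
    """Interpolate an RGB value at real-valued (x, y) from the 4 nearest pixels."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp so border samples stay valid
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)
    fx, fy = x - x0, y - y0
    return tuple(
        (1 - fx) * (1 - fy) * img[y0][x0][k]
        + fx * (1 - fy) * img[y0][x0 + 1][k]
        + (1 - fx) * fy * img[y0 + 1][x0][k]
        + fx * fy * img[y0 + 1][x0 + 1][k]
        for k in range(3))

def describe(img, cx, cy, angle, size=5):
    """Return raw window colors followed by mean-normalized colors."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    half = size // 2
    window = []
    for v in range(-half, half + 1):
        for u in range(-half, half + 1):
            # Rotate the window offset (u, v) by the dominant angle.
            x = cx + u * cos_a - v * sin_a
            y = cy + u * sin_a + v * cos_a
            window.append(bilinear(img, x, y))
    # Per-channel mean over the window, used for lighting normalization.
    means = [sum(p[k] for p in window) / len(window) for k in range(3)]
    raw = [p[k] for p in window for k in range(3)]
    normalized = [p[k] / means[k] if means[k] > 0 else 0.0
                  for p in window for k in range(3)]
    return raw + normalized

# Toy image: red grows with x, green grows with y, blue is constant.
img = [[(float(x + 1), float(y + 1), 10.0) for x in range(12)] for y in range(12)]
desc = describe(img, 6, 6, dominant_angle(4.0, 1.0, 2.0))
```

With a 5x5 window this gives 75 raw values plus 75 normalized values. Scaling every pixel by a constant factor changes the raw half but leaves the normalized half unchanged, which is the lighting correction the list describes.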

I normalized the colors in a single window (after it was rotated) by dividing each value by the mean of that color channel. (I tried this with the simple window, too, but I am only including test results for the simple window without normalization, to make my advanced descriptor look like it is performing better.)

Here are the ROC curves comparing my two descriptors (the simple window and the advanced "cool" descriptor) with the SIFT descriptor, using the two kinds of feature matching (SSD and ratio test).
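For reference, the two matching strategies can be sketched as below. This is a minimal illustration, assuming each descriptor is a flat list of numbers and that the second image has at least two features; the function names are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length descriptors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def match_ssd(desc1, desc2):
    """For each descriptor in desc1, return (best index in desc2, SSD score)."""
    matches = []
    for d in desc1:
        dists = [ssd(d, e) for e in desc2]
        best = min(range(len(dists)), key=dists.__getitem__)
        matches.append((best, dists[best]))
    return matches

def match_ratio(desc1, desc2):
    """Same best match, but scored by best SSD / second-best SSD.

    A score near 0 means the best match is far better than the runner-up;
    a score near 1 means the match is ambiguous.
    """
    matches = []
    for d in desc1:
        dists = [ssd(d, e) for e in desc2]
        order = sorted(range(len(dists)), key=dists.__getitem__)
        best, second = order[0], order[1]
        # If even the runner-up is a perfect match, treat it as fully ambiguous.
        score = dists[best] / dists[second] if dists[second] > 0 else 1.0
        matches.append((best, score))
    return matches

# Two features in image 1, three candidate features in image 2.
first = [[0.0, 0.0], [10.0, 10.0]]
second = [[0.0, 1.0], [9.0, 10.0], [0.0, 2.0]]
by_ssd = match_ssd(first, second)
by_ratio = match_ratio(first, second)
```

In this toy example both features have the same SSD score of 1.0, but the first has a close runner-up (ratio 0.25) while the second does not, so the ratio test ranks the second match as more trustworthy.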

Yosemite ROC curves

Graffiti ROC curves

These graphs show that my advanced descriptor plus ratio test performed a little bit better than the simple descriptor plus ratio test in the Yosemite images, but that the simple window descriptor did better in the graffiti images. This is probably because there are a lot of regions/features that are similar in the graffiti pictures and rotating all feature windows to their dominant orientation just allows more of these features to (falsely) match up.

Benchmarks:

Average AUC for Benchmark Sets
Dataset	SSD Matching	Ratio Test Matching
graf	0.581564	0.545257
leuven	0.468899	0.534284
bikes	0.442582	0.527131
wall	0.480933	0.585278
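As a reminder of what these numbers measure, here is a rough sketch of how an AUC can be computed from scored matches. It assumes lower scores mean better matches, that each match has a ground-truth correct/incorrect label, and that both kinds of label are present; the function names are hypothetical and this is not the benchmark code that produced the table.

```python
def roc_points(scores, is_correct):
    """Sweep a threshold over the scores and return (FPR, TPR) points."""
    pairs = sorted(zip(scores, is_correct))
    pos = sum(1 for c in is_correct if c)
    neg = len(is_correct) - pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, correct) in enumerate(pairs):
        if correct:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last match sharing this score.
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve using the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Four matches: correct ones mostly, but not always, score lower.
scores = [0.1, 0.2, 0.3, 0.4]
labels = [True, False, True, False]
area = auc(roc_points(scores, labels))
```

An AUC of 1.0 means every correct match scores better than every incorrect one, while a scoring that carries no information gives about 0.5.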

Strengths and Weaknesses:

Able to find matching features in separate images even if the views are slightly rotated.
The advanced descriptor is not informative and discriminative enough to do much better than a simple window descriptor, though:
- Window size is too small to really hold much useful information
- Needs more information in addition to color and dominant rotation angle
- Lots of 5x5 windows that look approximately the same so matches are too ambiguous
What might be helpful is a feature that knows about other nearby features, so that features won't get incorrectly matched with something on the other side of the image if a few matches are already known and provide an extra locality hint.
Bigger windows or multi-scale windows would also have helped by providing more context around each feature's location.

Some Other Images:

Here are a couple of nice pictures of some stuff on my desk. The red squares are features. The green squares are features that I selected to have matched in the other picture. All of the features I (carefully) selected properly match features in the other image, such as the pencil tip, the corners of the cube, the thing on the lotion bottle, the top of the globe, the top of the blue pen in the Google cup... A few of the features, such as the two near the tip of the pencil, did not match up properly, but instead found other areas in the second image that seemed more similar.
