Vision Project 1: Feature Matching

Erik Andersen




Feature Descriptor

The images actually contain three color bands, so I created a more advanced feature descriptor simply by using all three bands. My descriptor thus became 5x5x3, with 75 data entries. This produced a significant improvement in matching accuracy.
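The descriptor extraction described above can be sketched as follows. This is a minimal illustration, not the actual project code; the function name is my own, and it assumes the image is an (H, W, 3) array and the keypoint lies at least two pixels from every border.

```python
import numpy as np

def descriptor_5x5x3(image, row, col):
    """Extract a 5x5 window around (row, col) from each of the
    image's three bands, giving a 5*5*3 = 75-entry descriptor."""
    window = image[row - 2:row + 3, col - 2:col + 3, :]
    return window.astype(np.float64).ravel()
```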

I also spent several hours trying to implement rotational invariance, which unfortunately performed worse than the simple 5x5 window. I have NOT included this component in the performance data presented below, but I will explain my process here. In the project, we used the Harris corner score as an approximation of the smaller eigenvalue of the second-moment matrix, so we never computed eigenvalues explicitly. To find the general orientation of a feature, I used the larger eigenvalue, which I computed directly from the second-moment matrix H = [[a, b], [b, c]]:

    lambda_max = (a + c)/2 + sqrt(((a - c)/2)^2 + b^2)

Then, from this eigenvalue, I constructed the corresponding eigenvector v by solving:

    (H - lambda_max * I) v = 0
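The eigenvalue and orientation computation can be sketched like this. The function name is hypothetical; it takes the three distinct entries a, b, c of the symmetric 2x2 second-moment matrix and returns the larger eigenvalue together with the angle of its eigenvector.

```python
import math

def dominant_orientation(a, b, c):
    """Larger eigenvalue of [[a, b], [b, c]] and the angle of the
    corresponding eigenvector."""
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    # Solve (H - lam*I) v = 0: one solution is v = (b, lam - a),
    # unless b == 0, in which case the eigenvectors are the axes.
    if b != 0:
        angle = math.atan2(lam - a, b)
    else:
        angle = 0.0 if a >= c else math.pi / 2
    return lam, angle
```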

At this point, I rotated my window by this orientation and resampled the image to fill the 5x5 window. Overall, this actually reduced accuracy. Looking at the features, I noticed that the same feature did not always get the same orientation in both images, even though the pixels in the vicinity appeared similar. I suspect that image noise and small errors in the exact position of the feature perturbed the computed orientation, and this was enough to cause features that would otherwise have matched correctly to be matched incorrectly.

Design Choices

I decided to use the rest of the information in the image (the other bands) because it was already there and could help the feature matching. I credit Richa Prasad with the original idea to use all of the bands. I decided not to use my algorithm for rotating the feature window because it was ineffective.

ROC Curves

ROC curves for Yosemite ("complex" is my feature descriptor)


ROC Curves for Graffiti


Harris Image for Yosemite
Harris Image for Graffiti
Average AUCs:
graf: 0.568
bikes: no score (the benchmark program crashed, in code that is not mine)
leuven: 0.524
wall: 0.553

Strengths and Weaknesses

My program matches features fairly well when the images differ only by a translation (Yosemite1.jpg, Yosemite2.jpg). However, my feature descriptor is not very robust to changes in scale and illumination. I also tried to make it rotation invariant, but my descriptor is still not able to deal with changes in rotation.

My own images

I tried this on some pictures I took of the quad while the cherry blossoms were blooming. The program was not able to match very well, but this is a hard case: the images differ in scale and illumination, and the scene is very self-similar in terms of features.

Image 1


Image 2

Image 1 features

Feature matching