CSE 576 Computer Vision, Spring 2005

Project 1: Feature Detection and Matching

Enrique Larios Delgado



Feature Description:

My code uses the Harris corner/edge detector as its feature detector. It is based on the method described in [Harris 88]. I set the threshold so that mostly corners were selected, since corners are considered points where image information accumulates. My detection algorithm visits every pixel in the image, calculating the gradient magnitude and angle as well as the differences in x and y. The differences are used to compute the corner response proposed by Harris. Pixels whose corner response exceeds the threshold and that are local maxima in their 3x3 neighborhood are selected as feature points.
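The project's actual source is not reproduced here, but the detection step described above might be sketched in Python roughly as follows. The box-filter window, the constant k = 0.04, and all function names are illustrative assumptions, not taken from my implementation:

```python
import numpy as np

def harris_response(gray, k=0.04):
    """Harris corner response for a grayscale float image (k = 0.04 assumed)."""
    # Differences in y and x (np.gradient returns the axis-0 derivative first)
    Iy, Ix = np.gradient(gray)

    # Sum products of derivatives over a 3x3 window (a simple box filter
    # stands in here for the Gaussian window of the original method)
    def box(img, r=1):
        out = np.zeros_like(img)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                out += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
        return out

    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)

    # Harris corner response: det(M) - k * trace(M)^2
    det = Sxx * Syy - Sxy * Sxy
    trace = Sxx + Syy
    return det - k * trace * trace

def detect_corners(gray, threshold):
    """Keep pixels above threshold that are local maxima in a 3x3 neighborhood."""
    R = harris_response(gray)
    corners = []
    H, W = R.shape
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            if R[y, x] > threshold and R[y, x] == R[y-1:y+2, x-1:x+2].max():
                corners.append((x, y))
    return corners
```

On a synthetic white square, the edges produce a negative response (one gradient direction is zero), so only the four corners survive the threshold and non-maximum suppression.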

The detection process is applied to the first three levels of the image pyramid in order to make the feature detector more scale invariant. My descriptor design is based on the scale-invariant keypoints described in [Lowe 04]. The descriptor implementation starts by computing the dominant orientation of the region surrounding the keypoint. The gradient magnitudes in the 7x7 area around the keypoint are accumulated into a 36-bin histogram according to the angle of the gradient. The magnitudes are weighted by a Gaussian function centered on the keypoint in order to give more importance to the gradients closer to it. The bin with the highest magnitude is used as the base to compute the orientation of the region. The magnitudes of the two neighboring bins are used in a linear interpolation to obtain the final value of the main orientation angle. As recommended by Lowe, if the magnitude of the second greatest bin is 80% or more of the main orientation's, then my algorithm generates another keypoint with this second orientation.
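As an illustration of the orientation step, a minimal Python sketch of the 36-bin weighted histogram and peak selection might look like the following. The Gaussian sigma, the function name, and the parabolic fit through a peak and its two neighboring bins (a common way to refine the peak; the report interpolates with the neighboring bins similarly) are all assumptions for this sketch:

```python
import numpy as np

def dominant_orientations(mag, ang, peak_ratio=0.8, nbins=36):
    """Dominant orientation(s) of a 7x7 patch from a 36-bin gradient histogram.

    mag, ang: 7x7 arrays of gradient magnitudes and angles (radians) around
    the keypoint. Returns one angle per peak within 80% of the strongest,
    so a strong second peak yields a second orientation (and keypoint).
    """
    # Gaussian weight centered on the keypoint (sigma = 2.0 is an assumption)
    yy, xx = np.mgrid[-3:4, -3:4]
    weight = np.exp(-(xx**2 + yy**2) / (2 * 2.0**2))

    # Accumulate weighted magnitudes into 36 orientation bins
    hist = np.zeros(nbins)
    bins = np.floor((ang % (2*np.pi)) / (2*np.pi) * nbins).astype(int) % nbins
    for b, w in zip(bins.ravel(), (mag * weight).ravel()):
        hist[b] += w

    peaks = []
    top = hist.max()
    for b in range(nbins):
        l, r = hist[(b - 1) % nbins], hist[(b + 1) % nbins]
        if hist[b] >= peak_ratio * top and hist[b] > l and hist[b] > r:
            # Refine the peak using the two neighboring bins
            denom = l - 2*hist[b] + r
            offset = 0.5 * (l - r) / denom if denom != 0 else 0.0
            peaks.append((b + 0.5 + offset) * 2*np.pi / nbins)
    return peaks
```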

Using the main orientation of the neighborhood, the algorithm normalizes all the gradient angles of the 7x7 patch in order to achieve rotation invariance. With the new angles, the bin assignment is recomputed. My descriptor is a 36-dimensional vector that is normalized to increase its robustness to contrast changes, with the bin assignment normalized to the main orientation for rotation invariance.
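A sketch of this final descriptor step, under the same assumptions as above (the function name and signature are hypothetical):

```python
import numpy as np

def rotation_invariant_descriptor(mag, ang, main_angle, nbins=36):
    """36-D descriptor: gradient angles normalized to the patch's dominant
    orientation, re-binned, then L2-normalized for contrast robustness."""
    # Rotate all angles into a canonical frame relative to the main orientation
    rel = (ang - main_angle) % (2 * np.pi)
    bins = np.floor(rel / (2*np.pi) * nbins).astype(int) % nbins

    desc = np.zeros(nbins)
    for b, m in zip(bins.ravel(), mag.ravel()):
        desc[b] += m

    # Unit length makes the descriptor robust to (multiplicative) contrast change
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

A rotated patch (all angles shifted together with the main orientation) and a contrast-scaled patch both map to the same descriptor, which is the invariance the design aims for.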


Design Choices:


My design choices are as follows. With the image pyramid I hoped to achieve invariance to changes of scale. The bin assignment has the main purpose of providing rotation invariance, although it also gives some resistance to affine transformations, because a shift of the pixels in the neighborhood of the keypoint still contributes to the same bin.


Report:

Bikes

Pair      Simple Window               My Design                   Provided SIFT
1 to 2    307.210302                  400.089969                  7.281360
1 to 3    379.670285                  425.163391                  13.653063
1 to 4    No keypoints detected in 4  No keypoints detected in 4  19.995964
1 to 5    No keypoints detected in 5  No keypoints detected in 5  34.570192
1 to 6    No keypoints detected in 6  No keypoints detected in 6  49.073032


Graf

Pair      Simple Window   My Design    Provided SIFT
1 to 2    280.019288      351.501793   14.593259
1 to 3    229.225710      307.363170   32.300461
1 to 4    284.752243      341.761410   140.227971
1 to 5    283.424408      318.527942   293.978437
1 to 6    330.458009      378.423513   326.708168



Leuven

Pair      Simple Window   My Design    Provided SIFT
1 to 2    76.069583       327.631952   9.831576
1 to 3    285.942611      345.994421   8.202653
1 to 4    289.529478      367.727299   13.443062
1 to 5    373.714004      368.069801   12.893896
1 to 6    357.200646      378.820748   18.085878




Wall

Pair      Simple Window   My Design    Provided SIFT
1 to 2    132.360895      345.078615   4.627598
1 to 3    192.859104      335.677877   2.744426
1 to 4    243.411028      360.679550   13.156131
1 to 5    333.306296      350.012036   53.908964
1 to 6    409.619672      375.960471   332.456770


Strengths and Weaknesses:

It is easy to see from my results that the simple window descriptor performs better than my feature descriptor. I think this happens because my descriptor's vector lacks more dimensions. In particular, I attribute the matching failures to the missing spatial location information that is contained in Lowe's descriptor. I must point out that, in light of the poor results of my descriptor, I tried to implement the full-fledged SIFT features as described in [Lowe 04], but the lack of a detailed description of SIFT's implementation also worked against its successful implementation. However, I left that implementation commented out in my source code.
Although the low scores in the tests could also be the result of a bug, I thoroughly reviewed my code, which is why I attribute them to the design of my descriptor.


Performance Demonstration:

Here I show the performance of my operator. Although it did not reach very good performance, it still found the corner of the sweater even with the change of scale and translation.
Note: This is not a picture of myself. Right now I don't have a digital camera.





Here I show some results of my detection:





Extra Credit:

My feature detector is run over the pyramid of images in order to achieve scale invariance. I must point out that I have both versions in my code: a scale-invariant version of my descriptor and the window descriptor, and a simple one that only uses the original image, without the image pyramid. I did this because the image pyramid code still has some glitches; it tended to fail with big images.
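The pyramid construction itself can be sketched as follows. This is an illustrative Python version, not the project's code; the 2x2-average downsampling stands in for whatever blur-and-subsample the actual implementation uses:

```python
import numpy as np

def build_pyramid(gray, levels=3):
    """Image pyramid: each level halves the previous one.

    A 2x2 average is used as a simple stand-in for Gaussian blur + subsample.
    The detector is run on the first `levels` levels (3 in the report).
    """
    pyr = [np.asarray(gray, dtype=float)]
    for _ in range(1, levels):
        prev = pyr[-1]
        H, W = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2  # trim odd edge
        down = 0.25 * (prev[0:H:2, 0:W:2] + prev[1:H:2, 0:W:2]
                       + prev[0:H:2, 1:W:2] + prev[1:H:2, 1:W:2])
        pyr.append(down)
    return pyr
```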