In this project, several steps of feature detection and matching were implemented. These steps include a Harris feature detector, an image pyramid to help obtain scale/resolution invariance, and a gradient feature descriptor. The results were then tested on several sets of benchmark images. The method was evaluated with ROC plots of its matching results and compared with other popular methods.
Feature detection and matching have many important applications in computer vision, such as object tracking, image stitching, and 3D reconstruction. Their performance, accuracy, and robustness are often the bottleneck of computer vision systems, which makes them a topic of extensive research. This project is an attempt to engineer a good feature detector/descriptor using well-known techniques.
| # | Design choice | Reasoning |
| --- | --- | --- |
| 1 | Feature detection pyramid | Increased scale and resolution invariance |
| 2 | Exclusion of clustered features | Reduced rate of false-negative errors |
| 3 | 5x5 differential descriptors | Increased invariance to lighting changes |
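To illustrate design choice 3, the sketch below computes a 5x5 differential (gradient-based) descriptor around a feature point. The row-major `Image` struct, the central-difference gradients, and the unit-length normalization are assumptions chosen for clarity, not necessarily the exact code used in this project.

```cpp
#include <vector>
#include <cmath>

// Minimal grayscale image: row-major float pixels (assumption for illustration).
struct Image {
    int width = 0, height = 0;
    std::vector<float> pixels;                       // size = width * height
    float at(int x, int y) const { return pixels[y * width + x]; }
};

// Build a 5x5 gradient-based descriptor centered on (cx, cy).
// Storing x- and y-gradients instead of raw intensities removes the
// DC (constant brightness) component, giving some illumination invariance.
std::vector<float> gradientDescriptor5x5(const Image& img, int cx, int cy) {
    std::vector<float> desc;
    desc.reserve(2 * 5 * 5);
    for (int dy = -2; dy <= 2; ++dy) {
        for (int dx = -2; dx <= 2; ++dx) {
            int x = cx + dx, y = cy + dy;
            // Central differences (feature assumed to lie away from the border).
            float gx = 0.5f * (img.at(x + 1, y) - img.at(x - 1, y));
            float gy = 0.5f * (img.at(x, y + 1) - img.at(x, y - 1));
            desc.push_back(gx);
            desc.push_back(gy);
        }
    }
    // Normalize to unit length so multiplicative lighting changes cancel out.
    float norm = 0.0f;
    for (float v : desc) norm += v * v;
    norm = std::sqrt(norm);
    if (norm > 1e-6f)
        for (float& v : desc) v /= norm;
    return desc;
}
```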
The algorithm for feature generation is summarized below:
The algorithm was implemented in C++ on top of the skeleton code provided by the course; the relevant libraries were provided by Richard Szeliski in 2001. Aside from resolving various linker problems with the instantiation of template functions in my compiler, a number of minor details were modified in the library to ensure accurate results. The convolution routine was implemented directly from its definition and was not optimized via the Fast Fourier Transform. Since most of the computation in this project consists of convolutions, an FFT-based implementation of convolution could further improve performance.
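For reference, a definition-based 2D convolution looks roughly like the sketch below; the zero-padded borders and row-major buffer layout are assumptions for illustration. Its cost is O(w * h * kw * kh), which is why an FFT-based alternative becomes attractive as the kernel grows.

```cpp
#include <vector>

// Direct 2D convolution from the definition, with zero padding at the borders.
// The kernel is flipped, as required by the convolution definition (for
// symmetric kernels such as a Gaussian, this makes no practical difference).
std::vector<float> convolve2D(const std::vector<float>& img, int w, int h,
                              const std::vector<float>& kernel, int kw, int kh) {
    std::vector<float> out(img.size(), 0.0f);
    int rx = kw / 2, ry = kh / 2;                    // kernel radii (odd sizes assumed)

    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float sum = 0.0f;
            for (int ky = 0; ky < kh; ++ky) {
                for (int kx = 0; kx < kw; ++kx) {
                    int sx = x - (kx - rx);          // flipped kernel coordinates
                    int sy = y - (ky - ry);
                    if (sx < 0 || sx >= w || sy < 0 || sy >= h) continue;  // zero padding
                    sum += img[sy * w + sx] * kernel[ky * kw + kx];
                }
            }
            out[y * w + x] = sum;
        }
    }
    return out;
}
```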
| Dataset | Averaged AUC | Averaged error (pixels) |
| --- | --- | --- |
| bikes | 0.763074 | 351.818781 |
| graf | 0.556292 | 294.190016 |
| leuven | 0.289110 | 394.819402 |
| wall | 0.379246 | 367.744664 |

| Dataset | Averaged AUC | Averaged error (pixels) |
| --- | --- | --- |
| bikes | 0.758166 | 351.844164 |
| graf | 0.536326 | 294.361374 |
| leuven | 0.512425 | 394.666906 |
| wall | 0.637607 | 367.781199 |

| Dataset | Averaged AUC | Averaged error (pixels) |
| --- | --- | --- |
| bikes | 0.743449 | 310.657573 |
| graf | 0.537948 | 302.726848 |
| leuven | 0.594175 | 279.365569 |
| wall | 0.653070 | 315.959649 |

| Dataset | Averaged AUC | Averaged error (pixels) |
| --- | --- | --- |
| bikes | 0.804462 | 311.124159 |
| graf | 0.551508 | 302.692983 |
| leuven | 0.797392 | 279.235992 |
| wall | 0.784871 | 315.939953 |
Observing the results on the different benchmark image sets, I conclude that after applying the gradient window descriptor, image sets with differences in illumination saw a significant performance boost. This is because the Gaussian kernel acts as a low-pass filter and the gradient removes the DC component of the patch, making the descriptor invariant to illumination changes. Applying ratio-SSD matching improves the results in general, as it removes a good number of false positive matches. However, the algorithm is not invariant to rotation, so performance is poor on image sets with rotations.
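A minimal sketch of the ratio-SSD test described above is given below. The descriptors are assumed to be fixed-length float vectors, and the 0.8 acceptance threshold is an illustrative choice rather than the exact value used in this project.

```cpp
#include <vector>
#include <limits>
#include <cstddef>

// Sum of squared differences between two descriptors of equal length.
float ssd(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Ratio-SSD matching: accept a match only if the best SSD is clearly smaller
// than the second-best SSD, so ambiguous (likely false) matches are rejected.
// Returns, for each query descriptor, the index of its match in `train`,
// or -1 if it is rejected. The 0.8 threshold is an assumed value.
std::vector<int> ratioMatch(const std::vector<std::vector<float>>& query,
                            const std::vector<std::vector<float>>& train,
                            float ratioThreshold = 0.8f) {
    std::vector<int> matches(query.size(), -1);
    for (std::size_t i = 0; i < query.size(); ++i) {
        float best = std::numeric_limits<float>::max();
        float secondBest = std::numeric_limits<float>::max();
        int bestIdx = -1;
        for (std::size_t j = 0; j < train.size(); ++j) {
            float d = ssd(query[i], train[j]);
            if (d < best) {
                secondBest = best;
                best = d;
                bestIdx = static_cast<int>(j);
            } else if (d < secondBest) {
                secondBest = d;
            }
        }
        if (bestIdx >= 0 && best < ratioThreshold * secondBest)
            matches[i] = bestIdx;
    }
    return matches;
}
```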
Szeliski, Richard. Computer Vision: Algorithms and Applications. London: Springer, 2011. Print.