CSE 576
Project 1 Image Feature Matching
18 April 2013
Charles Delahunt
Note: the Folder 'Executable' contains the contents of 'FeaturesSkel', including compiled .o files.
Feature Descriptor
The descriptor is a basic 5 x 5 square, with the following invariance modifications:
For illumination, each 5 x 5 patch is divided by the value of the
center pixel. The theory is that under differences of illumination, all
pixels in such a small patch will be scaled the same amount (noise will
be trivial).
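As a minimal sketch of the illumination step (not the project's actual code), the normalization can be written as below. The function name and the handling of a zero-valued center pixel are assumptions; the text above does not say what is done in that case.

```python
def normalize_by_center(patch):
    """Divide every value in a square, odd-sized patch by its center value."""
    c = len(patch) // 2
    center = patch[c][c]
    if center == 0:
        # Assumption: a zero center is not covered in the report;
        # here the patch is simply returned unchanged.
        return [row[:] for row in patch]
    return [[v / center for v in row] for row in patch]


# Synthetic 5 x 5 patch and the same patch under a uniform illumination gain.
patch = [[10 + 5 * r + c for c in range(5)] for r in range(5)]
brighter = [[1.5 * v for v in row] for row in patch]

a = normalize_by_center(patch)
b = normalize_by_center(brighter)

# Largest difference between the two normalized patches.
max_diff = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

If the whole patch is scaled by one gain, the gain cancels in the division, so max_diff is zero up to floating-point rounding. A gain that varies across the patch, or additive noise, would not cancel.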
For rotation, first the eigenvalues of the Hessian at the feature point
are calculated. The Hessian values at a given point are the sum over a
3 x 3 neighborhood, weighted with a Gaussian. Then the eigenvector eig
associated with the eigenvalue of largest absolute value is chosen (and
flipped if its eigenvalue is negative), and we let
theta = atan2(eig_y, eig_x).
To get feature values, each pixel location in a blank 5 x 5 patch is
first rotated with a rotation matrix (sines and cosines of theta) to
the point dest. dest is thus a location in the image, generally not on
the pixel grid. It is given a value using bilinear interpolation of
its neighboring pixels. This value is assigned to the pixel in the
original (pre-rotation) location in the blank patch, and the values of
the blank patch become the feature values. The effect of the rotation
is that the (sign-adjusted) eigenvector associated with the largest
eigenvalue will point horizontally to the right. The downside of the
rotation is smearing due to interpolation, which over a small region
(i.e. where all pixels are close to the origin) can be substantial.
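The rotation step can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the function names are hypothetical, the symmetric 2 x 2 matrix [[a, b], [b, c]] is assumed to be already summed over the Gaussian-weighted neighborhood, and sample locations are clamped at the image border (the report does not describe border handling).

```python
import math


def dominant_angle(a, b, c):
    """Angle of the eigenvector of [[a, b], [b, c]] whose eigenvalue has the
    largest absolute value, flipped when that eigenvalue is negative."""
    mean = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    lam = max(mean + root, mean - root, key=abs)
    if abs(b) > 1e-12:
        vx, vy = b, lam - a          # solves (M - lam*I) v = 0 when b != 0
    elif abs(lam - a) <= abs(lam - c):
        vx, vy = 1.0, 0.0            # diagonal matrix, eigenvalue a
    else:
        vx, vy = 0.0, 1.0            # diagonal matrix, eigenvalue c
    if lam < 0:
        vx, vy = -vx, -vy
    return math.atan2(vy, vx)


def bilinear(img, x, y):
    """Bilinear interpolation of img (a list of rows) at real-valued (x, y).
    Coordinates are clamped to the image (assumed border policy)."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def rotated_patch(img, cx, cy, theta, size=5):
    """Fill a size x size patch centered at (cx, cy), sampling the image at
    each patch offset rotated by theta."""
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    patch = []
    for dy in range(-half, half + 1):
        row = []
        for dx in range(-half, half + 1):
            # Rotate the patch offset to get the image location "dest".
            sx = cx + cos_t * dx - sin_t * dy
            sy = cy + sin_t * dx + cos_t * dy
            row.append(bilinear(img, sx, sy))
        patch.append(row)
    return patch


# Synthetic linear ramp image: value = x + 10 * y.
img = [[x + 10 * y for x in range(9)] for y in range(9)]
p0 = rotated_patch(img, 4, 4, 0.0)           # no rotation
p90 = rotated_patch(img, 4, 4, math.pi / 2)  # quarter turn
```

With this sampling convention, the patch's rightward axis corresponds to the direction (cos theta, sin theta) in the image, so the chosen eigenvector ends up pointing to the right in the patch. With theta = 0 the result is the plain axis-aligned patch. For non-trivial angles most sample points fall between pixels, which is where the interpolation smearing comes from.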
Miscellaneous parameters: A patch size of 5 x 5 proved better than 3 x 3,
7 x 7 or 9 x 9. For some reason, doubling the values of the Gaussian
weighting matrix (so the 3 x 3 weighting matrix sums to 2, not 1)
yielded better results empirically. The features were selected by a
threshold chosen such that 0.5% of pixels would have Harris values
exceeding the threshold. While 0.05% yielded very few features and gave
volatile results, anything from 0.1% to 1% gave similar overall results
(except for runtime). The actual number of features was less than 0.5%
of total pixels since only features with locally maximal Harris values
were finally accepted.
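The selection rule above (a percentile threshold followed by a local-maximum check) can be sketched as below. The function names, the 3 x 3 neighborhood for the local-maximum test, and the skipping of border pixels are assumptions; the toy grid uses a fraction of 15% only so that the example stays small, where the report used 0.5% on real images.

```python
def threshold_for_fraction(response, fraction):
    """Pick a threshold so that roughly `fraction` of the values exceed it."""
    values = sorted((v for row in response for v in row), reverse=True)
    k = max(1, int(round(fraction * len(values))))
    return values[min(k, len(values) - 1)]


def local_maxima(response, threshold):
    """Return (x, y) of interior pixels that exceed the threshold and are
    strictly larger than all 8 neighbors (assumed 3 x 3 neighborhood)."""
    h, w = len(response), len(response[0])
    peaks = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            v = response[y][x]
            if v <= threshold:
                continue
            neighbors = [response[y + j][x + i]
                         for j in (-1, 0, 1) for i in (-1, 0, 1)
                         if (i, j) != (0, 0)]
            if all(v > n for n in neighbors):
                peaks.append((x, y))
    return peaks


# Toy Harris response with two blobs (made-up values for illustration).
response = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 2, 0, 0, 0],
    [0, 2, 1, 0, 0, 0],
    [0, 0, 0, 6, 8, 0],
    [0, 0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0, 0],
]

threshold = threshold_for_fraction(response, 0.15)
above = sum(1 for row in response for v in row if v > threshold)
peaks = local_maxima(response, threshold)
```

In this toy grid five pixels exceed the threshold but only two survive the local-maximum check, which is the same effect that keeps the final feature count below the nominal percentage.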
Performance on Image Sets
My feature descriptor typically did better than the basic 5 x 5 patch
under SSD and about the same under Ratios. But its behavior under
matching systems was erratic, in contrast to the basic 5 x 5 patch
descriptor, which always did much better (+20%) when matched with
Ratios. In general Ratios improved the basic patch much more than it
improved my descriptor (or SSD ruined the basic patch much more than it
ruined my descriptor).
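For reference, the two matching scores can be sketched as follows. This assumes that "Ratios" means the usual ratio test, i.e. the best SSD divided by the second-best SSD; the report does not define it, and the function names and sample vectors are made up for illustration.

```python
def ssd(f, g):
    """Sum of squared differences between two equal-length descriptors."""
    return sum((a - b) ** 2 for a, b in zip(f, g))


def match_ssd(feature, candidates):
    """Return (index, score) of the closest candidate; score is the SSD."""
    dists = [ssd(feature, c) for c in candidates]
    best = min(range(len(dists)), key=dists.__getitem__)
    return best, dists[best]


def match_ratio(feature, candidates):
    """Return (index, score) where score = best SSD / second-best SSD.
    Needs at least two candidates; lower means a less ambiguous match."""
    dists = [ssd(feature, c) for c in candidates]
    order = sorted(range(len(dists)), key=dists.__getitem__)
    best, second = order[0], order[1]
    if dists[second] == 0:
        # Assumption: two exact matches are treated as fully ambiguous.
        return best, 1.0
    return best, dists[best] / dists[second]


feature = [1, 2, 3]
candidates = [[1, 2, 5], [5, 5, 5], [1, 2, 4]]

ssd_match = match_ssd(feature, candidates)      # SSDs are 4, 29, 1
ratio_match = match_ratio(feature, candidates)  # 1 / 4
```

Both rules pick the same candidate here; they differ only in the score attached to the match. The ratio score penalizes features that have a second candidate almost as close as the best one, which may be why it behaves differently on images with repetitive or blurred detail.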
My descriptor gave noticeably better results under Ratios than SSD (AUC
> +10%), except on the Graf set, where it did much worse under
Ratios (due mostly to very poor performance on out-of-perspective
graf6). This causes trouble for field use, since it would be unclear
which matching system to use.
By Image Set:
Graf had rotations and some perspective shifts. The rotations suggest
that my descriptor should have done much better than the plain patch.
This was the case for SSD (+20%), but not for Ratios (-10%). This was a
disappointment because my descriptor was supposed to be relatively
resistant to rotations. Perhaps the smearing due to interpolation
caused trouble with the sharp edges of the graffiti pictures, and the
rotation would work better where all edges were generally softer.
Leuven had illumination variations, and the much better performance of
my descriptor, especially under SSD, indicates that the illumination
invariance worked well. The large gain of Ratios over SSD for both
descriptors suggests that Ratios offers some illumination invariance.
Bikes had variations in focus. The two descriptors were equivalent (not
surprising since the images did not have rotation and illumination
variations). The drastic improvement under Ratios versus SSD indicates
that Ratios provides some focus invariance. An interesting descriptor
variant would be to apply a lowpass filter to the image first to
equalize sharpness somewhat. But this would also destroy information,
which is risky.
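The lowpass variant was not implemented in this project. A minimal sketch of what such a pre-filter might look like is given below, assuming a 3 x 3 binomial kernel and clamped borders; both choices and the function name are hypothetical.

```python
# 3 x 3 binomial (Gaussian-like) kernel; the weights sum to 16.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def lowpass(img):
    """Blur img (a list of rows) with the 3 x 3 kernel, clamping at borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in (-1, 0, 1):
                for i in (-1, 0, 1):
                    yy = min(max(y + j, 0), h - 1)
                    xx = min(max(x + i, 0), w - 1)
                    acc += KERNEL[j + 1][i + 1] * img[yy][xx]
            out[y][x] = acc / 16.0
    return out


# Sharp vertical step edge (made-up values).
step = [[0, 0, 16, 16, 16] for _ in range(5)]
blurred = lowpass(step)
```

The sharp step from 0 to 16 becomes a ramp through 4 and 12. That is the intended equalizing effect, and it also shows the information loss mentioned above: the exact position of the edge is less well defined after filtering.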
Wall had a lot of repetitive detail and also shifts in perspective. My
descriptor did somewhat better (+8%) than the basic patch under SSD,
and was equivalent under Ratios.
Yosemite was a case of pure translation, so we would expect the basic
patch, which has no smearing from rotations, to do better. It did
slightly better under Ratios, and much better (+13%) under SSD, the
only time SSD gave the basic patch an advantage.
Overall, the added invariances of my descriptor gave a substantial
benefit under SSD. However, under Ratios the rotations gave my
descriptor a much smaller advantage, and in the case of many of the
Graf images were actually (ironically) a drawback. The illumination
invariance was a clear benefit, as seen in Leuven.
Other images: I tried to process a square image rotated exactly 90
degrees in order to isolate the effect of the rotation invariance, but
the feature matching module did not work, so I was not able to get
results.
ROC Plots
Area Under Curve Values for ROC plots (as % of total area = 1)
|               | Plain Patch, SSD | Plain Patch, Ratios | My Descriptor, SSD | My Descriptor, Ratios |
| Graf 1, 2     | 50.2             | 71.3                | 61.6               | 65.9                  |
| Yosemite 1, 2 | 87.2             | 90.9                | 74.5               | 87.7                  |
Benchmark Sets: Average AUCs (as %) of Various Features and Matchings
|        | Plain Patch | Plain Patch | My Descriptor | My Descriptor |
|        | SSD         | Ratios      | SSD           | Ratios        |
| graf   | 48.8        | 62.1        | 66.4          | 52.4          |
| bikes  | 28          | 59.9        | 27.4          | 57.1          |
| leuven | 39*         | 62*         | 63.5          | 72            |
| wall   | 42.3        | 68.9        | 50.3          | 71.3          |
* Average of values for sets 2 - 5. Set 6 gave NaN.
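For readers unfamiliar with the AUC figures, the area under an ROC curve can be approximated from a list of (false positive rate, true positive rate) points with the trapezoidal rule. The sketch below is an assumption about how such a value may be computed, not the project's benchmark code, and the sample points are made up.

```python
def roc_auc(points):
    """Trapezoidal area under ROC points given as
    (false_positive_rate, true_positive_rate) pairs."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up ROC points for illustration only.
points = [(0.0, 0.0), (0.1, 0.6), (0.4, 0.9), (1.0, 1.0)]
auc = roc_auc(points)
auc_percent = 100.0 * auc
```

An AUC of 50% corresponds to the diagonal (chance-level) ROC curve and 100% to a perfect one.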
Harris Feature Image, graf6