visionProject1

CSE 576 Project 1: Image Feature Matching, 18 April 2013, Charles Delahunt

Note: the folder 'Executable' contains the contents of 'FeaturesSkel', including compiled .o files.

Feature Descriptor

    The descriptor is a basic 5 x 5 square, with the following invariance modifications:
    For illumination, each 5 x 5 patch is divided by the value of its center pixel. The theory is that under a change of illumination, all pixels in such a small patch are scaled by the same amount, so the division cancels the scaling (noise is assumed to be trivial).
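    The center-pixel normalization can be sketched in a few lines of Python. This is only an illustration, not the project code: the function name normalize_patch and the handling of a zero-valued center pixel are assumptions, since the report does not say what happens in that case.

```python
def normalize_patch(patch, eps=1e-8):
    """Divide every value in a square patch by the patch's center pixel."""
    c = len(patch) // 2
    center = patch[c][c]
    if abs(center) < eps:
        # Assumption: the report does not cover a zero center pixel;
        # this sketch simply returns an unscaled copy.
        return [row[:] for row in patch]
    return [[v / center for v in row] for row in patch]


# A 5 x 5 patch and the same patch under doubled illumination.
patch = [[float(r * 5 + c + 1) for c in range(5)] for r in range(5)]
brighter = [[2.0 * v for v in row] for row in patch]

# Both normalize to the same descriptor, because the common scale cancels.
same = normalize_patch(patch) == normalize_patch(brighter)
```

    Note that the normalization only cancels a multiplicative change; an additive brightness offset would not be removed by this step.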
    For rotation, the eigenvalues of the Hessian at the feature point are calculated first. The Hessian values at a given point are summed over a 3 x 3 neighborhood, weighted with a Gaussian. The eigenvector eig associated with the eigenvalue of largest absolute value is then chosen (and flipped if its eigenvalue is negative), and we let theta = atan2(eig_y, eig_x). To get feature values, each pixel location in a blank 5 x 5 patch is rotated with a rotation matrix (sines and cosines of theta) to a location dest in the image. dest is given a value by bilinear interpolation of its neighboring pixels, and that value is assigned to the original (pre-rotation) pixel location in the blank patch. The values of the filled patch become the feature values. The effect of the rotation is that the (sign-adjusted) eigenvector associated with the largest eigenvalue points horizontally to the right in the patch. The downside of the rotation is smearing due to interpolation, which can be substantial over a small region (i.e. where all pixels are close to the origin).
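    The rotation step can be illustrated with the following Python sketch. It is a reading of the description above, not the project code. The names dominant_angle, bilinear and rotated_patch are hypothetical, the 2 x 2 matrix is passed in as its three distinct entries (a, b, d), and clamping sample points to the image border is an assumption the report does not mention.

```python
import math


def dominant_angle(a, b, d):
    """Angle of the eigenvector of [[a, b], [b, d]] with the largest |eigenvalue|."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    lam = max(mean + radius, mean - radius, key=abs)
    if abs(b) > 1e-12:
        vx, vy = lam - d, b          # solves (A - lam*I) v = 0 when b != 0
    elif abs(lam - a) <= abs(lam - d):
        vx, vy = 1.0, 0.0            # diagonal matrix, eigenvalue a
    else:
        vx, vy = 0.0, 1.0            # diagonal matrix, eigenvalue d
    if lam < 0:
        vx, vy = -vx, -vy            # flip when the eigenvalue is negative
    return math.atan2(vy, vx)


def bilinear(img, x, y):
    """Bilinear interpolation of img (list of rows) at real-valued (x, y)."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)    # assumption: clamp to the image border
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def rotated_patch(img, cx, cy, theta, size=5):
    """Fill a size x size patch by sampling img at rotated offsets around (cx, cy)."""
    half = size // 2
    c, s = math.cos(theta), math.sin(theta)
    patch = []
    for dy in range(-half, half + 1):
        row = []
        for dx in range(-half, half + 1):
            dest_x = cx + c * dx - s * dy   # rotated location in the image
            dest_y = cy + s * dx + c * dy
            row.append(bilinear(img, dest_x, dest_y))
        patch.append(row)
    return patch


# Image whose intensity increases downward; with theta = pi/2 the downward
# direction of the image runs left to right in the sampled patch.
ramp_y = [[float(y) for _ in range(11)] for y in range(11)]
patch = rotated_patch(ramp_y, 5, 5, math.pi / 2)
```

    In this sketch the patch's horizontal axis maps to the direction (cos theta, sin theta) in the image, which is what makes the chosen eigenvector point to the right after sampling.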
    Miscellaneous parameters: A patch size of 5 x 5 proved better than 3 x 3, 7 x 7 or 9 x 9. For some reason, doubling the values of the Gaussian weighting matrix (so the 3 x 3 weighting matrix sums to 2, not 1) yielded better results empirically. The features were selected by a threshold chosen such that 0.5% of pixels would have Harris values exceeding the threshold. While 0.05% yielded very few features and gave volatile results, anything from 0.1% to 1% gave similar overall results (except for runtime). The actual number of features was less than 0.5% of total pixels, since only features with locally maximal Harris values were finally accepted.
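    The feature selection rule (a percentile threshold followed by a local-maximum test) might look like the sketch below. The function names are hypothetical, and the 3 x 3 neighborhood used for the local-maximum test is an assumption, since the report does not state the neighborhood size.

```python
def harris_threshold(response, fraction=0.005):
    """Return the value reached or exceeded by roughly `fraction` of all pixels."""
    values = sorted((v for row in response for v in row), reverse=True)
    k = max(1, round(fraction * len(values)))
    return values[k - 1]


def select_features(response, fraction=0.005):
    """Keep pixels at or above the threshold that are strict 3 x 3 local maxima."""
    h, w = len(response), len(response[0])
    thresh = harris_threshold(response, fraction)
    features = []
    for y in range(h):
        for x in range(w):
            v = response[y][x]
            if v < thresh:
                continue
            # Assumption: compare against the 8 immediate neighbors.
            neighbors = [response[j][i]
                         for j in range(max(0, y - 1), min(h, y + 2))
                         for i in range(max(0, x - 1), min(w, x + 2))
                         if (i, j) != (x, y)]
            if all(v > n for n in neighbors):
                features.append((x, y))
    return features


# Toy 10 x 10 Harris response with four strong pixels, two of them adjacent.
response = [[0.0] * 10 for _ in range(10)]
response[3][3], response[3][4], response[7][7], response[0][0] = 10.0, 9.0, 8.0, 7.0

# With fraction=0.04, four pixels pass the threshold, but the pixel next to
# the strongest one is rejected by the local-maximum test.
features = select_features(response, fraction=0.04)
```

    The toy example shows why the final feature count falls below the nominal percentage: thresholded pixels that sit next to a stronger response are discarded.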

Performance on Image Sets

    My feature descriptor typically did better than the basic 5 x 5 patch under SSD and about the same under Ratios. However, its behavior across the two matching methods was erratic, in contrast to the basic 5 x 5 patch descriptor, which always did much better (+20%) when matched with Ratios. In general, Ratios improved the basic patch much more than it improved my descriptor (or, put the other way, SSD hurt the basic patch much more than it hurt my descriptor).
    My descriptor gave noticeably better results under Ratios than under SSD (AUC > +10%), except on the Graf set, where it did much worse under Ratios (due mostly to very poor performance on the out-of-perspective graf6). This causes trouble for field use, since it would be unclear which matching method to use.
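    For reference, the two matching methods compared above can be sketched as follows. This assumes that "Ratios" means the usual ratio test (best SSD distance divided by second-best SSD distance), which the report does not spell out. The function names are hypothetical and descriptors are treated as flat lists of numbers.

```python
def ssd(a, b):
    """Sum of squared differences between two flattened descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_ssd(query, candidates):
    """Return (index, score) of the nearest candidate; lower score is better."""
    dists = [ssd(query, c) for c in candidates]
    best = min(range(len(dists)), key=dists.__getitem__)
    return best, dists[best]


def match_ratio(query, candidates):
    """Return (index, score) where score = best SSD / second-best SSD.

    Requires at least two candidates. A score near 1 means the best match is
    ambiguous; a score near 0 means it is distinctive.
    """
    dists = sorted((ssd(query, c), i) for i, c in enumerate(candidates))
    (d1, best), (d2, _) = dists[0], dists[1]
    score = d1 / d2 if d2 > 0 else 1.0   # two exact matches: fully ambiguous
    return best, score


query = [1.0, 0.0, 0.0]
distinct = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.0], [1.0, 0.0, 0.2]]
repetitive = [[1.0, 0.0, 0.1], [1.0, 0.0, -0.1]]

# Same best SSD in both cases, but the ratio score flags the repetitive set.
ssd_match = match_ssd(query, distinct)
ratio_distinct = match_ratio(query, distinct)
ratio_repetitive = match_ratio(query, repetitive)
```

    The design difference is that SSD scores a match in isolation, while the ratio score also depends on the runner-up, so equally close alternatives lower the confidence of a match.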
    By Image Set:
    Graf had rotations and some perspective shifts. The rotations suggest that my descriptor should have done much better than the plain patch. This was the case for SSD (+20%), but not for Ratios (-10%). This was a disappointment because my descriptor was supposed to be relatively resistant to rotations. Perhaps the smearing due to interpolation caused trouble with the sharp edges of the graffiti pictures, and the rotation would work better on images where edges are generally softer.
    Leuven had illumination variations, and the much better performance of my descriptor, especially under SSD, indicates that the illumination invariance worked well. The large gain of Ratios over SSD for both descriptors suggests that Ratios offers some illumination invariance.
    Bikes had variations in focus. The two descriptors were equivalent (not surprising, since the images did not have rotation or illumination variations). The drastic improvement under Ratios versus SSD indicates that Ratios provides some focus invariance. An interesting descriptor variant would be to apply a lowpass filter to the image first to equalize sharpness somewhat. But this would also destroy information, which is risky.
    Wall had a lot of repetitive detail and also shifts in perspective. My descriptor did somewhat better (+8%) than the basic patch under SSD, and was equivalent under Ratios.
    Yosemite was a case of pure translation, so we would expect the basic patch, which has no smearing from rotations, to do better. It did slightly better under Ratios, and much better (+13%) under SSD, the only time SSD gave the basic patch an advantage.
    Overall, the added invariances of my descriptor gave a substantial benefit under SSD. However, under Ratios the rotations gave my descriptor a much smaller advantage, and for many of the Graf images they were actually (ironically) a drawback. The illumination invariance was a clear benefit, as seen in Leuven.

Other images: I tried to process a square image rotated exactly 90 degrees in order to isolate the effect of the rotation invariance, but the feature matching module did not work, so I was not able to get results.