Develop an object recognition system that uses "parts" found by interest operators to recognize classes of objects. You may use any of the interest operators from the literature, together with the SIFT descriptor, which can describe regions identified by the SIFT detector or by any other interest operator. (URLs below)
Your system should learn its object classes from training images. For each object, there should be one set of images that contain the object and another set of images that do not. From these, the system should learn which of the detected regions are likely to be parts of the object. We suggest you follow the learning methodology in Yi's paper on the generative / discriminative approach. In phase 1, the EM algorithm is used to cluster the parts; in phase 2, a discriminative classifier is trained to recognize the object. In Yi's work, the discriminative classifier was trained on vectors that gave, for each image, an aggregated response to each EM component. A related approach uses a vector that is a histogram of the components found in the image. The WEKA package contains many useful classifiers. Yi used neural nets.
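The two per-image feature vectors mentioned above can be sketched as follows. This is a minimal illustration, not Yi's implementation: it assumes phase 1 has already produced, for every detected region in an image, a list of posterior probabilities (one per EM component). The function names are hypothetical, and the choice of max as the aggregation function is an assumption made for this sketch.

```python
from typing import List, Sequence


def aggregated_response(posteriors: Sequence[Sequence[float]],
                        n_components: int) -> List[float]:
    """One value per EM component: the strongest response over all regions.

    Using max as the aggregate is an assumption of this sketch.
    """
    if not posteriors:
        return [0.0] * n_components
    return [max(row[k] for row in posteriors) for k in range(n_components)]


def component_histogram(posteriors: Sequence[Sequence[float]],
                        n_components: int) -> List[float]:
    """Fraction of regions whose most likely component is k, for each k."""
    counts = [0] * n_components
    for row in posteriors:
        # Hard-assign the region to its most probable component.
        best = max(range(n_components), key=lambda k: row[k])
        counts[best] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_components
    return [c / total for c in counts]


# Made-up posteriors for an image with four regions and three components.
example = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.25, 0.25],
]
print(aggregated_response(example, 3))
print(component_histogram(example, 3))
```

Either vector has a fixed length equal to the number of EM components, regardless of how many regions the interest operator finds, which is what lets a standard classifier be trained on it in phase 2.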
The system should be tested on the Caltech-4 image set (motor bikes, faces, airplanes, and cars) and any other objects you wish to include. For each object class, you should have a set of at least 200 images that include the object and an equal number that do not. From each of the two sets, randomly select 2/3 of the images for training and leave the remaining 1/3 for testing. Your test should then show, for each object, the percentage of the test set that was correctly classified. Report these numbers and show examples of correctly classified and incorrectly classified images of each class.
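The split and the reported percentage can be sketched as below. This is only an illustration under stated assumptions: the function names and the fixed seed are hypothetical, and the split would be applied separately to the set of images that contain the object and to the set that does not.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_two_thirds(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T]]:
    """Randomly put 2/3 of the items in a training list and the rest in a test list."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = (2 * len(shuffled)) // 3
    return shuffled[:n_train], shuffled[n_train:]


def percent_correct(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Percentage of test images whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("the test set is empty")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Hypothetical file names standing in for 200 images of one class.
positives = [f"pos_{i:03d}.jpg" for i in range(200)]
train, test = split_two_thirds(positives)
print(len(train), len(test))
```

With 200 images in a set, this gives 133 training images and 67 test images. The percentage for an object is then computed over the combined test images from both sets.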
Sample Timeline
Extra Credit
Develop a simple model for the spatial relationships among the parts and train your system to recognize objects according to both parts and relationships.
URLs
What your report should contain