CSE 473: Introduction to Artificial Intelligence
Winter 2022
HW ASSIGNMENT 4: due Friday, February 25 at 11:59 pm (25 pts)
In this assignment, you will develop a content-based image retrieval system that retrieves database images based on their similarity to a query image.
Requirements:
- Download the starter code (main.py) and database here. The 'images' folder contains the database images. There are a total of 40 JPEG images belonging to 8 classes (5 images per class). Each image is of size 64 x 48 pixels. If your images folder did not download properly using the first link, it is also available here. Use it to replace the images folder in the starter code folder.
- For this assignment, you need to write a report in LaTeX. Create an account on Overleaf if you don't have one, then click on this link to get the starter file for the report. On the Overleaf homepage where your projects are listed, create a personal copy of this project by clicking 'copy' under Actions, and edit the copy. Check this tutorial if you are new to Overleaf and/or LaTeX.
- Read all the instructions on this page, the starter code, and the text in the report carefully before starting. In the code file, you only need to fill in the WRITE YOUR IMPLEMENTATION HERE portions (two functions). You may change the existing code in those two functions, but you shouldn't need to. In the report file, write your solutions in the designated positions.
- Run pip install numpy scipy matplotlib pillow imageio before running the code to ensure you have all the necessary Python packages. The instructions to run the code are written in the main.py file.
- The main idea is to represent each image (the database images and the query image) by a feature vector and then use a suitable distance measure to compute the distances between the query and database images and accordingly retrieve the images most similar to the query image.
- In Python, an image with height H, width W and C channels (3 for RGB images) is represented by a 3D array of shape H x W x C. To access a particular pixel (h,w) of a particular channel (c) of the 'image' variable, use image[h,w,c].
- To compute the feature vector, you need to compute two histograms (see the starter code for how to do this):
- Color histogram: You need to compute three types of color histogram:
- Convert the image into grayscale and compute a histogram with 8 bins (2 points)
- Convert the image into grayscale and compute a histogram with 256 bins (2 points)
- Compute an RGB histogram using the original image with the number of bins that performs better between the above two (3 points)
- LBP (Local Binary Pattern) histogram: You need to compute two types of LBP histogram. Note that you first need to convert the RGB image into grayscale (the conversion is given in the starter code):
- Compute the histogram from the entire image. Ignore the boundary pixels. (3 points)
- Divide the image into 16x16 regions, compute a 32-bin histogram for each region, and concatenate the histograms to get the final histogram. (3 points)
- The default distance measure in the starter code is Euclidean distance. The code to compute the distances between image feature vectors, sort them and use them to retrieve images is already provided in the starter code. To use a different distance measure, change the distance name passed as an input argument. Check the report to see which distance measures you need and what to do with each of them. The available names can be found here.
- Fill in the tables, images and text in the report file. A sample output image is currently used as a placeholder.
- Please write appropriate comments in your code.
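To make the array layout and the histogram step concrete, here is a minimal sketch using NumPy. The synthetic image, the 48-row by 64-column orientation, the plain channel-average grayscale conversion, and the helper name gray_histogram are all assumptions made for illustration; the grayscale conversion and histogram calls in the starter code take precedence.

```python
import numpy as np

# Synthetic stand-in for a database image.
# Assumption: 48 rows (H) by 64 columns (W), 3 channels (RGB).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(48, 64, 3), dtype=np.uint8)

H, W, C = image.shape
red_value = image[10, 20, 0]  # row h=10, column w=20, channel c=0 (red)


def gray_histogram(img, bins):
    """Count gray levels in `bins` equal-width bins covering [0, 256]."""
    # Assumption: grayscale is the plain average of the three channels.
    gray = img.astype(np.float64).mean(axis=2)
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return hist


hist8 = gray_histogram(image, 8)      # coarse: 8 bins, each 32 gray levels wide
hist256 = gray_histogram(image, 256)  # fine: one bin per gray level
```

Every pixel falls into exactly one bin, so each histogram sums to H * W regardless of the number of bins; only the granularity changes.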
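The LBP convention is not spelled out on this page, so the following sketch assumes one common 8-neighbor variant: each neighbor is compared with the center pixel, contributes a 1 bit when it is greater than or equal to the center, and the bits are read clockwise starting at the top-left neighbor. The function names lbp_code and lbp_image are hypothetical, and the convention used in lecture or in the starter code may differ (for example in neighbor order or in the comparison used).

```python
import numpy as np

# Assumed neighbor order: clockwise (row, col) offsets from the top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(gray, h, w):
    """8-bit LBP code of the interior pixel (h, w) of a 2D grayscale array."""
    center = gray[h, w]
    code = 0
    for bit, (dh, dw) in enumerate(OFFSETS):
        # Assumed rule: set the bit when the neighbor is >= the center.
        if gray[h + dh, w + dw] >= center:
            code |= 1 << bit
    return code


def lbp_image(gray):
    """LBP codes for all interior pixels; boundary pixels are skipped."""
    H, W = gray.shape
    codes = np.zeros((H - 2, W - 2), dtype=np.uint8)
    for h in range(1, H - 1):
        for w in range(1, W - 1):
            codes[h - 1, w - 1] = lbp_code(gray, h, w)
    return codes


# Worked example on a single 3x3 neighborhood with center value 6.
patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 6, 3]])
code = lbp_code(patch, 1, 1)  # bits 1, 3 and 5 are set: 2 + 8 + 32 = 42
```

Under these assumptions, the neighbors 9, 7 and 6 are the only ones not smaller than the center, which is why exactly three bits are set. A histogram of the resulting codes can then be computed in the same way as a gray-level histogram.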
Note:
- Don't use OpenCV, PIL or other Python packages to compute the features. You shouldn't need any package other than the ones already imported in the starter code.
- Ideally, your query image should be the first retrieved image in the results, and all five database images belonging to the same class as the query image should be in the top 5 retrieved images. Most likely, this won't happen. Precision is calculated as the ratio of the number of correct images retrieved to the total number of images retrieved. A precision above 75% can be considered good.
- The starter code saves the result for every combination of input arguments so that you can analyze them and draw concrete conclusions. Sometimes you will find the same precision value (say 80%), but the 4 correct images out of the first 5 are at positions 1,2,4,5 instead of the ideal case of 1,2,3,4. Also, ideally similar images should be grouped together in the retrieval. Use these observations to draw conclusions when the precision values are the same.
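As an illustration of how retrieval and precision fit together, here is a minimal sketch on toy data. It assumes the distances are computed with scipy.spatial.distance.cdist, whose metric names include 'euclidean' and 'cityblock'; the starter code already provides its own version of this step, and the helper names retrieve and precision, the toy feature vectors, and the labels are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import cdist


def retrieve(query_vec, db_vecs, metric="euclidean", top_k=5):
    """Indices of the top_k database vectors closest to the query."""
    dists = cdist(query_vec[None, :], db_vecs, metric=metric)[0]
    # Stable sort keeps the database order among equal distances.
    return np.argsort(dists, kind="stable")[:top_k]


def precision(retrieved_labels, query_label):
    """Fraction of retrieved images whose class matches the query's class."""
    retrieved_labels = np.asarray(retrieved_labels)
    return float(np.mean(retrieved_labels == query_label))


# Toy database: six 2D feature vectors with class labels (made up).
db_vecs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0],
                    [5.0, 5.0], [6.0, 5.0], [9.0, 9.0]])
db_labels = np.array([0, 0, 0, 1, 1, 2])

query = np.array([0.0, 0.0])  # identical to database entry 0, class 0
top = retrieve(query, db_vecs, metric="cityblock", top_k=4)
p = precision(db_labels[top], query_label=0)
```

Here the four nearest entries are indices 0, 1, 2 and 3, of which three belong to the query's class, so the precision is 3/4 = 0.75. The same precision value can arise from different rank orders, which is why the positions of the correct images matter when comparing settings.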
Turn In: See here for submission instructions.
You should turn in the following:
- your well-commented code (main.py file)
- your report in PDF format (main.pdf file)
Please use the file names specified above. Include your name in comments at the top of your code file.
Please submit your code via the Canvas assignment and submit your report in Gradescope.
Evaluation:
- Correct implementation of color histogram (2+2+3=7 pts)
- Correct implementation of LBP histogram (3+3=6 pts)
- All completed portions and appropriate conclusions in the report (10 pts)
- Readable and well-commented code, and correct submission format on Canvas (2 pts)
LATE POLICY: Programs may be turned in until Sunday night, February 27 at 11:59pm. 10% off for each day late.
CHEATING POLICY: All work on this assignment must be your own. You may discuss the assignment with other members of the class, but your code should not look like anyone else's code in this class, any other class or on the web. Having the same code with different names for the variables doesn't work either. We check. Thanks and good luck.