CSE 455: Homework Set 4
Content-Based Image Retrieval
For this assignment, you may work in teams of up to two
people.
However, there will be more requirements for a two person
team.
Materials
Download the set of images.
Download the set of image thumbnails. (The
image thumbnails, in jpeg format, should help keep your
write-up a manageable size.)
Note that there is no provided code for this assignment. You can start
with your k-means code from the previous assignment if you like.
Contents
In this assignment, you will develop a
content-based image retrieval system that retrieves database images
based on the similarity between their regions and those of a query
image.
|
|
mountain image
| segmented by color
|
|
|
another mountain image
| segmented by color
|
What To Do
The main idea is to represent each image (the database images and the query image) by
a set of regions obtained from color clustering, their attributes, and
(for two person teams) their simple
spatial relationships. Then, come up with a distance function between two
images in this representation, and use this distance function to find the
images most similar to a query image.
- For each image in the database you will perform the following procedure:
- Run your color clustering on it to obtain a labeled cluster image.
- Run connected components on it to obtain a labeled segmentation image.
Possibly perform some noise cleaning/merging operations to
improve the regions. Don't vary any parameters between images.
- For each major region (use a size threshold), compute at least the following
attributes:
- size
- mean color, in RGB or whatever space you like
- at least the following co-occurrence texture features (use spatial
relationship d = (1,1)):
- energy
- entropy
- contrast
- centroid (row, column)
- bounding box or other representation of where the region is
- Two person teams: RAG (region adjacency graph). This should
probably be represented as
a set of adjacency lists, one for each node (region) in the graph.
- Two person teams: For each pair of adjacent regions in the
RAG, find and record the following
possible relationships: inside, above_adjacency, below_adjacency, left_adjacency,
right_adjacency, other_adjacency (if none of the others are satisfied by whatever
requirements you impose to define them).
- Store the attributes and relationships in a data structure that you define and
that can be both used in memory and also saved in a file so you don't have to keep
rerunning the analysis. We'll refer to the data structure for image
I as DS(I)
- Develop a distance measure that will compute the distance
RELDIST(I1,I2)
between DS(I1) and DS(I2) for any two
images I1 and
I2. To do this, you need to find a correspondence
between the regions of I1 and the regions of
I2.
One person teams: Compute this correspondence greedily. That
is,
for each region in I1, find its closest match in
I2, and so on.
Two person teams: Compute the optimal correspondence. You can do
this
with an exponential search procedure, since the number of regions will be
small.
Finding correspondences is in Chapter 11 and was covered
in the 2D Object Recognition lecture.
Once you have the correspondence, the distance between
I1 and I2 should be some function
of:
- difference in attributes of corresponding regions
- difference in number of regions
- (Two person teams) difference in region relationships
- Create a query system in which you can select a query image Q
and compare it
to each image I in your database by computing RELDIST(Q,I).
Then you order the images in the database according to their distance to the query
and return the ordered list (the images and associated distances).
Extra credit: Create a simple GUI for performing these
queries.
- Test your system as indicated below.
- The database should consist of the 40 provided images.
- The following images should be used as queries:
- beach_2
- boat_5
- cherry_3
- crater_3
- pond_2
- stHelens_2
- sunset1_2
- sunset2_2
- Use your distance measure to compare each query image to all 40
database images, recording the distances you get for all 320 tests.
Sample Two Person Timeline
- First week: Color clustering, connected components, and
region attributes.
- Second week: Develop RAG data structure and distance measure.
- Third week: Perform testing and write it up.
Sample One Person Timeline
- First week: Color clustering, connected components, and
some region attributes.
- Second week: Finish region attributes, develop distance measure and
begin testing.
- Third week: Finish testing and write it up.
Turnin
Your turnin should describe all aspects of your system, and show the
required results. You should also submit all of your code.
- Describe your color clustering and region-finding algorithm.
- List the region attributes you used.
- List the region relationships you used and explain how you determine
the relationship between two regions (two person teams).
- Describe your distance measure, including the region correspondence
algorithm.
- Use the provided thumbnail images to show the results for each of the
required queries. Also include the distances between the query image and
each result image. Print the images in order of ascending distance.
Hopefully, the query image will have distance zero to itself, and similar
images will have smaller distances than dissimilar ones.
Example Query Results for boat_2
 boat_2 d = 0
|
 boat_4 d = 0.05
|
 boat_3 d = 0.07
|
 boat_5 d = 0.07
|
 beach_3 d = 0.12
|
 beach_2 d = 0.13
|
 beach_4 d = 0.15
|
 crater_2 d = 0.16
|
 boat_1 d = 0.20
|
...
|
... |
... |
... |
... |
... |
...
|
...
|
...
|
...
|
 sunset1_5 d = 0.98
|
- Write a short discussion of your results.
- Submit all the source code in a compilable state as well as the
executables. Your source code must be well structured and well commented.
This assignment is due on March 13, by 11:59 PM.
Please submit all of the above items to Masa by email.