EE/CSE 576 Spring 2004: Course Project: Due June 5, 5pm
Content-Based Retrieval with Regions and Relationships
Images
Download the images you need: zip, tarred gzip
Download the image thumbnails: zip, tarred gzip, PowerPoint
(The image thumbnails, in jpeg format, should help keep your write-up a manageable size.)
In this assignment, you will develop a
content-based image retrieval system that retrieves database images
based on the similarity between their regions and those of a query image,
using both region attributes and spatial relationships.
[Figure: two mountain images, each shown next to its segmentation by color.]
What You Should Do
The main idea is to represent each image (the database images and the query image) by
a set of regions obtained from color clustering, their attributes, and their simple
spatial relationships.
- For each image in the database you will perform the following procedure:
- Run your color clustering on it to obtain a labeled cluster image.
- Run connected components on it to obtain a labeled segmentation image.
Possibly perform some noise cleaning/merging operations to
improve the regions. (Use a procedure that
you can apply in the same way to every image.)
- For each major region (use a size threshold), compute at least the following
attributes:
- size
- mean color [(R,G,B) or whatever space you like]
- a few texture attributes (for example, those from the midterm are easiest)
- centroid (row, column)
- bounding box or other representation of where the region is
- RAG (region adjacency graph). This should probably be represented as a set of
adjacency lists, one for each node (region) in the graph.
- For each pair of (major) adjacent regions in the RAG, find and record which of the
following relationships holds: inside, above_adjacency, below_adjacency, left_adjacency,
right_adjacency, or other_adjacency (for adjacent pairs that satisfy none of the
others under the definitions you choose).
- Store the attributes and relationships in a data structure that you define and
that can be both used in memory and also saved in a file so you don't have to keep
rerunning the analysis. We will refer to this structure for an image I as DS(I).
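
The per-image procedure above could be sketched as follows. This is a minimal illustration, not a required design: it assumes the labeled segmentation image is already available as a list of rows of integer labels, it computes only size, centroid, bounding box, the RAG, and the relationships (mean color and texture are left out), and the function names, the bounding-box test for "inside", and the centroid-based tests for the directional relationships are all assumptions made for this example.

```python
import json
from collections import defaultdict


def region_stats(labels):
    """Size, centroid [row, col] and bounding box for every region label."""
    acc = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            a = acc.setdefault(lab, {"size": 0, "sum_r": 0, "sum_c": 0,
                                     "bbox": [r, c, r, c]})
            a["size"] += 1
            a["sum_r"] += r
            a["sum_c"] += c
            box = a["bbox"]  # [min_row, min_col, max_row, max_col]
            box[0], box[1] = min(box[0], r), min(box[1], c)
            box[2], box[3] = max(box[2], r), max(box[3], c)
    return {lab: {"label": lab, "size": a["size"],
                  "centroid": [a["sum_r"] / a["size"], a["sum_c"] / a["size"]],
                  "bbox": a["bbox"]}
            for lab, a in acc.items()}


def build_rag(labels):
    """Region adjacency graph as adjacency sets, using 4-connectivity."""
    adj = defaultdict(set)
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            for rr, cc in ((r + 1, c), (r, c + 1)):  # pixel below, pixel to the right
                if rr < rows and cc < cols and labels[rr][cc] != a:
                    adj[a].add(labels[rr][cc])
                    adj[labels[rr][cc]].add(a)
    return adj


def is_inside(a, b):
    """Assumed definition: a's bounding box lies strictly within b's."""
    (ar0, ac0, ar1, ac1), (br0, bc0, br1, bc1) = a["bbox"], b["bbox"]
    return br0 < ar0 and bc0 < ac0 and ar1 < br1 and ac1 < bc1


def relationship(a, b):
    """Return [first, second, name], read as 'first <name> second'."""
    if is_inside(a, b):
        return [a["label"], b["label"], "inside"]
    if is_inside(b, a):
        return [b["label"], a["label"], "inside"]
    dr = a["centroid"][0] - b["centroid"][0]
    dc = a["centroid"][1] - b["centroid"][1]
    if abs(dr) > abs(dc):    # mostly a vertical offset (assumed rule)
        name = "above_adjacency" if dr < 0 else "below_adjacency"
    elif abs(dc) > abs(dr):  # mostly a horizontal offset (assumed rule)
        name = "left_adjacency" if dc < 0 else "right_adjacency"
    else:
        name = "other_adjacency"
    return [a["label"], b["label"], name]


def describe_image(labels, min_size=1):
    """Build a JSON-friendly DS(I) from a labeled segmentation image."""
    stats = {lab: s for lab, s in region_stats(labels).items()
             if s["size"] >= min_size}  # keep major regions only
    rag = build_rag(labels)
    relations = []
    for lab in sorted(stats):
        neighbors = sorted(n for n in rag.get(lab, ()) if n in stats)
        stats[lab]["neighbors"] = neighbors  # adjacency list of this node
        relations += [relationship(stats[lab], stats[n])
                      for n in neighbors if lab < n]
    return {"regions": [stats[lab] for lab in sorted(stats)],
            "relations": relations}
```

Because the structure holds only lists, numbers, and strings, `json.dumps(describe_image(labels))` can be written to a file and read back with `json.loads`, so the analysis does not have to be rerun.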
- Develop a distance measure that will compute the distance RELDIST(I1,I2)
between DS(I1) and DS(I2) for any two images I1 and I2. Ideas for this distance
measure can come from Chapter 8, Chapter 11, and your own thinking. Basically,
you will need to find the best correspondence between regions of I1 and
regions of I2 by whatever algorithm you choose. (Keep it simple; a greedy
approach is OK for the basic assignment.) Then, once you have the
correspondence, you can define the error in terms of the attribute differences
between corresponding regions, missing or extra regions, and relationship errors.
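
One possible shape for such a distance measure, using a greedy correspondence, is sketched below. Everything specific here is an assumption made for illustration: each region is taken to carry a hypothetical "features" list of attributes already scaled to [0, 1], relationships are stored as [first, second, name] triples, the three error terms are combined with equal default weights, and "contains" is an invented name for the inverse of inside.

```python
import math

# Relationship seen from the other region's side ("contains" is an assumed name).
INVERSE = {"above_adjacency": "below_adjacency",
           "below_adjacency": "above_adjacency",
           "left_adjacency": "right_adjacency",
           "right_adjacency": "left_adjacency",
           "inside": "contains", "contains": "inside",
           "other_adjacency": "other_adjacency"}


def feature_distance(f1, f2):
    """Root-mean-square difference; stays in [0, 1] for features in [0, 1]."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(f1, f2)) / len(f1))


def greedy_match(regions1, regions2):
    """Repeatedly pair the closest two regions that are both still unmatched."""
    pairs = sorted((feature_distance(r1["features"], r2["features"]),
                    r1["label"], r2["label"])
                   for r1 in regions1 for r2 in regions2)
    match, used2 = {}, set()
    for cost, a, b in pairs:
        if a not in match and b not in used2:
            match[a] = (b, cost)
            used2.add(b)
    return match  # label in image 1 -> (label in image 2, attribute cost)


def relation_lookup(relations):
    """Index relationships by ordered pair, in both directions."""
    table = {}
    for a, b, rel in relations:
        table[(a, b)] = rel
        table[(b, a)] = INVERSE[rel]
    return table


def reldist(ds1, ds2, w_attr=1.0, w_missing=1.0, w_rel=1.0):
    """Weighted mix of attribute, missing/extra-region and relationship errors."""
    match = greedy_match(ds1["regions"], ds2["regions"])
    n1, n2 = len(ds1["regions"]), len(ds2["regions"])
    n = max(n1, n2, 1)
    attr_err = sum(cost for _, cost in match.values()) / n
    missing_err = (n1 + n2 - 2 * len(match)) / n  # regions left unmatched

    # Carry image 1's relationships over to image 2's labels and compare.
    fwd = {a: b for a, (b, _) in match.items()}
    matched2 = set(fwd.values())
    mapped1 = {(fwd[a], fwd[b]): rel
               for (a, b), rel in relation_lookup(ds1["relations"]).items()
               if a in fwd and b in fwd}
    kept2 = {k: rel for k, rel in relation_lookup(ds2["relations"]).items()
             if k[0] in matched2 and k[1] in matched2}
    keys = set(mapped1) | set(kept2)
    rel_err = (sum(mapped1.get(k) != kept2.get(k) for k in keys) / len(keys)
               if keys else 0.0)

    total = w_attr + w_missing + w_rel
    return (w_attr * attr_err + w_missing * missing_err + w_rel * rel_err) / total
```

With this definition an image compared to itself gets distance 0, provided no two of its regions have identical features. The greedy pass looks only at attributes; using relationships while choosing the correspondence is the kind of improvement listed under Extra Credit.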
- Create a query system in which you can select a query image Q and compare it
to each image I in your database by computing RELDIST(Q,I).
Then order the images in the database by their distance to the query
and return the ordered list (the images and their associated distances).
A fancy user interface is not required for the basic assignment.
- Test your system as indicated below.
- The database has 40 ppm images for the tests.
- The following 8 database images should also be used as query images.
- beach_2
- boat_5
- cherry_3
- crater_3
- pond_2
- stHelens_2
- sunset1_2
- sunset2_2
- Use your distance measure to compare each query image to all 40
database images, recording the distances you get for all 320 tests.
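
The query and test loop described above might look like the sketch below. It assumes the database is a dict that maps each image name to its stored DS(I) and that `distance` is whatever function implements RELDIST; `rank_database`, `run_all_tests`, and `format_results` are hypothetical names chosen for this example.

```python
# The eight query images named in the assignment.
QUERY_NAMES = ["beach_2", "boat_5", "cherry_3", "crater_3",
               "pond_2", "stHelens_2", "sunset1_2", "sunset2_2"]


def rank_database(query_name, database, distance):
    """Return [(image name, distance)] ordered by ascending distance."""
    query = database[query_name]  # the query images are also database images
    results = [(name, distance(query, ds)) for name, ds in database.items()]
    results.sort(key=lambda item: (item[1], item[0]))  # ties broken by name
    return results


def run_all_tests(query_names, database, distance):
    """One ranked list per query: len(query_names) * len(database) distances."""
    return {q: rank_database(q, database, distance) for q in query_names}


def format_results(results):
    """One 'name d = value' line per image, smallest distance first."""
    return "\n".join(f"{name} d = {d:.2f}" for name, d in results)
```

With 8 queries and 40 database images, `run_all_tests(QUERY_NAMES, database, reldist)` would record the 320 distances, assuming `reldist` is your RELDIST function.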
Sample Timeline
- First week: Color clustering, connected components, and
a few attributes.
- Second week: The rest of the attributes, the RAG, and the relationships.
- Third week: Develop data structure and distance measure. Start testing.
- Fourth week: Finish testing and write it up.
Extra Credit
Here are some ideas for improvements to the basic assignment, any of which
will earn some extra credit:
- Develop a better algorithm for finding the best correspondence
between regions of the query image and those of the database image,
using both attributes and relationships.
- Develop a nice GUI for your system.
What Your Report Should Contain
- Describe your system and outline the training and testing procedures.
- Describe your color clustering algorithm and how you improve the regions.
Give several examples to show your color clustering results and improved regions.
- List the attributes you selected to describe the regions. Explain
why you chose them.
- List the relationships you used to describe the adjacent region pairs and
explain why you selected them.
- Define your distance measure and explain the motivation behind it.
- A section of the report that gives the results
of the tests. For each of the 8 query images, use the THUMBNAILS to
show the query and the results, with their distances printed, in ascending order
of distance, i.e., smallest distance first. Ideally, the image
with the smallest distance will be the query image itself, which ought
to get a distance of zero when compared to itself, and the image with the
largest distance will be quite different.
(Example) Query Results for boat_2
boat_2 d = 0
boat_4 d = 0.05
boat_3 d = 0.07
boat_5 d = 0.07
beach_3 d = 0.12
beach_2 d = 0.13
beach_4 d = 0.15
crater_2 d = 0.16
boat_1 d = 0.20
...
sunset1_5 d = 0.98
- A concluding section of the report that discusses your results.
- An appendix containing your commented code and a readme file that describes how to run your code.
Remember that you should put headers on all your routines with the following information:
- NAME (your name)
- DATE
- TITLE (of routine)
- PURPOSE (of routine)
- PARAMETERS (of routine)
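
As an illustration only, a header with these fields on a small Python routine could look like this. The routine `major_regions` is a hypothetical example, and the name and date are placeholders to fill in.

```python
# NAME:       <your name>
# DATE:       <date written>
# TITLE:      major_regions
# PURPOSE:    Keep only the regions whose size meets a threshold.
# PARAMETERS: sizes    - dict mapping region label to pixel count
#             min_size - smallest pixel count that counts as "major"
def major_regions(sizes, min_size):
    # Returns the labels that pass the threshold, in sorted order.
    return sorted(label for label, size in sizes.items() if size >= min_size)
```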