EE/CSE 576: Content-Based Retrieval with Regions and Relationships
Images
Download the images you need: zip, tarred gzip
Download the image thumbnails: zip, tarred gzip, PowerPoint
(The image thumbnails, in jpeg format, should help keep your
write-up a manageable size.)
In this assignment, you will develop a content-based image retrieval
system that retrieves database images
based on the similarity between their regions and those of a query
image, using both region attributes and spatial relationships.
[Figure: two mountain images, each shown next to its segmentation by color.]
What You Should Do
The main idea is to represent each image (the database images and the
query image) by
a set of regions obtained from color clustering, their attributes, and
their simple
spatial relationships.
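A minimal sketch of how such a representation might be built is shown here. It assumes that color clustering has already produced a cluster-label image, and it uses plain Python lists so that it runs without extra libraries. All function names, the dictionary layout, and the relationship rules (bounding-box containment for inside, centroid offsets for the directional cases) are illustrative assumptions, not requirements of the assignment. Texture attributes are omitted.

```python
from collections import deque


def connected_components(cluster):
    """Give each 4-connected patch of equal cluster labels its own region id."""
    rows, cols = len(cluster), len(cluster[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and cluster[ny][nx] == cluster[y][x]):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return labels


def describe_regions(labels, rgb, min_size=1):
    """Size, mean color, centroid and bounding box of every region >= min_size."""
    acc = {}
    for r, row in enumerate(labels):
        for c, rid in enumerate(row):
            a = acc.setdefault(rid, {"n": 0, "rgb": [0, 0, 0], "r": 0, "c": 0,
                                     "box": [r, c, r, c]})
            a["n"] += 1
            a["r"] += r
            a["c"] += c
            for k in range(3):
                a["rgb"][k] += rgb[r][c][k]
            box = a["box"]  # [top, left, bottom, right]
            box[0], box[1] = min(box[0], r), min(box[1], c)
            box[2], box[3] = max(box[2], r), max(box[3], c)
    return {
        rid: {"size": a["n"],
              "mean_color": tuple(s / a["n"] for s in a["rgb"]),
              "centroid": (a["r"] / a["n"], a["c"] / a["n"]),
              "bbox": tuple(a["box"])}
        for rid, a in acc.items() if a["n"] >= min_size
    }


def region_adjacency(labels, keep):
    """RAG as adjacency lists, restricted to the region ids in keep."""
    adj = {rid: set() for rid in keep}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            for ny, nx in ((r + 1, c), (r, c + 1)):  # down and right neighbours
                if ny < rows and nx < cols:
                    a, b = labels[r][c], labels[ny][nx]
                    if a != b and a in adj and b in adj:
                        adj[a].add(b)
                        adj[b].add(a)
    return {rid: sorted(nbrs) for rid, nbrs in adj.items()}


def relationship(a, b):
    """Relationship of region a to an adjacent region b (assumed definition)."""
    top, left, bottom, right = a["bbox"]
    b_top, b_left, b_bottom, b_right = b["bbox"]
    if b_top < top and b_left < left and bottom < b_bottom and right < b_right:
        return "inside"  # a's bounding box lies strictly within b's
    dr = a["centroid"][0] - b["centroid"][0]
    dc = a["centroid"][1] - b["centroid"][1]
    if dr == 0 and dc == 0:
        return "other_adjacency"
    if abs(dr) >= abs(dc):
        return "above_adjacency" if dr < 0 else "below_adjacency"
    return "left_adjacency" if dc < 0 else "right_adjacency"


# Toy 4x4 image: a blue band above a green band.
cluster = [[0] * 4, [0] * 4, [1] * 4, [1] * 4]
rgb = [[(0, 0, 255)] * 4] * 2 + [[(0, 255, 0)] * 4] * 2
labels = connected_components(cluster)
regions = describe_regions(labels, rgb)
rag = region_adjacency(labels, regions)
relations = {(i, j): relationship(regions[i], regions[j])
             for i in rag for j in rag[i]}
```

Bounding-box containment is only a rough stand-in for a true inside test, so a real system would likely need a stricter rule.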
- For each image in the database you will perform the following
procedure:
- Run your color clustering on it to obtain a labeled cluster
image.
- Run connected components on it to obtain a labeled
segmentation image. Possibly perform some noise cleaning/merging
operations to improve the regions. (Use a procedure that
you can apply in the same way to every image.)
- For each major region (use a size threshold), compute at
least the following
attributes
- size
- mean color [(R,G,B) or whatever space you like]
- a few texture attributes (for example, LBP and co-occurrence are easiest)
- centroid (row, column)
- bounding box or other representation of where the region
is
- RAG (region adjacency graph). This should probably be
represented as a set of
adjacency lists, one for each node (region) in the graph.
- For each pair of (major) adjacent regions in the RAG, find
and record the following
possible relationships: inside, above_adjacency, below_adjacency,
left_adjacency,
right_adjacency, other_adjacency (if none of the others are satisfied
by whatever
requirements you impose to define them).
- Store the attributes and relationships in a data structure
that you define and
that can be both used in memory and also saved in a file so you don't
have to keep
rerunning the analysis. We will refer to this structure for an image I
as DS(I).
- Develop a distance measure that will compute the distance
RELDIST(I1,I2) between DS(I1) and DS(I2) for any two images I1 and I2.
You can draw on Chapter 8, Chapter 11, and your own ideas for this
distance measure.
Basically,
you will need to find the best correspondence between regions of I1 and
regions of I2 by whatever algorithm you choose. (Keep it simple; a
greedy approach is OK for the basic assignment.) Once you have the
correspondence, you can define the error in terms of the attributes of
corresponding regions, missing or extra regions, and relationship
errors.
- Create a query system in which you can select a query image Q
and compare it
to each image I in your database by computing RELDIST(Q,I). Then you
order the images in the database according to their distance to the
query
and return the ordered list (the images and associated distances).
A fancy user interface is not required for the basic assignment.
- Test your system as indicated below.
- The database contains 40 images in PPM format for the tests.
- The following 8 database images should also be used as query images:
- beach_2
- boat_5
- cherry_3
- crater_3
- pond_2
- stHelens_2
- sunset1_2
- sunset2_2
- Use your distance measure to compare each query image to all 40
database images, recording the distances you get for all 320 tests
(8 queries x 40 images).
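The distance and query steps above can be sketched as follows. This is one possible greedy scheme under stated assumptions, not a prescribed method: each DS is assumed to be a dictionary with a "regions" map (region id to attributes) and a "relations" map ((id, id) pair to relationship name), colors are assumed to lie in 0..255, and the equal weighting of color and size is arbitrary. The names region_cost, greedy_match, reldist, and rank_database are hypothetical. Only the relationships of the first image are checked, so this version is not symmetric.

```python
import math


def region_cost(a, b):
    """Attribute distance in [0, 1]: half mean-color gap, half relative size gap."""
    color = math.dist(a["mean_color"], b["mean_color"]) / (255 * math.sqrt(3))
    size = abs(a["size"] - b["size"]) / max(a["size"], b["size"])
    return 0.5 * color + 0.5 * size


def greedy_match(ds1, ds2):
    """Pair regions cheapest-first; each region is used at most once."""
    pairs = sorted((region_cost(ra, rb), i, j)
                   for i, ra in ds1["regions"].items()
                   for j, rb in ds2["regions"].items())
    match, used = {}, set()
    for _, i, j in pairs:
        if i not in match and j not in used:
            match[i] = j
            used.add(j)
    return match


def reldist(ds1, ds2):
    """Normalized error: attribute gaps + unmatched regions + broken relations."""
    match = greedy_match(ds1, ds2)
    n1, n2 = len(ds1["regions"]), len(ds2["regions"])
    attr_err = sum(region_cost(ds1["regions"][i], ds2["regions"][j])
                   for i, j in match.items())
    unmatched = (n1 - len(match)) + (n2 - len(match))
    # A relation counts as an error if an endpoint is unmatched or the
    # corresponding pair in ds2 carries a different relationship.
    rel_err = sum(
        1 for (i, k), rel in ds1["relations"].items()
        if ds2["relations"].get((match.get(i), match.get(k))) != rel)
    norm = max(n1, n2) + len(ds1["relations"])
    return (attr_err + unmatched + rel_err) / norm if norm else 0.0


def rank_database(query, database):
    """Return (name, distance) pairs, smallest distance first."""
    scored = ((name, reldist(query, ds)) for name, ds in database.items())
    return sorted(scored, key=lambda pair: pair[1])


# Two made-up descriptions, for illustration only.
two_bands = {
    "regions": {0: {"size": 8, "mean_color": (0, 0, 255)},
                1: {"size": 8, "mean_color": (0, 255, 0)}},
    "relations": {(0, 1): "above_adjacency", (1, 0): "below_adjacency"},
}
red_only = {"regions": {0: {"size": 8, "mean_color": (255, 0, 0)}},
            "relations": {}}
database = {"two_bands": two_bands, "red_only": red_only}
ranking = rank_database(two_bands, database)
```

With this normalization the distance stays between 0 and 1, and an image compared to itself scores 0.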
Sample Timeline
- First week: Color clustering, connected components, and
a few attributes.
- Second week: The rest of the attributes, the RAG, and the
relationships.
- Third week: Develop data structure and distance measure. Start
testing.
- Fourth week: Finish testing and write it up.
Extra Credit
Here are some ideas for improvements to the basic assignment, any of
which will earn some extra credit:
- Develop a better algorithm for finding the best correspondence
between regions of the query image and those of the database image,
using both attributes and relationships.
- Develop a nice GUI for your system.
What Your Report Should Contain
- Describe your system and give an outline of the training and
testing procedures.
- Describe your color clustering algorithm and how you improve the
regions. Give several examples to show your color clustering results
and improved regions.
- List the attributes you selected to describe the regions.
Explain why you want to use them.
- List the relationships you used to describe the adjacent region
pairs and explain why you selected them.
- Describe the definition of your distance measure and explain the
motivation.
- A section of the report that gives the results
of the tests. For each of the 8 query images, use the THUMBNAILS to
show the query and the results with their distances printed in
ascending order, i.e., smallest distance first. Hopefully, the one
with the smallest distance will be the query image itself, which ought
to get a zero when compared to itself, and the one with the largest
distance will be quite different.
(Example) Query Results for boat_2
boat_2      d = 0
boat_4      d = 0.05
boat_3      d = 0.07
boat_5      d = 0.07
beach_3     d = 0.12
beach_2     d = 0.13
beach_4     d = 0.15
crater_2    d = 0.16
boat_1      d = 0.20
...
sunset1_5   d = 0.98
- A concluding section of the report that discusses your results.
- An appendix containing your commented code and a readme file
describing how to run your code.
Remember that you should put headers on all your routines with the
following information:
- NAME (of you)
- DATE
- TITLE (of routine)
- PURPOSE (of routine)
- PARAMETERS (of routine)
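One possible layout for such a header is sketched here as a Python comment block. The routine region_size and the angle-bracket placeholders are hypothetical; adapt the comment syntax to whatever language you use.

```python
# NAME:       <your name>
# DATE:       <date written>
# TITLE:      region_size
# PURPOSE:    Count the pixels that carry a given region label.
# PARAMETERS: labels    - 2D list of integer region labels
#             region_id - label of the region to measure
def region_size(labels, region_id):
    return sum(row.count(region_id) for row in labels)
```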