 EE/CSE 576: Image Understanding
Spring 1999

CSE 576: Lab Exercise 2
Clustering for Image Segmentation
Due Date: April 30
Download the software you may need.(the same as HW#1)
Download the images you DEFINITELY need.
In this assignment, you will segment images into regions of
similarlycolored pixels, using the Kmeans algorithm discussed in
class.


original image
 falsecolored labeling

Your Kmeans implementation should take a color PPM image, such as the
one shown above on the left, and a number of clusters K. It should
output a grayscale image with each pixel labeled by its class. (You
can pass this image to autocolor to produce a falsecolored
image such as the one above.)
You can also pass this output image to conrgn to separate out
the connected components within each class. Example output looks like:

falsecolor labeled image

You should use the input images fb06931, fb06932,
fb06933, fb11117, fb11122, fb23460,
and fb23469, available from the image data
page. For each input image, turn in printouts (preferably color)
of
 the original image
 the falsecolored labeling
 the connectedcomponents output
Also, note how many classes you divided the pixels into (the value of
K) and the number of connected components produced.
This technique will likely produce too many connected components. How
could you apply techniques covered in class to clean up the connected
components results? Answer this question in your submission. You
don't need to implement your answer, just describe a procedure and
argue that it would work.
You should also turn in a listing of your Kmeans implementation.
Extra Credit:
 (Easy)
Color clustering is a kind of vector quantization and could be useful
for image compression. For example, the following image only uses 7 colors
to approximate fb06931. To earn this credit, approximate fb11117 and
fb23469 by only using K colors.

7color version of fb06931

 (Harder, it could lead into additional work in this area.)
The Kmeans algorithm is sensitive to the initial clustering setup. The simplest
initial setup is to assign them randomly (what you've probably implemented).
There are some other variations for the initial setup:
1. Uniform linear: The initial cluster means are selected uniformly along
the line defined by [0 0 0] and [1 1 1] in the RGB cube.
2. Uniform cube: The initial cluster means are selected uniformily throughout
the RGB cube defined by [0 0 0] and [1 1 1]. This only works for
segmentations where the cube root of the number of clusters is a positive
integer greater than one. For example if the number of clusters is eight
then the initial cluster means will be [0 0 0], [0 1 0], [1 1 0], [1 0 0],
[0 0 1], [0 1 1], [1 1 1], and [1 0 1]. If the number of clusters is nine,
[0.5 0.5 0.5] could be used.
3. Statistical: The initial cluster means are selected uniformly along
the line defined by mean of the samples minus one standard deviation and
the mean of the samples plus one standard deviation, [meansigma mean+sigma].
To earn this credit, to compare the random approach to some or all of the other three
according to the following variables: weighted average cluster variance, number of
iterations and missclassification rate when applying to the noisy version, noisy,
of the synthetic image,
gtruth (K=9). (The division of regions are coordinate 90 and 170
along xaxis and yaxis.) List results and statistics of these three measures
for the chosen approaches.


gtruth
 noisy

Interesting Sites (in Java Applet):
 Color cube visualization:
It helps you to understand the color cube and color histogram used in contentbased
image retrieval. Note that it doesn't seem to work for your image as they claim.
 ZOMAX: You can play with
many algorithms taught in the class on line.
Last updated:

04/18/99 at 10PM.
