EE/CSE 576: Image Understanding
CSE 576: Lab Exercise 2
Clustering for Image Segmentation
Due Date: April 30
Download the software you may need.(the same as HW#1)
Download the images you DEFINITELY need.
In this assignment, you will segment images into regions of
similarly-colored pixels, using the K-means algorithm discussed in
Your K-means implementation should take a color PPM image, such as the
one shown above on the left, and a number of clusters K. It should
output a grayscale image with each pixel labeled by its class. (You
can pass this image to autocolor to produce a false-colored
image such as the one above.)
You can also pass this output image to conrgn to separate out
the connected components within each class. Example output looks like:
|false-color labeled image
You should use the input images fb06931, fb06932,
fb06933, fb11117, fb11122, fb23460,
and fb23469, available from the image data
page. For each input image, turn in printouts (preferably color)
Also, note how many classes you divided the pixels into (the value of
K) and the number of connected components produced.
- the original image
- the false-colored labeling
- the connected-components output
This technique will likely produce too many connected components. How
could you apply techniques covered in class to clean up the connected
components results? Answer this question in your submission. You
don't need to implement your answer, just describe a procedure and
argue that it would work.
You should also turn in a listing of your K-means implementation.
Color clustering is a kind of vector quantization and could be useful
for image compression. For example, the following image only uses 7 colors
to approximate fb06931. To earn this credit, approximate fb11117 and
fb23469 by only using K colors.
|7-color version of fb06931
- (Harder, it could lead into additional work in this area.)
The K-means algorithm is sensitive to the initial clustering setup. The simplest
initial setup is to assign them randomly (what you've probably implemented).
There are some other variations for the initial setup:
1. Uniform linear: The initial cluster means are selected uniformly along
the line defined by [0 0 0] and [1 1 1] in the RGB cube.
2. Uniform cube: The initial cluster means are selected uniformily throughout
the RGB cube defined by [0 0 0] and [1 1 1]. This only works for
segmentations where the cube root of the number of clusters is a positive
integer greater than one. For example if the number of clusters is eight
then the initial cluster means will be [0 0 0], [0 1 0], [1 1 0], [1 0 0],
[0 0 1], [0 1 1], [1 1 1], and [1 0 1]. If the number of clusters is nine,
[0.5 0.5 0.5] could be used.
3. Statistical: The initial cluster means are selected uniformly along
the line defined by mean of the samples minus one standard deviation and
the mean of the samples plus one standard deviation, [mean-sigma mean+sigma].
To earn this credit, to compare the random approach to some or all of the other three
according to the following variables: weighted average cluster variance, number of
iterations and missclassification rate when applying to the noisy version, noisy,
of the synthetic image,
gtruth (K=9). (The division of regions are coordinate 90 and 170
along x-axis and y-axis.) List results and statistics of these three measures
for the chosen approaches.
Interesting Sites (in Java Applet):
- Color cube visualization:
It helps you to understand the color cube and color histogram used in content-based
image retrieval. Note that it doesn't seem to work for your image as they claim.
- ZOMAX: You can play with
many algorithms taught in the class on line.