EE/CSE 576: Image Understanding

Spring 2001

EE/CSE 576: Lab Exercise 2
Clustering for Image Segmentation

Due Date: April 27

Download the software you may need. (Code for reading and writing PPM images can be found in imgformat.h and imgformat.c in the canedg directory.)
Download the test images.

In this assignment, you will segment images into regions of similarly-colored pixels, using the K-means algorithm discussed in class and given in Chapter 10.


original image	false-colored labeling after k-means

Your K-means implementation should take a color PPM image, such as the one shown above on the left, and a number of clusters K. It should output a grayscale image with each pixel labeled by its class. (You can pass this image to autocolor to produce a false-colored image such as the one above.)

You can also pass this output image to conrgn to separate out the connected components within each class. Example output looks like:

false-color labeling after connected components

You should use the input images P1010021, P7180319, Ssgp1908, fb06931, fb11117 and fb23469, that are available from the image data page. For each input image, turn in printouts (preferably color) of

the original image
the false-colored labeling
the connected-components output

Also, note how many clusters you divided the pixels into (the value of K) and the number of connected components produced.

This technique will likely produce too many connected components. If you want to get rid of the small ones, you will need to construct a table of their areas (counts of pixels) and make a final pass over the image to change tiny components to label 0 (which means background).

Finally, you should turn in a listing of your K-means implementation including the headers as in Assignment 1 and internal comments to clarify the code.

Extra Credit:

K-means suffers from the fixed K problem. Implement Meng-Hee's (or any other) variant procedure and compare the results to the default K-means results. For the comparison, set K higher than the number of clusters the variant algorithm produced.
K-means is also sensitive to the initial choice of the clusters. The simplest initial setup is to choose the cluster centroids randomly (what you've probably implemented). There are some possible variations for the initial setup:
1. Uniform: The initial cluster centroids are selected uniformly along the line defined by [0 0 0] and [1 1 1] in the RGB space or uniformly in the RGB cube.
2. Randomly from data: The initial cluster centroids are selected to be the RGB values of K randomly selected pixels in the image.
3. Splitting: The initial cluster centroids are selected as follows:
  - Compute centroid of the whole data.
  - Perturb the centroid (add small random numbers to its values) and obtain a new centroid. Now we have two centroids.
  - Run K-means with K = 2 and then perturb the centroids again to obtain a total of four centroids.
  - Run K-means with these centroids and continue splitting some of the resulting centroids by perturbing until we have enough number of centroids. These centroids will be the initial setup for the final K-means.
To earn this credit, compare the effects of different initialization methods using the noisy image shown below. Run K-means using different initial settings and report the squared error (of K-means) and the misclassification rate for each case. The groundtruth contains nine regions with boundaries at 90 and 170 along both x any y axes. Therefore, the misclassification rate is the number of pixels that were not correctly assigned to the corresponding nine regions in the groundtruth image.

groundtruth noisy

Interesting Sites (in Java Applet):

Color cube visualization: It helps you to understand the color cube and color histogram used in content-based image retrieval. Note that it doesn't seem to work for your image as they claim.
ZOMAX: You can play with many algorithms taught in the class on-line.

Last updated on .


groundtruth	noisy

EE/CSE 576: Image Understanding

Spring 2001

EE/CSE 576: Lab Exercise 2 Clustering for Image Segmentation

Due Date: April 27

EE/CSE 576: Lab Exercise 2
Clustering for Image Segmentation