CSE455: Homework 2
Color Clustering and Skin Finding
Download the software you need: hw2.zip
Download the images you need: faces.zip and
scenes.zip
In this assignment, you will segment images
by color, using the K-means algorithm and some variants.
You will also try to classify the skin areas on the face images
using the generated clusters.
[Figure: face image | segmented by color | skin pixels highlighted]
What You Should Do
Part I: Color Clustering
- First implement a basic K-Means Clustering Algorithm.
Use normalized (r,g) space where r = R / (R + B + G) and g = G / (R + B + G).
- First with randomly selected cluster seeds.
- Next, try sampling pixels from the image to find the seeds. Choose a pixel
and make its value a seed if it is sufficiently different from the already-selected
seeds. Repeat until you have K different seeds.
- Next, with a method that you develop for selecting the seeds
intelligently from the image using its color histogram (again, your own
code). The seed selection should be automatic, given the histogram
and the number of seeds to be selected. One approach is to use the peaks in the color histogram as candidate seeds.
- Now develop and implement a smarter K-Means variant (your own code again)
that determines the best value for K by evaluating statistics of the clusters.
Some possible methods to try:
- You can start from the color histogram, as K is closely related to the
number of peaks in the histogram. Not all the peaks are necessary as you want
only the dominant ones.
- You can also try clustering using different Ks, and pick the best one. The metric could be related to the distance between clusters and the variance within each cluster.
- You are free to come up with your own ways.
- Test each variant of the above on both the face images
and the scene images and report your results.
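As a rough illustration of the basic variant described above, the following Python sketch converts pixels to normalized (r,g), samples seeds that are sufficiently different from one another, and runs plain K-means. It is only a sketch under stated assumptions, not the provided hw2 code: all function names, the `min_dist` threshold, the treatment of black pixels, and the toy pixel values are hypothetical. The last function computes a within-cluster variance, one statistic that could feed a comparison between different Ks.

```python
import random

def to_rg(pixel):
    """Map (R, G, B) to normalized (r, g). Assumption: black maps to (1/3, 1/3)."""
    R, G, B = pixel
    total = R + G + B
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)
    return (R / total, G / total)

def dist2(a, b):
    """Squared Euclidean distance between two (r, g) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def pick_distinct_seeds(points, k, min_dist, rng, max_tries=10000):
    """Sample pixels; keep one as a seed only if it is at least min_dist
    away from every seed chosen so far. May return fewer than k seeds."""
    seeds = []
    for _ in range(max_tries):
        if len(seeds) == k:
            break
        cand = rng.choice(points)
        if all(dist2(cand, s) >= min_dist ** 2 for s in seeds):
            seeds.append(cand)
    return seeds

def kmeans(points, seeds, max_iter=50):
    """Alternate nearest-mean assignment and mean update until labels stop changing."""
    means = list(seeds)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(means)), key=lambda j: dist2(p, means[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(means)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = (sum(m[0] for m in members) / len(members),
                            sum(m[1] for m in members) / len(members))
    return labels, means

def within_cluster_variance(points, labels, means):
    """Mean squared distance from each point to the mean of its cluster."""
    return sum(dist2(p, means[lab]) for p, lab in zip(points, labels)) / len(points)

# Toy data (made up): two reddish and two greenish pixels.
pixels = [(200, 50, 50), (210, 60, 55), (40, 200, 60), (50, 190, 70)]
points = [to_rg(p) for p in pixels]
seeds = pick_distinct_seeds(points, k=2, min_dist=0.2, rng=random.Random(0))
labels, means = kmeans(points, seeds)
score = within_cluster_variance(points, labels, means)
```

Running the same loop for several values of K and comparing `score` together with the distances between the resulting means is one possible way to approach the smarter variant; the random-seed and histogram-peak variants would only replace `pick_distinct_seeds`.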
Part II: Skin Classification
The goal of this part is to develop a very simple
skin detector from the results of Part I, sticking with normalized (r,g) space.
- Start with the face training image set. You should use half of the face
images for training and leave the other half for testing. Also use the scene
images in your tests to see if your program thinks it finds skin in them.
Note that your training set should include skin from people of all races
for better performance.
- Try several values of K and choose the best (by hand) by how useful
the clusters are for the next step (characterizing skin pixels).
- Examine the clusters in a color-space histogram (can be by hand or automatic)
and come up with a characterization for skin pixels. That is,
create a classifier (which can be as simple as an if-then-else statement)
that classifies pixels (again your code) as "skin" or "not skin" based on the color values.
- Run your skin finder on the whole face image set: both the training
and testing images.
- Report on its performance on both sets.
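To make the "as simple as an if-then-else statement" idea concrete, here is a minimal Python sketch of a rectangular decision region in normalized (r,g) space. The bounds in `r_range` and `g_range` are arbitrary placeholders, not measured values; they would have to be replaced by whatever region your own clusters suggest. The function names and the toy image are likewise hypothetical.

```python
def is_skin(r, g, r_range=(0.40, 0.60), g_range=(0.25, 0.37)):
    """Classify one (r, g) value. The default bounds are placeholders only."""
    if r_range[0] <= r <= r_range[1] and g_range[0] <= g <= g_range[1]:
        return True
    else:
        return False

def skin_mask(image):
    """image: list of rows of (R, G, B) triples. Returns rows of booleans."""
    mask = []
    for row in image:
        out = []
        for (R, G, B) in row:
            total = R + G + B
            if total == 0:
                out.append(False)  # assumption: black pixels are never skin
            else:
                out.append(is_skin(R / total, G / total))
        mask.append(out)
    return mask

def skin_fraction(mask):
    """Fraction of pixels flagged as skin; handy for checking scene images."""
    flat = [v for row in mask for v in row]
    return sum(flat) / len(flat)

# Toy image (made up): one pixel inside the placeholder region, one bluish, one black.
image = [[(200, 140, 110), (30, 60, 200), (0, 0, 0)]]
mask = skin_mask(image)
fraction = skin_fraction(mask)
```

A box is only one possible shape for the region; the same structure works if `is_skin` is replaced by a test against the nearest cluster mean.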
Things to keep in mind:
- Your K-means code should output the following results:
- a grayscale image where each pixel's value is the number of the
cluster to which it has been assigned, which the provided autocolor
function will transform into something more easily interpretable.
Do not forget to create pseudocolor images using the autocolor function
whenever you generate grayscale images from your K-means procedure,
so that the grader can view the results of your code.
- a color image in the ppm format, where each pixel has the mean
color of the cluster it was assigned to (this generally makes a
prettier picture, but it can be harder to tell the number of
clusters).
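The two outputs listed above could be produced along the following lines. This Python sketch is illustrative only: it assumes the grayscale label image is stored as a binary PGM (P5) and the color image as a binary PPM (P6), that there are at most 256 clusters, and that integer-rounded cluster means are acceptable. The function names are hypothetical, and the provided autocolor function is not called here because its interface is not described in this handout.

```python
def pgm_bytes(label_rows):
    """Encode rows of cluster numbers (0..255) as a binary PGM (P5) image."""
    height, width = len(label_rows), len(label_rows[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(v for row in label_rows for v in row)

def ppm_bytes(rgb_rows):
    """Encode rows of (R, G, B) ints (0..255) as a binary PPM (P6) image."""
    height, width = len(rgb_rows), len(rgb_rows[0])
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(c for row in rgb_rows for px in row for c in px)

def mean_color_image(image, labels):
    """Replace every pixel with the (integer) mean RGB of its cluster."""
    sums = {}
    for img_row, lab_row in zip(image, labels):
        for (R, G, B), lab in zip(img_row, lab_row):
            s = sums.setdefault(lab, [0, 0, 0, 0])
            s[0] += R
            s[1] += G
            s[2] += B
            s[3] += 1
    means = {lab: (s[0] // s[3], s[1] // s[3], s[2] // s[3])
             for lab, s in sums.items()}
    return [[means[lab] for lab in lab_row] for lab_row in labels]

# Toy 2x2 image (made up) with two clusters.
image = [[(10, 20, 30), (30, 40, 50)],
         [(200, 0, 0), (100, 0, 0)]]
labels = [[0, 0], [1, 1]]
recolored = mean_color_image(image, labels)
label_file = pgm_bytes(labels)
color_file = ppm_bytes(recolored)
```

Writing `label_file` and `color_file` to disk in binary mode (`open(path, "wb")`) would give files that standard image viewers can open.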
Turn in:
- All of the code that you wrote for Parts I and II above.
Your code must be well commented and in ASCII format so that
the grader can compile it into working binaries.
- A brief report on the performance of your smart K-means algorithm
and your skin classification algorithm, with examples (a few of the
best, worst, and average results will be fine).
Your report must clearly describe and explain the algorithms you developed.
Also include some discussion of failure cases and the limitations of your approach.
The report can be in either MS Word or PDF format.
NOTE: Submission in the html format (web pages) is no longer
allowed.
- Your report must include images that are output from your
algorithms for the purpose of clear exposition. The more the better.
Please email your homework to Masa (mkbsh@cs).
Homework is due on Feb 1 (Thu) by 11:59 PM. Please plan your work early,
much earlier than homework 1, as it takes MUCH longer to do.