CSE/EE 576: Homework Set 2
Color Clustering and Skin Finding
Download the software you need
Download the images you need: faces (zip, tarred gzip) and scenes (zip, tarred gzip)
In this assignment, you will segment images
by color, using the K-means algorithm and some variants.
You will also try to classify the skin areas on the face images
using the generated clusters.
[Figure: face image | segmented by color | skin pixels highlighted]
What You Should Do
Part I: Color Clustering
- First implement a basic K-Means clustering algorithm (your own code; do not
use the Matlab K-Means) in RGB space.
  - First with randomly selected cluster seeds.
  - Next try pre-clustering; that is, cluster a small sample of the image's pixels to find the seeds.
  - Next with a method that you develop for selecting the seeds
intelligently from the image using its color histogram (again, your own
code). The seed selection should be automatic, given the histogram
and the number of seeds to select. One approach is to take the peaks of the color histogram as candidate seeds.
- Now develop and implement a smarter K-Means variant (again, your own code)
that determines the best value for K by evaluating statistics of the clusters.
Some possible methods to try:
  - Start from the color histogram, since K is closely related to the number of peaks in it. Not every peak is needed; keep only the dominant ones, i.e., those that account for at least a certain fraction of the image's pixels.
  - Cluster with several different values of K and pick the best one. The metric could be based on the distance between clusters and the variance within each cluster.
  - You are free to come up with your own methods.
- Test each variant of the above on both the face images
and the scene images and report your results.
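The basic algorithm from the first step above can be sketched as follows. This is a minimal illustration in Python with random seeding only, not the expected solution (the assignment itself is to be done in Matlab or C++). The names `kmeans` and `within_cluster_sse`, the iteration cap, and the choice to leave an empty cluster's center unchanged are all assumptions made for the example.

```python
import random


def kmeans(pixels, k, iters=20, seed=0):
    """Cluster a list of (R, G, B) tuples into k groups.

    Returns (labels, centers); labels[i] is the cluster index of pixels[i].
    Assumes k is no larger than the number of distinct colors in pixels.
    """
    rng = random.Random(seed)
    # Random seeding: pick k distinct colors that occur in the image.
    centers = rng.sample(sorted(set(pixels)), k)
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in pixels
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(pixels, new_labels) if l == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers


def within_cluster_sse(pixels, labels, centers):
    """Sum of squared distances from each pixel to its assigned center."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, centers[l]))
        for p, l in zip(pixels, labels)
    )
```

Reshaping `labels` back to the image's width and height gives the grayscale label image described under "Things to keep in mind". Comparing `within_cluster_sse` across several values of K is one possible ingredient for choosing K, but this quantity tends to shrink as K grows, so it would need to be combined with a between-cluster measure or some penalty on K.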
Part II: Skin Classification
The goal of this part is to develop a very simple
skin detector from the results of Part I. For this part of the homework, we recommend that you work in normalized RGB space.
The common RGB representation of color images is not well suited to characterizing skin color. In RGB space, the triple (R, G, B) encodes not only color but also luminance. Luminance may vary across a person's face because of the ambient lighting, so it is not a reliable cue for separating skin from non-skin regions. Luminance can be removed from the color representation by working in the normalized RGB space, also called chromatic color space. Chromatic colors,
also known as "pure" colors in the absence of luminance, are defined by the simple normalization shown below:
r = R/(R+G+B)
b = B/(R+G+B)
Note: the green component, g = G/(R+G+B), is redundant after the normalization because r+g+b = 1.
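As a small illustration of the normalization, here is a Python sketch. The function name `to_chromatic` and the handling of pure black pixels (where R+G+B = 0 and the ratio is undefined) are assumptions made for this example, not part of the assignment.

```python
def to_chromatic(R, G, B):
    """Map an (R, G, B) pixel to chromatic (r, b) coordinates."""
    total = R + G + B
    if total == 0:
        # Assumption: black has no defined chromaticity, so return the
        # neutral point r = g = b = 1/3.
        return (1 / 3, 1 / 3)
    return (R / total, B / total)


# The same color at two brightness levels maps to the same (r, b) pair.
print(to_chromatic(100, 50, 50))
print(to_chromatic(200, 100, 100))
```

Both calls give the same chromatic coordinates, which is the point of the normalization: scaling a pixel's brightness leaves (r, b) unchanged.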
- Start with the face training image set. Use half of the face images for training and keep the other half, together with the scene images, for testing. For better performance, include skin of all ethnicities in your training set.
- Run your K-means algorithm on the face training set
to get K clusters with small K, i.e., K < 9.
- Examine the clusters in a color-space histogram (by hand or automatically)
and come up with a characterization for skin pixels. That is,
create a classifier (which can be as simple as an if-then-else statement)
that classifies pixels as "skin" or "not skin" based on the color values. One way to go is to model the skin color distribution as a Gaussian.
With this Gaussian-fitted skin color model, you can now obtain the likelihood of skin for any pixel of an image.
- Run your skin finder on the face test image set.
- Report on its performance.
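The Gaussian option mentioned above could look roughly like the Python sketch below. It assumes you have already collected chromatic (r, b) samples from the clusters you judged to be skin. The function names, the use of an unnormalized likelihood, and the 0.5 threshold are all hypothetical choices for illustration; a suitable threshold would have to be tuned on the training set.

```python
import math


def fit_gaussian(samples):
    """Estimate the mean and 2x2 covariance of (r, b) skin samples."""
    n = len(samples)
    mr = sum(r for r, _ in samples) / n
    mb = sum(b for _, b in samples) / n
    # Sample covariance entries (divide by n - 1).
    srr = sum((r - mr) ** 2 for r, _ in samples) / (n - 1)
    sbb = sum((b - mb) ** 2 for _, b in samples) / (n - 1)
    srb = sum((r - mr) * (b - mb) for r, b in samples) / (n - 1)
    return (mr, mb), (srr, srb, sbb)


def skin_likelihood(x, mean, cov):
    """Unnormalized Gaussian likelihood exp(-0.5 * d^T C^-1 d), in (0, 1]."""
    (mr, mb), (srr, srb, sbb) = mean, cov
    det = srr * sbb - srb * srb  # assumed nonzero (non-degenerate samples)
    dr, db = x[0] - mr, x[1] - mb
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    m = (sbb * dr * dr - 2 * srb * dr * db + srr * db * db) / det
    return math.exp(-0.5 * m)


def is_skin(x, mean, cov, threshold=0.5):
    """Classify a chromatic pixel; the threshold is a hypothetical default."""
    return skin_likelihood(x, mean, cov) >= threshold
```

A pixel at the fitted mean gets likelihood 1, and the value falls toward 0 as the pixel's chromaticity moves away from the skin cluster. Thresholding that value is the "if-then-else" classifier in its simplest form.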
Things to keep in mind:
- In addition to RGB color space, you may want to try HSV (HSI) color space, or any other color space
that strikes your fancy.
- Your K-means code should output a grayscale image where each pixel's value is the number of
the cluster to which it has been assigned. The provided autocolor function will transform this output
into something more easily interpretable.
What You Should Turn In
Remember that you should put headers on all your routines with the following information:
- NAME (of you)
- DATE
- TITLE (of routine)
- PURPOSE (of routine)
- PARAMETERS (of routine)
You should also report the performance of your smart K-means algorithm and of your skin classification algorithm by providing examples. Also include
a discussion of failure cases and limitations of your approach;
this will point toward future improvements.
If you choose to do your homework in Matlab, or C++ on linux, please turn in your homework to Colin (kzheng@cs).
If you choose to do your homework in C++ on Windows, please turn in your homework to Yi (yi@cs).
Your homework turn-in can be a Word document, a PDF document, or a set of web pages.
Homework is due on April 30th, 5pm. Please plan your work early.