CSE/EE 576: Project 1
Download the software you need
Color Clustering and Skin Finding
Download the images you need: faces.zip and
In this assignment, you will segment images
by color, using the K-means algorithm and some variants.
You will also try to classify the skin areas on the face images
using the generated clusters.
[Example images: segmented by color; skin pixels highlighted]
What You Should Do
Part I: Color Clustering
- First implement a basic K-Means clustering algorithm (your own code; do not
use the MATLAB K-Means). It will probably be easiest to start off in RGB space.
- First with randomly selected cluster seeds
- Next try pre-clustering: cluster a small sample of the image's pixels to find the seeds
- Next with a method that you develop for selecting the seeds
intelligently from the image using its color histogram (again, your own
code). The seed selection should be automatic, given the histogram
and the number of seeds to be selected. One option is to treat the peaks in the color histogram as candidate seeds.
- Now develop and implement a smarter K-Means variant (your own code again)
that determines the best value for K by evaluating statistics of the clusters.
Some possible methods to try:
- You can start from the color histogram, since K is closely related to the number of peaks in the histogram. Not all peaks are needed: you want only the dominant ones, so pick those that account for at least a certain fraction of the image's pixels.
- You can also try clustering with several values of K and pick the best one. The metric could relate the distance between clusters to the variance within each cluster.
- You are free to come up with your own ways.
- Test each variant of the above on both the face images
and the scene images and report your results.
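As a rough illustration of the basic algorithm and two of the seeding options above, here is a minimal sketch in Python. It is not a reference solution, and you still need to write your own code. The function names (`dist2`, `random_seeds`, `histogram_peak_seeds`, `kmeans`) and the `bins` and `max_iters` parameters are made up for this example. It assumes pixels are given as a flat list of (R, G, B) tuples with values 0 to 255, and it approximates "histogram peaks" by simply taking the most populated histogram cells, which is cruder than finding true local maxima.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two color triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_seeds(pixels, k, rng):
    """Pick k distinct pixel colors at random as initial cluster centers."""
    return rng.sample(sorted(set(pixels)), k)


def histogram_peak_seeds(pixels, k, bins=8):
    """Use the centers of the k most populated histogram cells as seeds.

    Assumption: the most populated cells stand in for histogram peaks.
    Fewer than k seeds are returned if fewer than k cells are occupied.
    """
    width = 256 // bins
    counts = {}
    for p in pixels:
        cell = tuple(c // width for c in p)
        counts[cell] = counts.get(cell, 0) + 1
    # Sort by descending count; break ties by cell index for determinism.
    top = sorted(counts, key=lambda cell: (-counts[cell], cell))[:k]
    return [tuple(i * width + width / 2 for i in cell) for cell in top]


def kmeans(pixels, seeds, max_iters=50):
    """Return (centers, labels) after alternating assignment and update steps."""
    centers = [tuple(float(c) for c in s) for s in seeds]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each pixel goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
            for p in pixels
        ]
        if new_labels == labels:
            break  # no pixel changed cluster, so the result has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, labels
```

A call such as `kmeans(pixels, random_seeds(pixels, 4, random.Random(0)))` runs the random-seed variant, and swapping in `histogram_peak_seeds(pixels, 4)` runs the histogram-based one. Comparing the two on the same image is one way to see how much the seeds matter.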
Part II: Skin Classification
The goal of this part is to develop a very simple
skin detector from the results of Part I. For this part of the homework, we recommend that you work in normalized RGB space.
The common RGB representation of color images is not suitable for characterizing skin color. In RGB space, the triple (R, G, B) encodes not only color but also luminance. Luminance may vary across a person's face with the ambient lighting, so it is not a reliable cue for separating skin from non-skin regions. Luminance can be removed by converting to normalized RGB space, also called chromatic color space. Chromatic colors,
also known as "pure" colors in the absence of luminance, are defined by the simple normalization process shown below:
r = R/(R+G+B)
b = B/(R+G+B)
Note: The green component g = G/(R+G+B) is redundant after the normalization because r+g+b = 1.
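The normalization above can be sketched in a few lines of Python. The function name `to_chromatic` is hypothetical, and the handling of pure black pixels (where R+G+B = 0 and the ratio is undefined) is an assumption of this sketch, not something the assignment specifies.

```python
def to_chromatic(R, G, B):
    """Map an RGB triple to chromatic (r, b); g = 1 - r - b is implied."""
    total = R + G + B
    if total == 0:
        # Assumption: black has no defined chromaticity, so use neutral gray.
        return (1 / 3, 1 / 3)
    return (R / total, B / total)
```

Scaling a pixel's brightness leaves the result unchanged: (100, 50, 50) and (200, 100, 100) both map to (0.5, 0.25), which is the property that makes this space useful for skin detection.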
- Start with the face training image set. Use half of the face images for training and hold out the other half for testing, together with the scene images. Note that your training set should cover the full range of skin tones for better performance.
- Run your K-means algorithm on the face training set
to get K clusters with small K, i.e., K < 9.
- Examine the clusters in a color-space histogram (by hand or automatically)
and come up with a characterization for skin pixels. That is,
create a classifier (which can be as simple as an if-then-else statement)
that classifies pixels as "skin" or "not skin" based on the color values. One way to go is to model the skin color distribution as a Gaussian.
With this Gaussian-fitted skin color model, you can now obtain the likelihood of skin for any pixel of an image.
- Run your skin finder on the face test image set.
- Report on its performance.
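If you take the Gaussian route mentioned above, the model can be sketched as follows. This is one possible formulation under stated assumptions, not the required method: the function names, the use of chromatic (r, b) pairs as features, the unnormalized likelihood, and the `threshold` value of 0.5 are all choices made for this example. The training samples are assumed to be the (r, b) values of pixels you have already identified as skin, and they must not all lie on one line, or the covariance determinant will be zero.

```python
import math


def fit_gaussian(samples):
    """Estimate the mean and 2x2 covariance of a list of (r, b) samples."""
    n = len(samples)
    mr = sum(r for r, _ in samples) / n
    mb = sum(b for _, b in samples) / n
    srr = sum((r - mr) ** 2 for r, _ in samples) / n
    sbb = sum((b - mb) ** 2 for _, b in samples) / n
    srb = sum((r - mr) * (b - mb) for r, b in samples) / n
    return (mr, mb), (srr, srb, sbb)


def skin_likelihood(x, mean, cov):
    """Unnormalized Gaussian likelihood exp(-d^2 / 2), a value in (0, 1]."""
    srr, srb, sbb = cov
    det = srr * sbb - srb * srb
    dr, db = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    d2 = (sbb * dr * dr - 2 * srb * dr * db + srr * db * db) / det
    return math.exp(-0.5 * d2)


def is_skin(x, mean, cov, threshold=0.5):
    """Classify a chromatic pixel; the threshold is a tunable assumption."""
    return skin_likelihood(x, mean, cov) >= threshold
```

The threshold trades false positives against false negatives, so it is worth reporting results for more than one value.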
Things to keep in mind:
- In addition to RGB color space, you may want to try HSV (HSI) color space, or any other color space
that strikes your fancy.
- Your K-means code should output an image that can be used
to show your clusters. This can be a grayscale image where each
pixel's value is the number of the cluster to which it has been
assigned, which the provided autocolor function will transform into something more easily interpretable. It could also be a PPM image where each pixel has the mean color of the cluster it was assigned to (this generally makes a prettier picture, but it can be harder to tell the number of clusters).
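Both output options can be sketched with the plain-text Netpbm formats. The function names below are hypothetical, and the sketch assumes `labels` is a flat, row-major list of cluster numbers and `centers` holds each cluster's mean color. Whether the provided autocolor function accepts a plain-text PGM is not something this sketch can confirm, so check what input it expects.

```python
def labels_to_pgm(labels, width, height):
    """Plain-text PGM (P2) whose gray values are the cluster numbers."""
    rows = [
        " ".join(str(lab) for lab in labels[y * width:(y + 1) * width])
        for y in range(height)
    ]
    return "P2\n{} {}\n255\n".format(width, height) + "\n".join(rows) + "\n"


def labels_to_ppm(labels, centers, width, height):
    """Plain-text PPM (P3) painting each pixel in its cluster's mean color."""
    rows = []
    for y in range(height):
        row = labels[y * width:(y + 1) * width]
        rows.append(" ".join(
            "{} {} {}".format(*(int(round(c)) for c in centers[lab]))
            for lab in row
        ))
    return "P3\n{} {}\n255\n".format(width, height) + "\n".join(rows) + "\n"
```

Writing the returned string to a file, for example with `open("clusters.ppm", "w")`, gives an image that most viewers can open. With small K the PGM looks almost black, because the gray values are just 0, 1, 2 and so on, which is why a recoloring step helps.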
What You Should Turn In
Remember that you should put headers on all your routines with the following information:
- NAME (of you)
- TITLE (of routine)
- PURPOSE (of routine)
- PARAMETERS (of routine)
Write a brief report on the performance of your smart K-means algorithm as well as the skin classification algorithm, and provide examples (a few best, worst and average results will be fine). Also include
some discussion of failure cases or limitations of your approach;
this will shed light on future improvements. This report can be a Word document, PDF document or webpage.
Email me (Lillie) your writeup and all of your source code by Friday April 14th. (Whenever; I'll start grading them on Saturday) Please include "576 project 1" in the subject of the email. Please include your name in the name of your writeup file.