CSE 455 Autumn 2010: HW4
Skin Finding

Date released: Wednesday, October 27, 2010

Date due: Wednesday, Nov 3, 2010 11:59pm

(Late policy: 5% off per day late until Nov 5)

This assignment follows from HW3. No additional software is provided or needed.

Download the images you need:
    1. Faces training set
    2. Faces testing set
    3. Faces training groundtruth set
    4. Faces testing groundtruth

In this assignment, you will use the best clustering algorithm you developed for HW3 to find clusters in normalized r-g space and use them to classify the skin areas in images containing faces. You will use the training data to train the Naive Bayes classifier in the WEKA package to determine which clusters are skin and which are not. You will then apply the trained classifier to the test data set and compute your accuracy as a percentage, using the groundtruth we provide.


[Figure, left to right: face image; segmented by color; skin pixels highlighted]

What You Should Do

The goal of this assignment is to develop a very simple skin detector from the results of HW3. For this part of the homework, you will work in normalized RGB space.

The common RGB representation of color images is not suitable for characterizing skin color. In RGB space, the triple (R, G, B) represents not only color but also luminance. Luminance may vary across a person's face due to the ambient lighting and is not a reliable measure for separating skin from non-skin regions. Luminance can be removed from the color representation by working in normalized RGB space, also called chromatic color space. Chromatic colors, also known as "pure" colors in the absence of luminance, are defined by the simple normalization process shown below:

r = R/(R+G+B)

g = G/(R+G+B)

Note: with b = B/(R+G+B) defined the same way, r+g+b = 1, so you don't need b.
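
As an illustration of the normalization above, here is a minimal Python sketch. The function name normalize_rg is hypothetical, and the handling of a pure black pixel (R+G+B = 0, where the formulas are undefined) is an assumption; the assignment does not say what to do in that case.

```python
def normalize_rg(R, G, B):
    """Return the chromatic coordinates (r, g) of one RGB pixel."""
    total = R + G + B
    if total == 0:
        # Assumption: a pure black pixel has no defined chromaticity,
        # so map it to the neutral point r = g = 1/3.
        return (1.0 / 3.0, 1.0 / 3.0)
    return (R / total, G / total)


# Two pixels of the same hue at different brightness levels.
bright = normalize_rg(200, 100, 100)
dim = normalize_rg(100, 50, 50)
```

Both pixels map to the same (r, g) pair, which is the point of the normalization: the brightness difference between them is removed and only the chromatic part remains.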

1. Start with the face training image set (face-training.zip).
2. Run your K-means algorithm on the face training set to get K clusters with small K, i.e., K < 9.
3. Keep the cluster index for each pixel and the centroid of each cluster (its average r and g values).
4. Use the groundtruth images (face-training-grountruth.zip) to assign the final cluster label (skin or non-skin) for training. You can use the majority vote of the pixels in each cluster as the cluster label.
5. Use a classifier to learn the skin model. You can use WEKA and test out your model using the Naive Bayes classifier and at least one other. A short WEKA tutorial is provided here. You will need to generate your training and testing data files in the proper ARFF format. The training and testing data files will contain the cluster centroids and the cluster labels (Sample of training and testing ARFF file). You want to obtain a skin model with as high a classification accuracy as possible.
6. Test your skin model on both the face training image set and the face test image set. Since you trained the classifier based on clusters, you will also test it based on clusters. The classifier will give you the confusion matrix and cluster accuracy score for each image in terms of correct and incorrect cluster labels.
7. Report on its performance: give the classification accuracy and include images of the results. For the output image results, you can use binary images, or you can use color images and turn the pixels of the clusters classified as skin to a color that stands out from the rest. For the accuracy you report, please compute pixel accuracy for each image rather than cluster label accuracy: label the pixels of the image with their cluster labels and use the groundtruth to determine what percentage of the pixels have correct labels. Finally, make sure you include accuracy and image results for the following face images: face01, face04, face05, face08, face10, face23, face28.
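
The majority-vote labeling, the ARFF output, and the pixel accuracy described in the steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the required implementation: all function names are hypothetical, pixels are given as flat lists, the groundtruth is assumed to be already decoded into one true/false skin flag per pixel, ties in the vote go to non-skin, and the ARFF relation and attribute names are made up and may differ from the sample file provided with the assignment.

```python
from collections import Counter


def label_clusters(cluster_ids, truth_is_skin):
    """Majority vote: a cluster is 'skin' if more than half of its
    pixels are skin in the groundtruth (assumption: ties go to nonskin)."""
    skin = Counter()
    total = Counter()
    for cid, is_skin in zip(cluster_ids, truth_is_skin):
        total[cid] += 1
        if is_skin:
            skin[cid] += 1
    return {cid: "skin" if 2 * skin[cid] > total[cid] else "nonskin"
            for cid in total}


def arff_text(centroids, labels, relation="skin_clusters"):
    """Render one ARFF data row per cluster: centroid r, centroid g, label.
    The relation and attribute names here are assumptions."""
    lines = [
        "@relation " + relation,
        "@attribute r numeric",
        "@attribute g numeric",
        "@attribute class {skin,nonskin}",
        "@data",
    ]
    for cid in sorted(centroids):
        r, g = centroids[cid]
        lines.append("%.4f,%.4f,%s" % (r, g, labels[cid]))
    return "\n".join(lines) + "\n"


def pixel_accuracy(cluster_ids, cluster_labels, truth_is_skin):
    """Fraction of pixels whose cluster label agrees with the groundtruth."""
    correct = sum(
        (cluster_labels[cid] == "skin") == bool(is_skin)
        for cid, is_skin in zip(cluster_ids, truth_is_skin)
    )
    return correct / len(cluster_ids)


# Toy example: 8 pixels in 3 clusters, with made-up centroids.
cluster_ids = [0, 0, 0, 1, 1, 1, 1, 2]
truth = [1, 1, 0, 0, 0, 0, 1, 0]
centroids = {0: (0.45, 0.30), 1: (0.33, 0.34), 2: (0.30, 0.40)}

labels = label_clusters(cluster_ids, truth)
arff = arff_text(centroids, labels)
accuracy = pixel_accuracy(cluster_ids, labels, truth)
```

In the toy example, cluster 0 is voted skin and clusters 1 and 2 non-skin, and 6 of the 8 pixels end up with the correct label. Note that pixel accuracy weights each cluster by its size, so it can differ from the cluster label accuracy that the classifier reports. For the test set, the labels passed to pixel_accuracy would come from the trained classifier rather than from the vote.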

What You Should Turn In

1. All of the code you created for this assignment. Your code must be well commented and in plain ASCII text so that the grader can compile it into working binaries.
You should put headers on all your routines with the following information:

Your NAME
DATE
TITLE of the routine
DESCRIPTION of the routine
PARAMETERS of the routine

2. A brief report on the performance of your skin classification. Provide your classification accuracy and some image results. Also include some discussion of failure examples or limitations of the method.

This report can be a Word document or a PDF document. HTML or web pages are not accepted. Your report must include output images of your algorithm.

Download the .doc template for the report (HW4-report.doc). You are free to use other text processing tools such as LaTeX; however, make sure that your report has the same sections.

Evaluation

Download grading guidelines here.

Please email your code and report to Alfred (alfredg@cs) in a zip file with your name as the zip file name, e.g. JohnDoe.zip, by Wednesday, Nov 3, 2010 11:59pm.
