Question Set 2: Cropping and Finding Faces

All required photos are shown below. Min step, max step, and step size are shown in parentheses.

Unless otherwise stated, the photos below were captured using this error forumla to score a face:
error = MSE * (face_distance_from_average) / (face_variance * face_variance)
However, for some photos I use this forumla instead (I make this clear when I use this):
error = MSE * (face_distance_from_average) / face_variance
The photos using this second forumla are marked below. My turned in version uses the first forumla because it works better on the group photos we are being tested on. However, I found that second forumla works better in some instances, as I will discuss below.

Cropping the Test Photo (.8 .86 .05)




Marking Group 7 (.8 .86 .05)


Using e = (MSE*dist) / var^2 (forumla 1):

Using e = (MSE*dist) / var (forumla 2):


Marking Group 4 (.8 .86 .05)


Using e = (MSE*dist) / var^2 (forumla 1):

Using e = (MSE*dist) / var (forumla 2):


Cropping Myself (.3 .3 .05)

At first I tried using larger scales (.7 - 1), and while I was cropping a portion of my face, it wasn't my entire face. Reducing the scale down to .3 proved to be a good value that captured the majority of my face.



Marking a Group of Friends And I(.3 .85 .05)


Using e = (MSE*dist) / var^2 (forumla 1):

Using e = (MSE*dist) / var (forumla 2):
Notice this is better using the second forumla!

I had to use a wide range of scales to get the best results in this image. Observe that a lot of the heads are different sizes. For example, the girl in red in the back row has an extremely small face compared to the guy in the white in the first row. Using a wide range allowed me to best capture this disparity. Also notice which faces were not captured. The two girls in the back row have hair partially covering their face which may have messed things up. The guy in the back middle also has hair covering his entire forehead and parts of his eye and eyebrow. The guy in the white in the front row has a pretty large forehead (it doesn't look like this normally!), which may have screwed with the algorithm. The guy on the far right has a slanted face (he wasn't captured).



Discussion of False Positives

Forumla 1: (MSE * diff) / var^2
Forumla 2: (MSE * diff) / var

I found these results very interesting. In the class group photos (group 7 and group 4) we are better off using the variance squared version of the forumla (4 versus 3 correct faces out of 4). However, in the larger group photo of my friends the variance squared version does significanly worse (2 versus 6 correct faces out of 11). Why would this be?

Here are the results from the group 7 photo version using the second forumla:
Top 4 faces:
(error:120.963, sx:71, sy:118, s:0.8, mse:433.146, diff:949.439, var:3399.75)
(error:147.298, sx:175, sy:123, s:0.85, mse:518.096, diff:930.635, var:3273.34)
(error:207.054, sx:277, sy:129, s:0.85, mse:532.915, diff:977.71, var:2516.42)
(error:256.017, sx:416, sy:83, s:0.8, mse:708.639, diff:860.621, var:2382.15) -> FALSE POSITIVE
Result number four is the rectangle on the wall to the far right (the false positive). You can tell this because its scaled x value is the farthest right. Notice that for this rectangle the variance is smallest out of the four, but not by much. This makes sense because it is an image of a single colored wall, however, we might expect the difference in variance between the wall and a face to be greater. If we increase the weight of variance to variance squared, we achieve the effect of pronouncing smaller differences in variances and penalize walls by a greater amount. When switching over to forumla 1 we observe that all four faces are properly identified instead of just three faces and the wall. It seems as this change is for the best.

However, observe what happens in the group photo of my friends. The variance squared version does much worse! Observe the differences between the false positives in the two photos. Using forumla 1 all the false positives are areas of high constract, which means large variance values. Using forumla 2 we are less sensitive to variance and are better able to detect faces.

My conclusion here is that there is a fine balance to be acheived when scoring faces in the algorithm. If you put two much emphasis on a single quality, like variance, you don't do well when the background of a photo that has a lot of contrast. If you know the background is a solid wall, you want to emphasize variance because it will be more pronounced in the faces and not in the background. I have a hunch that implementing a skin classifier in addition to this algorithm would go a long way to aiding the results.

Back