Name of Reviewer
------------------
Andrew Guillory

Key Contribution
------------------
Summarize the paper's main contribution(s). Address yourself to both the class and to the authors, both of whom should be able to agree with your summary.

The paper's main contribution is a new framework and method for training cascade classifiers. The framework poses a joint optimization problem over all classifiers in the cascade, and the method iteratively cycles through the classifiers, retraining each in turn. The paper shows this method gives better performance than the standard greedy method, and it proves the method's convergence.

Novelty
--------
Does this paper describe novel work? If you deem the paper to lack novelty please cite explicitly the published prior work which supports your claim. Citations should be sufficient to locate the paper and page unambiguously. Do not cite entire textbooks without a page reference.

Yes, I believe the work is novel. Cascade classifiers are not new, but I am not aware of a cascade classifier training method that does not use a greedy sequential approach. The greedy approach seems clearly suboptimal in that it only solves a local problem for each classifier, and this paper addresses that deficiency.

Reference to prior work
-----------------------
Please cite explicitly any prior work which the paper should cite.

The paper cites previous work on cascade classifiers. It would be nice to include a reference giving background on SVMs, but this is not a major omission, as that background is easy to find.

Clarity
-------
Does it set out the motivation for the work, relationship to previous work, details of the theory and methods, experimental results and conclusions as well as can be expected in the limited space available? Can the paper be read and understood by a competent graduate student? Are terms defined before they are used? Is appropriate citation made for techniques used?

The paper is clear.
To understand the paper, some background knowledge of cascade classifiers and linear SVMs is probably necessary, but this is a reasonable assumption, and references are given for previous cascade classifier work.

Technical Correctness
---------------------
You should be able to follow each derivation in most papers. If there are certain steps which make overly large leaps, be specific here about which ones you had to skip.

I could mostly follow the derivation. I skipped the details of the convergence result, but I followed the rest, and it seemed correct. The switch from the primal to the dual is fairly straightforward.

Experimental Validation
-----------------------
For experimental papers, how convinced are you that the main parameters of the algorithms under test have been exercised? Does the test set exercise the failure modes of the algorithm? For theoretical papers, have worked examples been used to sanity-check theorems? Speak about both positive and negative aspects of the paper's evaluation.

In their experiments section the authors compare their method to the greedy AdaBoost cascade method and to SVMs. I think the experiments are very reasonable. One experiment the authors do not perform, which I would have liked to see, is the greedy method with their linear classifiers (as opposed to AdaBoost) as the base classifiers. The AdaBoost cascade is not directly comparable to their method because the base classifiers differ. Comparing their method with a greedily trained linear classifier cascade would show that their method is better regardless of the base classifier used.

Overall Evaluation
------------------
Overall the paper is well written and presents a novel framework and method that addresses a known deficiency of previous cascade classifier methods.

Questions and Issues for Discussion
-----------------------------------
What questions and issues are raised by this paper? What issues do you think this paper does not address well? How can the work in this paper be extended?
Aside from the missing experiment comparing their method with a greedily trained cascade of linear classifiers, I don't have any issues with the paper. One interesting area of future work could be exploring alternative methods for solving the joint optimization problem. The cyclic approach they use seems reasonable, but there are certainly other ways to perform the optimization.
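To make the cyclic training idea concrete, here is a minimal sketch of coordinate-wise cascade training as I understand it: each stage is repeatedly re-fit while the other stages are held fixed, so every stage sees examples filtered by all the other stages, including later ones, which greedy sequential training never does. This is purely my own illustration; the toy per-feature threshold learner (`fit_stage`), the function names, and the fixed sweep count are my assumptions, standing in for the paper's actual linear-SVM stages and convergence criterion.

```python
def fit_stage(values, labels):
    """Toy stage learner: the largest threshold that still accepts every
    positive example (a crude stand-in for training a linear SVM stage)."""
    positives = [v for v, y in zip(values, labels) if y == 1]
    return min(positives) if positives else float("inf")

def cascade_predict(thresholds, x):
    """A point is accepted only if it passes every stage of the cascade."""
    return all(x[i] >= t for i, t in enumerate(thresholds))

def train_cascade_cyclic(points, labels, n_sweeps=3):
    """Cyclically re-fit each stage while holding the others fixed."""
    n_stages = len(points[0])                  # one stage per feature, for simplicity
    thresholds = [float("-inf")] * n_stages    # stages start wide open
    for _ in range(n_sweeps):                  # fixed sweeps stand in for a convergence test
        for i in range(n_stages):
            # Stage i is re-fit only on points surviving all *other* stages.
            survivors = [(x[i], y) for x, y in zip(points, labels)
                         if all(x[j] >= thresholds[j]
                                for j in range(n_stages) if j != i)]
            if survivors:
                vals, ys = zip(*survivors)
                thresholds[i] = fit_stage(vals, ys)
    return thresholds

# Tiny 2-D example: two positives and two negatives.
points = [(2.0, 3.0), (1.5, 2.5), (0.2, 2.8), (1.8, 0.1)]
labels = [1, 1, 0, 0]
ts = train_cascade_cyclic(points, labels)   # both negatives end up rejected
```

The point of the sketch is the inner loop: each stage's training set depends on the current state of every other stage, which is exactly the coupling a greedy front-to-back pass ignores.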