TIME: 1:30-2:30 pm, Tuesday, November 18, 2008 PLACE: CSE 503 SPEAKER: Ben Recht Cal Tech TITLE: Pulling Rank: Inference from Incomplete Data ABSTRACT: How can we infer answers from a survey that is only partially filled out? Suppose we ask a large collection of individuals a series of questions. We collect some data but, unfortunately, many questions are left unanswered. Is it possible for us to make an educated guess about what the missing answers should be? How can we make such a guess? In general, with no additional assumptions, this is impossible. However, if we assume that we can arrange all of the answers into a low rank matrix, there is often a unique assignment for the missing entries. In this talk, I will discuss very general settings in which one can perfectly recover all of the missing entries of a low-rank matrix from a sufficiently large random subset of entries by solving a convex programming problem. Out of all of the matrices whose entries agree with the given subset, this program finds the matrix with the minimum nuclear norm (also known as the Ky-Fan norm or the Schatten 1-norm). I will show that if the number of sampled entries is greater than C n^{1.25} r log n for some positive numerical constant C, then with very high probability, most n x n matrices of rank r are perfectly recovered by the nuclear norm heuristic. The techniques for proving the main results build upon geometric ideas from the the recent literature on "Compressed Sensing" together with probabilistic tools such as the powerful techniques of Bourgain and of Rudelson for bounding norms of operators between Banach spaces. These results illustrate a general program for perfectly reconstructing geometric objects from very limited information. BIO: Benjamin Recht is a senior postdoctoral fellow at Center for the Mathematics of Information---a multidisciplinary research center established to promote information science and technology at Caltech. His research focuses on scalable computational tools based on convex optimization and randomized algorithms for large-scale data analysis, system identification, machine learning, and mathematical physiology. Benjamin received his B.S. with honors in Mathematics from the University of Chicago in 2000, and received a M.S. in 2002 and PhD in 2006 from the MIT Media Laboratory.