- Time: 1:30 – 2:20pm, Tuesday, November 3, 2009
- Place: CSE 503
- Speaker: Piotr Indyk, MIT

In recent years, a new *linear* method for compressing high-dimensional
data (e.g., images) has been discovered. For any high-dimensional vector x, its
*sketch* is equal to Ax, where A is an m x n matrix (possibly chosen at random).
Although typically the sketch length m is much smaller than the number of
dimensions n, the sketch contains enough information to recover an
*approximation* to x. At the same time, the linearity of the sketching method
is very convenient for many applications, such as data stream computing and
compressed sensing.
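
To see why linearity is convenient in the data stream setting, here is a minimal sketch in Python. It assumes a dense Gaussian matrix as one possible random choice of A; the function names, the sizes n and m, and the update stream are illustrative and not taken from the talk. Because the sketch is Ax, an update to a single coordinate of x changes the sketch by a multiple of one column of A, so the sketch can be maintained without storing x.

```python
import random

def make_sketch_matrix(m, n, seed=0):
    """Return an m x n matrix with i.i.d. Gaussian entries (an assumed choice of A)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

def sketch(A, x):
    """Compute the sketch Ax as a list of m numbers."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def update_sketch(A, s, i, delta):
    """Update the sketch s in place after x[i] += delta, using only column i of A."""
    for r in range(len(s)):
        s[r] += A[r][i] * delta

n, m = 1000, 50                      # illustrative sizes, with m much smaller than n
A = make_sketch_matrix(m, n)
x = [0.0] * n                        # kept here only to check the result
s = [0.0] * m                        # the sketch of the zero vector is zero

# A hypothetical stream of (coordinate, increment) updates.
updates = [(3, 2.0), (17, -1.5), (3, 0.5), (999, 4.0)]
for i, delta in updates:
    x[i] += delta
    update_sketch(A, s, i, delta)

# The incrementally maintained sketch agrees with A x computed from scratch,
# up to floating-point rounding.
direct = sketch(A, x)
max_gap = max(abs(a - b) for a, b in zip(s, direct))
```

Recovering an approximation to x from s is a separate step and is not shown here.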

The major sketching approaches can be classified as either combinatorial (using sparse sketching matrices) or geometric (using dense sketching matrices). They achieve different trade-offs, notably between the compression rate and the running time. Thus, it is desirable to understand the connections between them, with the goal of obtaining the "best of both worlds" solution.
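
The running-time side of this trade-off can be illustrated with a small Python sketch. It assumes a sparse binary matrix with d ones per column for the combinatorial case and a Gaussian matrix for the dense case; both constructions, the parameter values, and the function names are illustrative assumptions rather than the specific matrices discussed in the talk. The point is only that a single-coordinate update touches d sketch entries with the sparse matrix and all m entries with the dense one.

```python
import random

def sparse_columns(m, n, d, seed=0):
    """For each of n columns, pick d distinct rows that hold a 1; all other entries are 0."""
    rng = random.Random(seed)
    return [rng.sample(range(m), d) for _ in range(n)]

def dense_matrix(m, n, seed=0):
    """Return an m x n matrix with i.i.d. Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

def sparse_update(s, cols, i, delta):
    """Apply x[i] += delta to the sketch s; return how many entries were touched."""
    for r in cols[i]:
        s[r] += delta
    return len(cols[i])

def dense_update(s, A, i, delta):
    """Apply x[i] += delta to the sketch s; return how many entries were touched."""
    for r in range(len(s)):
        s[r] += A[r][i] * delta
    return len(s)

m, n, d = 64, 500, 8                 # illustrative sizes
cols = sparse_columns(m, n, d)
A = dense_matrix(m, n)

s_sparse = [0.0] * m
s_dense = [0.0] * m
sparse_touched = sparse_update(s_sparse, cols, 42, 1.0)
dense_touched = dense_update(s_dense, A, 42, 1.0)
print(sparse_touched, dense_touched)  # d entries versus m entries
```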

Several recent results have established such connections, indicating that the two approaches are different manifestations of the same underlying phenomenon. This has enabled the development of novel algorithms, including the first algorithms that provably achieve the (asymptotically) optimal compression rate and near-linear recovery time simultaneously.

In this talk we give an overview of the results in the area and look at some of them in more detail. In particular, we will describe a new algorithm called "Sequential Sparse Matching Pursuit (SSMP)". In addition to having the aforementioned theoretical guarantees, the algorithm works well on real data, with its recovery quality often exceeding that of more complex algorithms, such as l_1 minimization.

Joint work with Radu Berinde, Anna Gilbert, Howard Karloff, Milan Ruzic, and Martin Strauss.