Speaker | Eli Schechtman |
Date | March 02, 2007 |
Time | 3:00PM to 4:00PM |
Place | GRAIL (CSE 291) |
Analysis and detection of objects in images or actions in video sequences require a complex notion of similarity across visual data. Existing approaches are often based on extracting informative parameters or models learned from many prior examples of the visual data (object or action) of interest. These approaches, however, are often restricted to a small set of pre-defined classes of visual data, and do not generalize to scenarios with unfamiliar objects/actions. Moreover, in many realistic settings one has only a single example of the object or action of interest, or even no explicit example whatsoever of what one is looking for.
In this talk I will show how one can infer the global similarity of different complex visual data by employing local similarities within and across these data. I will demonstrate the power of this approach through several example problems (a rough code sketch of the underlying local-similarity measure follows the list below). These include:
(i) Prediction of missing visual information in images and videos.
(ii) Detection and retrieval of complex objects in cluttered images using a single example -- often only a rough hand-sketch of the object of interest.
(iii) Detection of complex actions performed by differently dressed people against different backgrounds, based on a single example clip (without requiring fg/bg segmentation or motion estimation).
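To make "local similarities within and across visual data" concrete, here is a minimal sketch of a local self-similarity descriptor in that spirit, assuming only numpy. The patch/region sizes, the log-polar binning (n_angles x n_radii), and the noise-variance constant are illustrative assumptions for this sketch, not the exact parameters of the work presented in the talk.

    import numpy as np

    def self_similarity_descriptor(img, cx, cy, patch=5, region=41,
                                   n_angles=12, n_radii=3, var_noise=25.0):
        # Local self-similarity descriptor at interior pixel (cx, cy):
        # correlate a small patch with every patch in its surrounding
        # region, then max-pool the correlation surface into log-polar
        # bins. (Sizes and constants here are illustrative assumptions.)
        img = img.astype(np.float64)
        hp, hr = patch // 2, region // 2
        ref = img[cy - hp:cy + hp + 1, cx - hp:cx + hp + 1]

        # SSD between the central patch and each displaced patch.
        ssd = np.zeros((region, region))
        for dy in range(-hr, hr + 1):
            for dx in range(-hr, hr + 1):
                cand = img[cy + dy - hp:cy + dy + hp + 1,
                           cx + dx - hp:cx + dx + hp + 1]
                ssd[dy + hr, dx + hr] = np.sum((ref - cand) ** 2)

        # Correlation surface: near 1 where the region repeats the patch.
        corr = np.exp(-ssd / var_noise)

        # Max-pool into log-polar bins, which tolerates small local
        # deformations of the repeating structure.
        desc = np.zeros(n_angles * n_radii)
        for dy in range(-hr, hr + 1):
            for dx in range(-hr, hr + 1):
                r = np.hypot(dx, dy)
                if r < 1 or r > hr:
                    continue
                a = int((np.arctan2(dy, dx) + np.pi)
                        / (2 * np.pi) * n_angles) % n_angles
                rad = min(int(np.log(r) / np.log(hr) * n_radii), n_radii - 1)
                b = a * n_radii + rad
                desc[b] = max(desc[b], corr[dy + hr, dx + hr])

        # Normalize so descriptors are comparable across image locations.
        return desc / (desc.max() + 1e-8)

Descriptors computed this way at many points of a query and a target can then be compared with a simple vector distance, and the local matches aggregated into a global similarity score; this is the local-to-global step the abstract describes, not a definitive implementation of it.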
Joint work with Michal Irani (the first part also with Yonatan Wexler).