CSE 590V: Computer vision seminar

Fall 2011

[Image: "Late stroll" by Leonid Afremov]

Course description

CSE 590V is a seminar/reading group focused on recent work in computer vision. We will cover papers from recent and upcoming conferences related to computer vision (CVPR, ICCV, ECCV, NIPS, SIGGRAPH). The seminar is open to everyone. We especially encourage first-year graduate students who are considering research in computer vision or related areas to participate.


Time: Tuesdays, 1:30pm-2:30pm

Location: CSE 403

Organizers: Neeraj Kumar (neeraj @ cs washington edu) and Bryan Russell (bcr @ cs washington edu)

Class mailing list: cse590v @ cs washington edu (subscribe here)


Each week we will cover a recent topic in computer vision by reading and discussing one or more relevant papers. One person will lead the discussion by presenting the chosen paper(s) for the week. We encourage all attendees to read the paper(s) beforehand and to participate actively in the discussion.

Each registered student is expected to attend all classes and to prepare a presentation (duration to be determined) on one or more selected papers. We will assign topics and papers during the first week based on preferences.

Each presenter will meet with the organizers the Friday before the class date to discuss the upcoming presentation, show prepared slides, and resolve any questions.


Schedule

Each entry lists the date, topic, and presenters, followed by the papers and slides.
Oct 4th Datasets and active learning Neeraj and Bryan

  • Unbiased Look at Dataset Bias. Antonio Torralba, Alyosha Efros. CVPR 2011. (PDF, website)
  • A Large-scale Benchmark Dataset for Event Recognition in Surveillance Video. Sangmin Oh, Anthony Hoogs, A. G. Amitha Perera, Chia-Chih Chen, Jong Taek Lee, Jake Aggarwal, Hyungtae Lee, Larry Davis, Xiaoyang Wang, Eran Swears, Qiang Ji, Kishore Reddy, Mubarak Shah, Carl Vondrick, Hamed Pirsiavash, Deva Ramanan, Jenny Yuen, Antonio Torralba, Bi Song, Anesco Fong, Amit Roy-Chowdhury, Mita Desai. CVPR 2011. (PDF)
  • Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. Sudheendra Vijayanarasimhan, Kristen Grauman. CVPR 2011. (PDF, website)
  • Visual Recognition With Humans in the Loop. Steven Branson, Catherine Wah, Florian Schroff, Boris Babenko, Peter Welinder, Pietro Perona, Serge Belongie. ECCV 2010. (PDF, website)


Slides: vision into the wild
Oct 11th Attributes Ricardo Martin

To read:

  • Relative Attributes. Devi Parikh, Kristen Grauman. ICCV 2011. (PDF, website)


  • Automatic Attribute Discovery and Characterization. Tamara Berg, Alexander Berg, Jonathan Shih. ECCV 2010. (PDF)
  • Attribute Learning in Large-scale Datasets. O. Russakovsky, L. Fei-Fei. Workshop on Parts and Attributes, in association with ECCV 2010. (PDF)
  • Interactively Building a Discriminative Vocabulary of Nameable Attributes. Devi Parikh, Kristen Grauman. CVPR 2011. (PDF, website)

Oct 18th Poselets Michael Krainin

To read:

  • Describing People: A Part-Based Approach to Attribute Classification. Lubomir Bourdev, Subhransu Maji, Jitendra Malik. ICCV 2011. (PDF, website)


  • Object Segmentation by Alignment of Poselet Activations to Image Contours. Thomas Brox, Lubomir Bourdev, Subhransu Maji, Jitendra Malik. CVPR 2011. (PDF)
  • Action Recognition from a Distributed Representation of Pose and Appearance. Subhransu Maji, Lubomir Bourdev, Jitendra Malik. CVPR 2011. (PDF, website)

Oct 25th Person detection Ankit Gupta, Supasorn Suwajanakorn

To read:

  • Articulated Pose Estimation with Flexible Mixtures-of-Parts. Yi Yang, Deva Ramanan. CVPR 2011. (PDF, website)


  • Recognition Using Visual Phrases. Ali Farhadi, Mohammad Amin Sadeghi. CVPR 2011. (PDF)
  • Finding the Weakest Link in Person Detectors. Devi Parikh, Larry Zitnick. CVPR 2011. (PDF, website)

Slides: visual phrases, pose estimation
Nov 1st Events & actions Jinna Lei

To read:

  • A data-driven approach for event prediction. Jenny Yuen, Antonio Torralba. ECCV 2010. (PDF)


  • Actom Sequence Models for Efficient Action Detection. Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid. CVPR 2011. (PDF)
  • Human Action Recognition by Learning Bases of Action Attributes and Parts. Bangpeng Yao, Xiaoye Jiang, Aditya Khosla, Andy Lai Lin, Leonidas J. Guibas, Li Fei-Fei. ICCV 2011. (PDF)

Nov 8th Multi-view geometry Rahul Garg, Avanish Kushal

To read:

  • Discrete-Continuous Optimization for Large-scale Structure from Motion. David Crandall, Andrew Owens, Noah Snavely, Daniel Huttenlocher. CVPR 2011. (PDF, website)


  • Semantic structure from motion. Sid Ying-Ze Bao, Silvio Savarese. CVPR 2011. (PDF, website)
  • Multi-View Reconstruction Preserving Weakly-Supported Surfaces. Michal Jancosek, Tomas Pajdla. CVPR 2011. (PDF, online demo)

Slides: Large-scale SfM, Multi-view reconstruction
Nov 15th Crowds & videos / social networks Aditya Sankar, Ezgi Mercan

To read:

  • Data-driven Crowd Analysis in Videos. Mikel Rodriguez, Josef Sivic, Ivan Laptev, Jean-Yves Audibert. ICCV 2011. (PDF)
  • Seeing with Social Context: Recognizing People and Social Relationships. Gang Wang, Andrew Gallagher, Jiebo Luo, David Forsyth. ECCV 2010. (PDF)


  • Density-aware person detection and tracking in crowds. Mikel Rodriguez, Ivan Laptev, Josef Sivic, Jean-Yves Audibert. ICCV 2011. (PDF)


Slides: face social context
Nov 22nd Misc/cool papers Daniel Leventhal

To read:

  • Motion Denoising with Application to Time-lapse Photography. Michael Rubinstein, Ce Liu, Bill Freeman. CVPR 2011. (PDF, website)


  • Microgeometry Capture Using an Elastomeric Sensor. Micah Johnson, Forrester Cole, Alvin Raj, Edward Adelson. SIGGRAPH 2011. (PDF, website)
  • Wide-angle Micro Sensors for Vision on a Tight Budget. Sanjeev Koppal, Todd Zickler, Ioannis Gkioulekas. CVPR 2011. (PDF, website)

Slides: time lapse
Nov 29th Shading and lighting Aaron Bauer, Dan Butler

To read:

  • Rendering Synthetic Objects into Legacy Photographs. K. Karsch, V. Hedau, D. Forsyth, D. Hoiem. SIGGRAPH Asia 2011. (PDF, website)
  • Shape Estimation in Natural Illumination. Micah Johnson, Edward Adelson. CVPR 2011. (PDF, website)


  • High-Frequency Shape and Albedo from Shading using Natural Image Statistics. Jonathan Barron, Jitendra Malik. CVPR 2011. (PDF)
  • Single-Image Shadow Detection and Removal using Paired Regions. Ruiqi Guo, Qieyun Dai, Derek Hoiem. CVPR 2011. (PDF, website)

Slides: shape estimation
Dec 6th RGB-D perception Peter Henry

To read:

  • KinectFusion: Real-time 3D Reconstruction and Interaction Using a Moving Depth Camera. Shahram Izadi, Richard Newcombe, David Kim, Otmar Hilliges, David Molyneaux, Steve Hodges, Pushmeet Kohli, Andrew Davison, Andrew Fitzgibbon. To appear at UIST 2011. (PDF, ISMAR paper)


  • Real-time Human Pose Recognition in Parts from Single Depth Images. Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Andrew Blake. CVPR 2011. (PDF)
  • Indoor Scene Segmentation using a Structured Light Sensor. Nathan Silberman and Rob Fergus. 3DRR at ICCV 2011. (PDF, website)
  • Semantic Labeling of 3D Point Clouds for Indoor Scenes. Hema Koppula, Abhishek Anand, Thorsten Joachims, Ashutosh Saxena. NIPS 2011. (website)


Topics not covered in class

1. Scene understanding

  • Understanding scenes on many levels. J. Tighe and S. Lazebnik. ICCV 2011. (PDF)
  • Characterizing Structural Relationships in Scenes Using Graph Kernels. M. Fisher, M. Savva, P. Hanrahan. SIGGRAPH 2011. (PDF)
  • Scene Recognition and Weakly Supervised Object Localization with Deformable Part-Based Models. M. Pandey and S. Lazebnik. ICCV 2011. (PDF)

2. Large scale recognition

  • Combining Randomization and Discrimination for Fine-Grained Image Categorization. Bangpeng Yao, Aditya Khosla, Li Fei-Fei. CVPR 2011. (PDF)
  • Iterative Quantization: A Procrustean Approach to Learning Binary Codes. Yunchao Gong and Svetlana Lazebnik. CVPR 2011. (PDF, website)
  • Learning to Share Visual Appearance for Multiclass Object Detection. Ruslan Salakhutdinov, Antonio Torralba, Josh Tenenbaum. CVPR 2011. (PDF)

3. Learning

  • Parsing Natural Scenes and Natural Language with Recursive Neural Networks. Richard Socher, Cliff Chiung-Yu Lin, Andrew Y. Ng, Christopher D. Manning. ICML 2011. (PDF)
  • Learning Image Representations from the Pixel Level via Hierarchical Sparse Coding. Kai Yu, Yuanqing Lin, and John Lafferty. CVPR 2011. (PDF)
  • Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks. Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng. CACM 2011. (PDF)
  • Learning Convolutional Feature Hierarchies for Visual Recognition. K. Kavukcuoglu, P. Sermanet, Y. Boureau, K. Gregor, M. Mathieu, Y. LeCun. NIPS 2010. (PDF)

4. Cross-domain/multi-modal learning & matching

  • Data-driven Visual Similarity for Cross-domain Image Matching. Abhinav Shrivastava, Tomasz Malisiewicz, Abhinav Gupta, Alexei A. Efros. SIGGRAPH Asia 2011. (website)
  • Multimodal Templates for Real-Time Detection of Texture-less Objects in Heavily Cluttered Scenes. Stefan Hinterstoisser, Stefan Holzer, Cedric Cagniart, Slobodan Ilic, Kurt Konolige, Nassir Navab, Vincent Lepetit. ICCV 2011. (PDF, website)

5. Language

  • Baby Talk: Understanding and Generating Image Descriptions. Girish Kulkarni, Visruth Premraj, Sagnik Dhar, Siming Li, Alexander Berg, Yejin Choi, Tamara Berg. CVPR 2011. (PDF)
  • Evaluating Knowledge Transfer and Zero-Shot Learning in a Large-Scale Setting. Marcus Rohrbach, Michael Stark, Bernt Schiele. CVPR 2011. (PDF, website)

6. Cognitive science & saliency

  • What makes an image memorable? Phillip Isola, Jianxiong Xiao, Aude Oliva, Antonio Torralba. CVPR 2011. (PDF, website)
  • Fixations on Low-Resolution Images. T. Judd, F. Durand, A. Torralba. Journal of Vision 2011. (PDF, website, game)