Lectures: Wednesdays 6:30-9:20pm
Location and Attendance Options:
Attending via livestream or watching the posted lecture recordings [link] is possible, but these are the less-preferred options.
Office Hours: see Canvas
TAs: Kalyani Marathe, Svetoslav Kolev
A master's-level course in computer vision, emphasizing the fundamentals of geometry and image formation as well as deep learning and image understanding.
Grading, Late Policy, and Collaboration Policy: see Canvas
Date | Lecture | Description | Notes and Resources |
---|---|---|---|
9/29 | Introduction | | [CVA2] Ch. 1 |
| | Image Formation | Geometric and Photometric Image Formation, Pinhole Camera, Lenses, Sensors, Colour, Gamma, DCT, Image Coding | [CVA2] Ch. 2 |
10/6 | Filtering and Pyramids | Linear + Non-Linear Filtering, Correlation, Convolution, Gaussian + Laplacian Pyramids, Sampling and Aliasing | [CVA2] Ch. 3.2, 3.5 |
| | Features and Matching | Detection, Correspondence, Edges, Corners, Regions, Patch Matching, SIFT, Shape Context, Learning Features | [CVA2] Ch. 7; Project 1 start |
10/13 | Planar Geometry | 2D Transforms: Euclidean, Similarity, Affine, Projective, Camera Models: Perspective, Projective, Linear, Viewing planes, Lines and Camera Rotation | [CVA2] Ch. 3.6 |
| | RANSAC | Least Squares 2-view Alignment, Outliers, Robust Line Fitting, RANSAC, Minimal Subsets | [CVA2] Ch. 8.1, 8.2 |
10/20 | Epipolar Geometry | Epipolar Lines, Plane Constraint, Fundamental/Essential Matrix, 8-point algorithm, Triangulation, 2-view SFM | [CVA2] Ch. 11.3; Project 2 start |
| | Multiview Alignment and SFM | Multiview Alignment, Residuals, Error Function, Structure from Motion, Bundle Adjustment, Pose Estimation, Triangulation | [CVA2] Ch. 8.3, 8.4, 11.4; [Panorama stitching by Brown & Lowe]; [ORB-SLAM by Mur-Artal et al.] |
10/25 | Project 1 due | ||
10/27 | Stereo | Stereo matching, local + global, multiview stereo, plane sweep, volumetric, depth map merging, photometric stereo | [CVA2] Ch. 12 |
| | Depth + Flow | Depth imaging + fusion, signed distance functions, non-rigid matching, optical flow, Lucas-Kanade algorithm | [CVA2] Ch. 13.1, 13.2, 13.3, 13.5, Ch. 9.1; PlaneSweep ipynb, LucasKanade ipynb (notebooks by Steven Lovegrove and Richard Newcombe) |
11/3 | Linear Classification | Visual classification intro, object recognition, instance, category, classification vs detection, linear classification, 2-class, N-class, linear and softmax regression | [CVA2] Ch. 6.1, 6.2; [ESL] Ch. 2.3; Project 3 start |
| | Visual Classification 2 | Fundamentals and Pre-Deep Learning Classification, Bayesian classifiers, Gaussian distributions, PCA, LDA, Decision Forests, Visual words, SVMs | [DL] Ch. 5 |
11/8 | Project 2 due | ||
11/10 | Neural Networks | Feature extraction, end to end learning, multiple linear layers, activation functions, biological neurons, space warping, universal approximation, convex optimization | [CVA2] Ch. 5.3, 5.4.0, 5.4.1; [DL] Ch. 6; [Slides for Week 7 by Justin Johnson] |
| | Backpropagation | Chain rule, computational gradients, forward/reverse mode autodiff, upstream/local gradients, flat backprop, modular design, scalar/vector/tensor backprop, matrix multiplication example | |
| | Convolutional Networks | Convolutional layers, activation maps, dimension mappings, receptive fields, strides, pooling, LeNet5 example | |
11/17 | Advanced CNNs | CNN building blocks, dropout, batch norm, factorized convolutions, residual connections, popular architectures: AlexNet, VGG, GoogLeNet, ResNet, MobileNet, SE-Net | Project 4 start: assignment PDF and starter code |
| | Object Detection | Motivation + applications, sliding windows, anchor based detection, single-stage and two-stage architectures, evaluation metrics, IoU, precision-recall, mAP, practical tips | [CVA2] Ch. 6.3; [Slides for Week 8 by Jonathan Huang] |
11/22 | Project 3 due | ||
11/24 | NO CLASS | ||
12/1 | Tracking, Part 1 | Motivation, probabilistic formulation, linear dynamical systems, multiple-hypothesis tracking (MHT), Bayesian filtering with CONDENSATION and RJ MCMC | [Course on Tracking at Linköping University] [Course on SLAM and Tracking at University of Freiburg] |
| | Tracking, Part 2 | Case studies: tracking as online learning, correlation filters, tracking with Siamese networks, graph-theoretic formulations | |
12/8 | Vision and Language | "Visual Tracking and Retrieval by Natural Language Descriptions": guest lecture by Qi (Fred) Feng, Boston University | [CVA2] Ch. 6.6; [Vision & Language]; [Real-time Tracking with NL]; [Siamese Natural Language Tracker] |
| | Deep Learning in 3D | Single-view, 2-view, multi-view depth, deep learning with points, meshes, voxels, SDFs, neural scene representation and rendering | |
| | Project 4 due | | [Buried in Syllabus, Prize Remains Unfound] |