Lectures: Tuesdays 6:30-9:20pm
Office Hours: Thursdays 5:30-6:30pm, or by appointment
TAs: Nishat Khan, Dianqi Li
A master's course in computer vision, emphasizing the fundamentals of geometry and image formation as well as deep learning and image understanding.
Grading: The grade is based on four equally weighted projects. Each project is a mix of coding and written answers. See the course overview below for hand-in dates. Late policy: assignments are accepted up to 5 days after the deadline, with 10% subtracted from the mark per day late.
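As a rough illustration of the grading arithmetic, here is a minimal Python sketch. It rests on assumptions the policy does not spell out: marks are on a 0-100 scale, "10%" means 10 points of that scale per day (not 10% of the earned mark), lateness is counted in whole days, and work handed in more than 5 days late scores zero. The function names `late_adjusted_mark` and `course_grade` are hypothetical and not part of any course tooling.

```python
from typing import Sequence

MAX_DAYS_LATE = 5       # stated in the late policy
PENALTY_PER_DAY = 10.0  # ASSUMPTION: 10 points on a 0-100 scale per day late


def late_adjusted_mark(mark: float, days_late: int) -> float:
    """Apply the late penalty to a single project mark (0-100 scale)."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > MAX_DAYS_LATE:
        # ASSUMPTION: work that is no longer accepted counts as zero.
        return 0.0
    # Subtract the per-day penalty, never going below zero.
    return max(0.0, mark - PENALTY_PER_DAY * days_late)


def course_grade(project_marks: Sequence[float]) -> float:
    """Average the four equally weighted project marks."""
    if len(project_marks) != 4:
        raise ValueError("expected exactly four project marks")
    return sum(project_marks) / 4
```

Under this reading, a project marked 80 and handed in two days late would count as 60. The instructors' actual interpretation may differ.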
Collaboration Policy: Discussing ideas or general strategies is fine, but students should not share solutions or code.
| Date | Lecture | Description | Notes and Resources |
|---|---|---|---|
| 3/31 | Introduction | | Week 1 Notes |
| | Image Formation | Geometric and Photometric Image Formation, Pinhole Camera, Lenses, Sensors, Colour, Gamma, DCT, Image Coding | |
| 4/7 | Filtering and Pyramids | Linear + Non-Linear Filtering, Correlation, Convolution, Gaussian + Laplacian Pyramids, Sampling and Aliasing | Week 2 Notes; Project 1 start |
| | Features and Matching | Detection, Correspondence, Edges, Corners, Regions, Patch Matching, SIFT, Shape Context, Learning Features | |
| 4/14 | Planar Geometry | 2D Transforms: Euclidean, Similarity, Affine, Projective, Camera Models: Perspective, Projective, Linear, Viewing planes, Lines and Camera Rotation | Week 3 Notes |
| | RANSAC | Least Squares 2-view Alignment, Outliers, Robust Line Fitting, RANSAC, Minimal Subsets | |
| 4/21 | Epipolar Geometry | Epipolar Lines, Plane Constraint, Fundamental/Essential Matrix, 8-point algorithm, Triangulation, 2-view SFM | Week 4 Notes; Project 2 start |
| | Multiview Alignment and SFM | Multiview Alignment, Residuals, Error Function, Structure from Motion, Bundle Adjustment, Pose Estimation, Triangulation | |
| 4/26 | Project 1 due | | |
| 4/28 | Stereo | Stereo matching, local + global, multiview stereo, plane sweep, volumetric, depth map merging, photometric stereo | Week 5 Notes |
| | Depth + Flow | Depth imaging + fusion, signed distance functions, non-rigid matching, optical flow, Lucas-Kanade algorithm | PlaneSweep ipynb, LucasKanade ipynb. Notebooks by Steven Lovegrove, Richard Newcombe |
| 5/5 | Linear Classification | Visual classification intro, object recognition, instance, category, classification vs detection, linear classification, 2-class, N-class, linear and softmax regression | Week 6 Notes; Project 3 start, Part 1 and Part 2 |
| | Visual Classification 2 | Fundamentals and Pre-Deep Learning Classification, Bayesian classifiers, Gaussian distributions, PCA, LDA, Decision Forests, Visual words, SVMs | |
| 5/10 | Project 2 due | | |
| 5/12 | Neural Networks | Feature extraction, end-to-end learning, multiple linear layers, activation functions, biological neurons, space warping, universal approximation, convex optimization | Week 7 Notes; Slides for Week 7 by Justin Johnson |
| | Backpropagation | Chain rule, computational gradients, forward/reverse mode autodiff, upstream/local gradients, flat backprop, modular design, scalar/vector/tensor backprop, matrix multiplication example | |
| | Convolutional Networks | Convolutional layers, activation maps, dimension mappings, receptive fields, strides, pooling, LeNet5 example | |
| 5/19 | Advanced CNNs | CNN building blocks, dropout, batch norm, factorized convolutions, residual connections, popular architectures: AlexNet, VGG, GoogLeNet, ResNet, MobileNet, SE-Net | Project 4 start |
| | Object Detection | Motivation + applications, sliding windows, anchor-based detection, single-stage and two-stage architectures, evaluation metrics, IoU, precision-recall, mAP, practical tips | |
| 5/24 | Project 3 due | | |
| 5/26 | Segmentation | Dense prediction, semantic, instance, panoptic segmentation, keypoint estimation, fully convolutional nets, atrous, transpose convolution | |
| | Single-View Depth, Superres, Colorization | Pixel labelling, single-view depth estimation, direct, self-supervision, super-resolution, colorization, image translation | |
| 6/2 | Deep Learning in 3D | Single-view, 2-view, multi-view depth, deep learning with points, meshes, voxels, SDFs, neural scene representation and rendering | Week 10 Notes |
| | Image Generation and GANs | Loss functions: L2, VGG, adversarial, texture synthesis, style transfer, generative adversarial nets, image generation, conditional GANs, image translation, pix2pix | |
| | Project 4 due | | |