Tentative Schedule

Date | Content | Reading | Slides and Notes
Intro
3/29 Tu | Introduction, neural network basics | Chapters 1-4 of Dive into Deep Learning, Zhang et al.; https://playground.tensorflow.org/ | Lecture 1, Lecture 1 (annotated)
Approximation Theory
3/31 Th | 1D and multivariate approximation | Chapters 1-2 of Matus Telgarsky's notes | Lecture 2, Lecture 2 (annotated), scribed notes on 1D and multivariate approximation
4/5 Tu | Barron's theory, depth separation | Chapters 3 and 5 of Matus Telgarsky's notes | Lecture 3, Lecture 3 (annotated)
Optimization
4/7 Th | Backpropagation, auto-differentiation | Chapter 4 of Dive into Deep Learning, Zhang et al. | Lecture 4, Lecture 4 (annotated)
4/12 Tu | Clarke differential, auto-balancing | Chapter 9 of Matus Telgarsky's notes; Chapter 11 of Dive into Deep Learning, Zhang et al.; Du et al. on auto-balancing | Lecture 5, Lecture 5 (annotated), scribed notes on Clarke differential, positive homogeneity, and auto-balancing
4/14 Th | Advanced optimizers, techniques for improving optimization | Optimizer visualization; Chapter 11 of Dive into Deep Learning, Zhang et al. | Lecture 6, Lecture 6 (annotated)
4/19 Tu | Techniques for improving optimization, optimization landscape | He et al. on Kaiming initialization; blog on escaping saddle points; blog on how to escape saddle points efficiently | Lecture 7, Lecture 7 (annotated)
4/21 Th | Optimization landscape (cont'd), global convergence of gradient descent | Du et al. on global convergence of gradient descent | Lecture 8, Lecture 8 (annotated)
Generalization
4/26 Tu | Finish the proof of global convergence of gradient descent; neural tangent kernel; measures of generalization | Jacot et al. on Neural Tangent Kernel; Arora et al. on Neural Tangent Kernel; Zhang et al. on rethinking generalization in deep learning | Proof of global convergence of gradient descent (annotated), Lecture 9, Lecture 9 (annotated), scribed notes on global convergence of gradient descent
4/28 Th | Techniques for improving generalization, generalization bounds for deep learning | Chapter 4 of Dive into Deep Learning, Zhang et al.; Chapters 11-13 of Matus Telgarsky's notes | Lecture 10, Lecture 10 (annotated)
5/3 Tu | Norm-based generalization bounds, double descent, implicit bias | Chapters 10 and 14 of Matus Telgarsky's notes; Jiang et al. on different generalization measures; Belkin et al. on double descent | Lecture 11, Lecture 11 (annotated)
Neural Network Architecture
5/5 Th | Separation between neural networks and kernels, intro to convolutional neural networks | Chapters 6-7 of Dive into Deep Learning, Zhang et al. | Lecture 12, Lecture 12 (annotated), scribed notes on separation between NN and kernel
5/10 Tu | Convolutional neural networks, recurrent neural networks | Chapters 6-8 of Dive into Deep Learning, Zhang et al. | Lecture 13, Lecture 13 (annotated)
5/12 Th | LSTM, attention mechanism | Chapters 9-10 of Dive into Deep Learning, Zhang et al. | Lecture 14, Lecture 14 (annotated)
Representation Learning
5/17 Tu | Transformers; desiderata for representation learning; theory of multi-task representation learning | Chapter 10 of Dive into Deep Learning, Zhang et al.; Du et al. on representation learning; Tripuraneni et al. on representation learning | Lecture 15 on transformer, Lecture 15 on representation learning, Lecture 15 on transformer (annotated), Lecture 15 on representation learning (annotated)
5/19 Th | Word embedding, autoencoder, self-supervised learning, contrastive learning | Bengio et al. on representation learning | Lecture 16, Lecture 16 (annotated)
Generative Models
5/24 Tu | Desiderata for generative models, GANs (on Zoom) | Chapter 17 of Dive into Deep Learning, Zhang et al. | Lecture 17, Lecture 17 (annotated)
5/26 Th | Variational autoencoders, energy-based models, normalizing flows (on Zoom) | | Lecture 18, Lecture 18 (annotated)
Course Presentations
5/31 Tu | Project Presentations (on Zoom)
6/2 Th | Project Presentations (on Zoom)