Date | Content | Reading | Slides and Notes |
---|---|---|---|
Intro | |||
9/25 Th | Introduction, neural network basics | Chapter 1-4 of Dive into Deep Learning, Zhang et al., https://playground.tensorflow.org/ | Lecture 1, Lecture 1 (annotated) |
Approximation Theory | |||
9/30 Tu | 1D and multivariate approximation (Zoom) | Chapter 1,2 of Matus Telgarsky's notes | Lecture 2, Lecture 2 (annotated), scribed notes on 1D approximation, multivariate approximation, and Barron's theory |
10/2 Th | Barron's theory (Zoom) | Chapter 3,5 of Matus Telgarsky's notes | Lecture 3, Lecture 3 (annotated) |
Optimization | |||
10/7 Tu | Depth separation, backpropagation, auto-differentiation | Chapter 4 of Dive into Deep Learning, Zhang et al., Chapter 9 of Matus Telgarsky's notes | Lecture 4, Lecture 4 (annotated) |
10/9 Th | Clarke differential, auto-balancing | Chapter 9 of Matus Telgarsky's notes, Chapter 12 of Dive into Deep Learning, Zhang et al., Du et al. on auto-balancing, Optimizer visualization | Lecture 5, Lecture 5 (annotated), scribed notes on Clarke differential, positive homogeneity and auto-balancing |
10/14 Tu | Advanced optimizers | Chapter 12 of Dive into Deep Learning, Zhang et al. | Lecture 6, Lecture 6 (annotated) |
10/16 Th | Important techniques for improving optimization, optimization landscape | He et al. on Kaiming initialization, blog on escaping saddle points, blog on how to escape saddle points efficiently | Lecture 7 |
10/21 Tu | Global convergence of gradient descent for over-parameterized neural networks | Du et al. on global convergence of gradient descent | |
Generalization | |||
10/23 Th | Neural tangent kernel, measures of generalization, techniques for improving generalization, generalization theory for deep learning | Jacot et al. on Neural Tangent Kernel, Arora et al. on Neural Tangent Kernel, Zhang et al. on rethinking generalization in deep learning | |
10/28 Tu | Generalization theory for deep learning, separation between neural networks and kernels, double descent, implicit bias | Chapter 10-14 of Matus Telgarsky's notes, Jiang et al. on different generalization measures, Belkin et al. on double descent, Allen-Zhu and Li on separation between neural networks and kernels | |
Neural Network Architecture | |||
10/30 Th | Introduction to convolutional neural networks, advanced convolutional neural networks | Chapter 7,8 of Dive into Deep Learning, Zhang et al. | |
11/4 Tu | Recurrent neural networks, LSTM | Chapter 9, 10 of Dive into Deep Learning, Zhang et al. | |
11/6 Th | Attention mechanism | Chapter 11 of Dive into Deep Learning, Zhang et al. | |
Representation learning and generative models | |||
11/11 Tu | Veterans Day | ||
11/13 Th | Desiderata for representation learning, self-supervised learning | Bengio et al. on representation learning | |
11/18 Tu | Contrastive learning, CLIP, desiderata for generative models | Chapter 20 of Dive into Deep Learning, Zhang et al., CLIP paper | |
11/20 Th | GANs, variational autoencoders | Chapter 20 of Dive into Deep Learning, Zhang et al. | |
11/25 Tu | Energy-based models, normalizing flows, score-based models, diffusion models | Yang Song's blog on score-based models, Lilian Weng's blog on diffusion models | |
11/27 Th | Thanksgiving | ||
Course Presentations | |||
12/2 Tu | Project Presentation (on Zoom) | ||
12/4 Th | Project Presentation (on Zoom) |