Tentative Schedule

Date Content Reading Slides and Notes
Intro
9/26 Th Introduction, neural network basics (Zoom Recording) Chapters 1-4 of Dive into Deep Learning, Zhang et al.
https://playground.tensorflow.org/
Lecture 1, Lecture 1 (annotated)
Approximation Theory
10/1 Tu 1D and multivariate approximation Chapters 1, 2 of Matus Telgarsky's notes Lecture 2, Lecture 2 (annotated), scribed notes on 1D, multivariate approximation, and Barron's theory
10/3 Th Barron's theory Chapters 3, 5 of Matus Telgarsky's notes Lecture 3, Lecture 3 (annotated)
Optimization
10/8 Tu Depth separation, backpropagation, auto-differentiation Chapter 4 of Dive into Deep Learning, Zhang et al., Chapter 9 of Matus Telgarsky's notes Lecture 4, Lecture 4 (annotated)
10/10 Th Clarke differential, auto-balancing Chapter 9 of Matus Telgarsky's notes, Chapter 12 of Dive into Deep Learning, Zhang et al., Du et al. on auto-balancing, Optimizer visualization Lecture 5, Lecture 5 (annotated), scribed notes on Clarke differential, positive homogeneity and auto-balancing
10/15 Tu Advanced optimizers Chapter 12 of Dive into Deep Learning, Zhang et al. Lecture 6, Lecture 6 (annotated)
10/17 Th Important techniques for improving optimization, optimization landscape He et al. on Kaiming initialization, blog on escaping saddle points, blog on how to escape saddle points efficiently Lecture 7
10/22 Tu Global convergence of gradient descent for over-parameterized neural networks Du et al. on global convergence of gradient descent
Generalization
10/24 Th Neural tangent kernel, measures of generalization, techniques for improving generalization Jacot et al. on Neural Tangent Kernel, Arora et al. on Neural Tangent Kernel, Zhang et al. on rethinking generalization in deep learning
10/29 Tu Generalization theory for deep learning, separation between neural networks and kernels Chapters 10-14 of Matus Telgarsky's notes, Jiang et al. on different generalization measures, Belkin et al. on double descent, Allen-Zhu and Li on separation between neural networks and kernels
Neural Network Architecture
10/31 Th Double descent, implicit bias, introduction to convolutional neural networks, advanced convolutional neural networks Chapters 7, 8 of Dive into Deep Learning, Zhang et al.
11/5 Tu Recurrent neural networks, LSTM Chapters 9, 10 of Dive into Deep Learning, Zhang et al.
11/7 Th Attention mechanism, desiderata for representation learning Chapter 11 of Dive into Deep Learning, Zhang et al., Bengio et al. on representation learning
Representation learning, Pre-training, Fine-tuning
11/12 Tu Self-supervised learning, contrastive learning Chapter 11 of Dive into Deep Learning, Zhang et al.
11/14 Th Deep reinforcement learning, decision transformer
Generative models
11/19 Tu Desiderata for generative models, GAN Chapter 20 of Dive into Deep Learning, Zhang et al.
11/21 Th Variational autoencoder, energy models Chapter 20 of Dive into Deep Learning, Zhang et al.
11/26 Tu Normalizing flows, score-based models, diffusion models Yang Song's blog on score-based models, Lilian Weng's blog on diffusion models
11/28 Th Thanksgiving
Course Presentations
12/3 Tu Project Presentation (on Zoom)
12/5 Th Project Presentation (on Zoom)