Intro

3/29 Tu
Introduction, neural network basics
Readings: Chapters 1-4 of Dive into Deep Learning, Zhang et al.; https://playground.tensorflow.org/
Lecture materials: Lecture 1, Lecture 1 (annotated)

Approximation Theory

3/31 Th
1D and multivariate approximation
Readings: Chapters 1-2 of Matus Telgarsky's notes
Lecture materials: Lecture 2, Lecture 2 (annotated), scribed notes on 1D and multivariate approximation

4/5 Tu
Barron's theory, depth separation
Readings: Chapters 3 and 5 of Matus Telgarsky's notes
Lecture materials: Lecture 3, Lecture 3 (annotated)

Optimization

4/7 Th
Backpropagation, auto-differentiation
Readings: Chapter 4 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 4, Lecture 4 (annotated)

4/12 Tu
Clarke differential, auto-balancing
Readings: Chapter 9 of Matus Telgarsky's notes; Chapter 11 of Dive into Deep Learning, Zhang et al.; Du et al. on auto-balancing
Lecture materials: Lecture 5, Lecture 5 (annotated), scribed notes on the Clarke differential, positive homogeneity, and auto-balancing

4/14 Th
Advanced optimizers, techniques for improving optimization
Readings: Optimizer visualization; Chapter 11 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 6, Lecture 6 (annotated)

4/19 Tu
Techniques for improving optimization, optimization landscape
Readings: He et al. on Kaiming initialization; blog on escaping saddle points; blog on how to escape saddle points efficiently
Lecture materials: Lecture 7, Lecture 7 (annotated)

4/21 Th
Optimization landscape (cont'd), global convergence of gradient descent
Readings: Du et al. on global convergence of gradient descent
Lecture materials: Lecture 8, Lecture 8 (annotated)

Generalization

4/26 Tu
Finish the proof of global convergence of gradient descent; neural tangent kernel; measures of generalization
Readings: Jacot et al. on the Neural Tangent Kernel; Arora et al. on the Neural Tangent Kernel; Zhang et al. on rethinking generalization in deep learning
Lecture materials: Proof of global convergence of gradient descent (annotated), Lecture 9, Lecture 9 (annotated), scribed notes on global convergence of gradient descent

4/28 Th
Techniques for improving generalization, generalization bounds for deep learning
Readings: Chapter 4 of Dive into Deep Learning, Zhang et al.; Chapters 11-13 of Matus Telgarsky's notes
Lecture materials: Lecture 10, Lecture 10 (annotated)

5/3 Tu
Norm-based generalization bounds, double descent, implicit bias
Readings: Chapters 10 and 14 of Matus Telgarsky's notes; Jiang et al. on different generalization measures; Belkin et al. on double descent
Lecture materials: Lecture 11, Lecture 11 (annotated)

Neural Network Architecture

5/5 Th
Separation between neural networks and kernels; intro to convolutional neural networks
Readings: Chapters 6-7 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 12, Lecture 12 (annotated), scribed notes on the separation between NN and kernel

5/10 Tu
Convolutional neural networks, recurrent neural networks
Readings: Chapters 6-8 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 13, Lecture 13 (annotated)

5/12 Th
LSTM, attention mechanism
Readings: Chapters 9-10 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 14, Lecture 14 (annotated)

Representation Learning

5/17 Tu
Transformer; desiderata for representation learning; theory for multi-task representation learning
Readings: Chapter 10 of Dive into Deep Learning, Zhang et al.; Du et al. on representation learning; Tripuraneni et al. on representation learning
Lecture materials: Lecture 15 on transformer, Lecture 15 on representation learning, Lecture 15 on transformer (annotated), Lecture 15 on representation learning (annotated)

5/19 Th
Word embedding, auto-encoder, self-supervised learning, contrastive learning
Readings: Bengio et al. on representation learning
Lecture materials: Lecture 16, Lecture 16 (annotated)

Generative Models

5/24 Tu
Desiderata for generative models, GAN (on Zoom)
Readings: Chapter 17 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 17, Lecture 17 (annotated)

5/26 Th
Variational autoencoder, energy models, normalizing flows (on Zoom)
Lecture materials: Lecture 18, Lecture 18 (annotated)

Course Presentations

5/31 Tu
Project Presentation (on Zoom)

6/2 Th
Project Presentation (on Zoom)